Michael Phillips, Jr.
7/17/2007 12:14:00 AM
> The above approach works perfectly for all office document types except
> Excel. Excel documents are created, however they seem to be lacking some
> information and will not open natively. I can "fix" them by dragging them
> into IE, where they can be viewed, and then saving them. This workaround is
> not acceptable to me.
It works for Excel also. However, the Excel spreadsheet is marked as hidden.
There are two ways to solve the problem.
1) Use OLE Automation to create an "Excel.Application" OLE object, get
the global Windows collection, and mark the hidden window as visible.
This approach is not very fast: the Excel application must be present
on the system, and it takes some time to load. Using IDispatch is also
cumbersome.
2) Use the IStream to find the "WINDOW1" record in the "Workbook" stream.
The record identifier is 0x003D. Once found, you load the record.
The record has an options bit field.
Here is the structure:
typedef struct _WINDOW1_RECORD
{
//short window1Marker; // Should be 0x003D
//short recordSize; // size of record in bytes: biff2-biff4 (8 bytes), biff5-biff8 (18 bytes)
short hpos;
short vpos;
short width; // width comes before height in the record
short height;
short options; // lowest bit (mask 0x0001): 0 - visible, 1 - hidden
short idxActiveWkSheet;
short idxVisibleTab;
short selectedWkSheet;
short widthTabBar;
} WINDOW1_RECORD;
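The options field is a "short". If you clear its hidden bit and then save
the stream, the Excel spreadsheet will be visible. Setting the whole field
to 0 also works, but it resets the other window flags stored in that field.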
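
For what it is worth, approach 2 can be sketched in plain C roughly as
follows. This is only a sketch under some assumptions: the whole "Workbook"
stream has already been read into a memory buffer (for example with
IStream::Read), the workbook is not encrypted, and the WINDOW1 record is the
18-byte biff5-biff8 form. The names rd16, find_window1 and unhide_workbook
are made up for this example; they are not part of any API.

```c
#include <stddef.h>
#include <stdint.h>

#define BIFF_WINDOW1       0x003D  /* record identifier given above */
#define WINDOW1_SIZE       18      /* biff5-biff8 body size (assumed form) */
#define WINDOW1_OPT_OFFSET 8       /* options follows four 16-bit fields */
#define WINDOW1_HIDDEN     0x0001  /* hidden flag in the options field */

/* Read a little-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Walk the records in buf. Each record is a 2-byte identifier, a 2-byte
 * body size, and then the body. Returns the offset of the first WINDOW1
 * body, or -1 if no 18-byte WINDOW1 record is found. */
static long find_window1(const uint8_t *buf, size_t len)
{
    size_t pos = 0;
    while (pos + 4 <= len) {
        uint16_t id = rd16(buf + pos);
        uint16_t size = rd16(buf + pos + 2);
        if (pos + 4 + (size_t)size > len)
            break;                          /* truncated record, stop */
        if (id == BIFF_WINDOW1 && size == WINDOW1_SIZE)
            return (long)(pos + 4);
        pos += 4 + (size_t)size;            /* skip to the next record */
    }
    return -1;
}

/* Clear only the hidden bit, in place, leaving the other flags alone.
 * Returns 1 if the bit was cleared, 0 if the window was already visible,
 * and -1 if no WINDOW1 record was found. */
static int unhide_workbook(uint8_t *buf, size_t len)
{
    long body = find_window1(buf, len);
    if (body < 0)
        return -1;
    uint8_t *opt = buf + body + WINDOW1_OPT_OFFSET;
    uint16_t options = rd16(opt);
    if (!(options & WINDOW1_HIDDEN))
        return 0;
    options = (uint16_t)(options & ~WINDOW1_HIDDEN);
    opt[0] = (uint8_t)(options & 0xFF);     /* write back little-endian */
    opt[1] = (uint8_t)(options >> 8);
    return 1;
}
```

After unhide_workbook returns 1, you would write the modified buffer back
to the "Workbook" stream and commit the storage. The record size does not
change, so nothing else in the stream has to move.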
"Heath Kelly" <HeathKelly@discussions.microsoft.com> wrote in message
news:E9DDF098-DCB2-4541-B780-387910949720@microsoft.com...
> Michael, my goal is to extract all documents embedded within any given
> compound document. Here is the current revision of my code:
>
> //6) Get a pointer to the IPersistStorage Interface.
> //comObj = Word, Excel, etc.
> Guid IID_IPersistStorage = typeof(IPersistStorage).GUID;
> IntPtr pIUnk = Marshal.GetIUnknownForObject(comObj);
> IntPtr ptrStorage;
> Int32 r = Marshal.QueryInterface(pIUnk, ref IID_IPersistStorage, out ptrStorage);
>
> //7) Load the COM object from the storage.
> IPersistStorage per = (IPersistStorage)Marshal.GetObjectForIUnknown(ptrStorage);
> r = per.Load(Marshal.GetIUnknownForObject(store));
>
> //8) Create new storage and save to disk.
> IStorage temp;
> StgCreateDocfile(@"D:\Data\Result\Output" + count,
> STGM.READWRITE|STGM.SHARE_EXCLUSIVE|STGM.CREATE, 0, out temp);
> OleSave(per, temp, false);
>
> I understand that this approach will not work for all documents. For
> non-Office documents I have a different approach where I write the
> "CONTENTS" stream to the file system. This works for PDFs, for example.
>
> The above approach works perfectly for all office document types except
> Excel. Excel documents are created, however they seem to be lacking some
> information and will not open natively. I can "fix" them by dragging them
> into IE, where they can be viewed, and then saving them. This workaround is
> not acceptable to me.
>
> Excel storages are associated with the clsid of Excel.Application. However,
> I can't get an IPersistStorage interface for this object; instead I need to
> use the clsid of Excel.Sheet to get any joy here at all. Wondering if my
> problem is somehow related.
>
> Regards,
> Heath.
>
>
> "Michael Phillips, Jr." wrote:
>
>> It is not clear to me what your goal is.
>>
>> A compound document file is laid out as a file system with directories as
>> storage objects and files as streams.
>>
>> The storage objects are not real directories. They represent individual
>> COM storage objects.
>> The files are not real files; they are data represented as a stream.
>>
>> > //9) Save the file to disk.
>> > IPersistFile perFile = (IPersistFile)Marshal.GetObjectForIUnknown(ptrFile);
>> > r = perFile.Save(@"D:\Data\Result\Output.doc", true);
>> > //r = E_FAIL
>>
>> What are you trying to save?
>>
>> If, for example, there was an embedded object such as an Excel spreadsheet,
>> you would instantiate the Excel object and load its storage via the
>> IPersistStorage interface.
>>
>> You would then create a new compound document via StgCreateDocfile and then
>> use OleSave to save the Excel storage object and all of its associated
>> streams.
>>
>> The clsid of the com object is written to the storage object. It is that
>> clsid that you use to instantiate the com object.
>>
>> Each embedded object in your Word document has a different format. Some
>> objects, such as embedded PDF files, are packaged.
>> For PDF files the above will not work. You would have to extract the
>> "Contents" stream and use the Windows file system API
>> to write the file.
>>
>>
>>