comp.lang.ruby

Word + win32ole - how to find formatting of a word?

Mohit Sindhwani

10/25/2008 8:34:00 AM

Hi! I'm trying to use Ruby and win32ole to parse a Word document. So
far, I'm able to extract the style and text of each paragraph. That
works great for converting each paragraph into its own div (in the HTML/CSS sense).

Now, inside the paragraphs, certain words have special
formatting (e.g. the name of a command set in monospace), and I'm
trying to find out how to extract those special cases. Does anyone know how
to achieve that?

Appreciate your help - thanks!

Cheers,
Mohit.
10/25/2008 | 4:33 PM.
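
For readers following along, the paragraph-level step described in the question might look roughly like the sketch below. It is not Mohit's actual code, which the thread does not show. The helper names (`escape_html`, `css_class`, `paragraphs_to_divs`) are made up for illustration, and the sketch assumes each paragraph object responds to `Style.NameLocal` and `Range.Text`, as items of `doc.Paragraphs` do in Word's object model.

```ruby
# Escape the characters that are special in HTML text and attributes.
def escape_html(text)
  escapes = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }
  text.gsub(/[&<>"]/, escapes)
end

# Turn a Word style name such as "Heading 1" into a CSS class like "heading-1".
def css_class(style_name)
  style_name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Build one <div> per paragraph, using the paragraph style as the CSS class.
# Only `each` is called on the collection, since a WIN32OLE collection
# provides `each` but should not be assumed to offer `map`.
def paragraphs_to_divs(paragraphs)
  divs = []
  paragraphs.each do |para|
    style = para.Style.NameLocal
    text  = para.Range.Text.sub(/\r\z/, "") # Word ends a paragraph's text with "\r"
    divs << %(<div class="#{css_class(style)}">#{escape_html(text)}</div>)
  end
  divs
end
```

With an open document in `doc`, a call such as `puts paragraphs_to_divs(doc.Paragraphs)` would print one div per paragraph.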


8 Answers

Axel Etzold

10/25/2008 2:34:00 PM

Dear Mohit,

You could save the Word file as HTML and then extract the relevant information.
I did that using OpenOffice and got a file containing the font information in the following form:


<BODY LANG="en-US" DIR="LTR">
<P STYLE="margin-bottom: 0in">A command in <FONT FACE="Linux Libertine">Linux
Libertine</FONT></P>
<P STYLE="margin-bottom: 0in">A text in <FONT FACE="Bitstream Charter, serif">Bitstream
Charter</FONT></P>
</BODY>

If you read in the text of that file as a String, you can then find the relevant bits using regexps.

Best regards,

Axel
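
A minimal sketch of the regexp approach Axel describes, assuming the exported HTML uses flat (non-nested) `<FONT FACE="...">` tags like the sample above. The method name `font_spans` is made up for illustration, and a real HTML parser would be more robust than a regexp if the export is less regular.

```ruby
# Return [font_face, text] pairs for every <FONT FACE="..."> span in the HTML.
# The /m flag lets "." match newlines, since the exported text wraps mid-span.
def font_spans(html)
  html.scan(/<FONT\s+FACE="([^"]*)"[^>]*>(.*?)<\/FONT>/im).map do |face, text|
    [face, text.gsub(/\s+/, " ").strip] # collapse line breaks inside the span
  end
end

# Sample input in the same shape as the OpenOffice export quoted above.
html = <<~HTML
  <P STYLE="margin-bottom: 0in">A command in <FONT FACE="Linux Libertine">Linux
  Libertine</FONT></P>
  <P STYLE="margin-bottom: 0in">A text in <FONT FACE="Bitstream Charter, serif">Bitstream
  Charter</FONT></P>
HTML

font_spans(html).each { |face, text| puts "#{face}: #{text}" }
```

For a file on disk, the string can be read with `File.read` and passed to the same method.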


Mohit Sindhwani

10/26/2008 9:48:00 AM

Hi Axel

Thanks for replying! Converting to HTML and working with that is my
last option actually. In a well-written document, I found that using
Word to return style information about the paragraph is a lot less work
and relatively easy to work with. I guess it's time to consider your
suggestion!

Cheers,
Mohit.
10/26/2008 | 5:44 PM.


Mohit Sindhwani

10/26/2008 1:15:00 PM

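Actually, after digging around, I found that this gets me most of the way there:
ftHash = {}          # font name => 1; the keys are the distinct fonts in use
index = 0            # running count of words visited
words = doc.Words    # doc is the already-open Word document
words.each {|w|
  index += 1
  ft = w.Font.Name   # font name of this word's range
  ftHash[ft] = 1
}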

Thanks for your help!

Cheers,
Mohit.
10/26/2008 | 9:14 PM.
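
The snippet above collects which fonts occur in the document. To locate the specially formatted words themselves, one possible extension is to group consecutive words that share a font into runs. The sketch below is not from the thread. `font_runs` is a made-up name, and it assumes each item responds to `Font.Name` and `Text`, as the ranges in `doc.Words` do. The commented usage lines assume Windows with Word installed, and the file path is a placeholder.

```ruby
# Group consecutive words that share a font into runs: [[font_name, text], ...].
# `words` can be doc.Words or any object whose `each` yields items that
# respond to Font.Name and Text.
def font_runs(words)
  runs = []
  words.each do |w|
    name = w.Font.Name
    if runs.last && runs.last[0] == name
      runs.last[1] << w.Text        # same font as the previous word: extend the run
    else
      runs << [name, w.Text.dup]    # font changed: start a new run
    end
  end
  runs
end

# Usage sketch (Windows with Word installed; the path is a placeholder):
#   require 'win32ole'
#   word = WIN32OLE.new('Word.Application')
#   doc  = word.Documents.Open('C:\\path\\to\\document.doc')
#   font_runs(doc.Words).each { |font, text| puts "#{font}: #{text.inspect}" }
#   doc.Close(0)   # 0 is wdDoNotSaveChanges
#   word.Quit
```

Runs whose font differs from the paragraph's usual font (for example a monospace face in body text) are then the candidates for inline markup.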



Axel Etzold

10/26/2008 8:30:00 PM

Dear Mohit,

You're welcome :)
It's always nice to be the one who best answers one's own question, isn't it? Thanks for the info!

Best regards,

Axel


Mohit Sindhwani

10/27/2008 3:20:00 AM

Axel Etzold wrote:
> You're welcome :)
> It's always nice to be the one who best answers one's own question, isn't it? Thanks for the info!
>
Thanks for your reply again! Yes, it's good to find the answer yourself
and then share it :)

I find that win32ole is quite powerful; it just takes a little
digging around to learn how to work with it.

Cheers,
Mohit.
10/27/2008 | 11:19 AM.


herg

12/20/2010 8:54:00 PM

Thanks for all the responses, guys.

The corner of the crate had busted, but the foam was still intact. It
could have allowed the mini-PF to bounce against the main PF, maybe.
Pinning it on the shipper would be very tough I think.

I can't see the damage in the pictures he sent before shipping, but
they're certainly not zoomed to highlight the area like my picture is.

Sounds like I really need to figure out how deep the damage is and whether
it has been cleared over. If anything, I think starting with very
fine, then going more aggressive as necessary sounds like the way to
go, especially since the "clearcoat guy" is still being helpful. I
don't have any reason to not trust his experience.

kruzman

12/21/2010 6:12:00 AM

Your situation is going to take a lot more than some 3000 grit and
Novus. The pic makes it look really damaged. If you want some help on
how to do it the right way, which is not the easy way, contact me, and
I will give you some quick pointers. If it were mine, and the PF is
not installed yet, I would send it back to the clearer to be sanded
and recoated. It looks pretty bad, but it sounds like it had flaws in
the first place. If you don't have the right tools, like a DA sander
and compressor, and your clear guy does, it's best to have him do it.
Or her.
Good luck, and happy holidays. ron k

herg

12/21/2010 2:36:00 PM

I took another, closer look at it last night, and had my wife look as
well. We both agreed that we could maybe feel a little resistance
when sliding a fingernail across it, but certainly didn't feel a
catch. It really doesn't look nearly as bad in person, even when I
create glare with a worklight. In normal lighting, from a playing
position, I can't even see it.

Since I thought it would be a relatively easy touch-up based on what
the clearer told me, I went ahead and started installing it. The
harness, most of the lamp boards, and a few other items are on the
back side. Nothing on the front, and it is in the cabinet.

I don't feel confident enough in my own skills to say that I can make
it better, and there's always the possibility that I could make things
worse. Also, packing it up, shipping it, waiting for the rework, and
paying for whatever costs are involved sounds like overkill. That,
and I'm fairly sure that shipping is what caused the problems in the
first place.

Given that the PF did have issues to begin with
(http://picasaweb.google.com/hergtoler/TZPlayfield#) and my intentions were to
protect what was there while making it look better without going
overboard, I think that my goals have been accomplished, despite the
minor flaws.

Maybe if I knew someone local (northern Virginia), it would be more
worthwhile to try to make it perfect, but that's why I shipped it. I
thought my chances were better going with a known restorer than trying
to find someone local without an RGP reputation. I thought about
contacting Ron, but he seemed more booked than the guy who ended up
doing it.