comp.lang.ruby

Re: Merging two Word documents with Ruby?

Graham

12/21/2005 9:18:00 AM

Several points
- What do you mean by "Merge"?.. Word documents have structure and the
interleaving of lines or words would appear to make little sense.

- Unless your application and user base are new, you will have many
files NOT in the XML format, in which case you would need to convert
them, and would need Word installed somewhere. Perhaps you could
reconsider your platform choice (to make the problem simpler), or, if
you have no pre-existing documents, reconsider your approach to make
Word unnecessary? Word can read a wide variety of document types
(including HTML), so perhaps this is another way to simplify your
problem.

More details required...
Graham

9 Answers

Denver Mike

12/21/2005 2:06:00 PM


> - What do you mean by "Merge"?.. Word documents have structure and the
> interleaving of lines or words would appear to make little sense.

Thanks for your thoughts on this Graham. By "merge", I meant appending
one Word document to the end of another, but to make things more
complicated, I need to add text into the headings across the entire
document.

--
Posted via http://www.ruby-....


Edwin van Leeuwen

12/21/2005 3:36:00 PM

Denver Mike wrote:
> Thanks for your thoughts on this Graham. By "merge", I meant appending
> one Word document to the end of another, but to make things more
> complicated, I need to add text into the headings across the entire
> document.

Microsoft Word has something called a master document. Maybe you could
add a master document that includes both files plus the extra headings.
This master document might be simple enough that you can actually
reverse-engineer it (create one in Word once, then edit just the parts
you need to change with Ruby).


--
Posted via http://www.ruby-....


Wilson Bilkovich

12/21/2005 4:01:00 PM

On 12/21/05, Denver Mike <denvermike@comcast.net> wrote:
>
> > - What do you mean by "Merge"?.. Word documents have structure and the
> > interleaving of lines or words would appear to make little sense.
>
> Thanks for your thoughts on this Graham. By "merge", I meant appending
> one Word document to the end of another, but to make things more
> complicated, I need to add text into the headings across the entire
> document.
>
This can actually be extremely complex, because a named style (such as
'Body', 'Normal', or 'Heading 1') can (and will) have different
properties (fonts, colors, sizes, margins, encoding, etc) in each of
the two documents. You will need to rename every style and style
reference in the second document in order to prevent the two from
colliding.
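
The renaming Wilson describes can be sketched in Ruby. This is a minimal sketch, not a tested merge tool. It assumes the second document has been saved in Word's XML format, that style definitions appear as `w:style` elements carrying a `w:styleId` attribute, and that references to them sit in the `w:val` attribute of elements such as `w:pStyle` and `w:rStyle`. The method name `rename_styles`, the suffix argument, and the list of reference elements are illustrative choices, not anything from the thread.

```ruby
require "rexml/document"

# Elements whose w:val attribute is assumed to hold a style ID. Real
# documents may reference styles from other elements as well.
STYLE_REF_ELEMENTS = %w[pStyle rStyle basedOn next link].freeze

# Append +suffix+ to every style ID defined in +xml+ and to every reference
# to one of those IDs, so they cannot clash with the IDs of another file.
def rename_styles(xml, suffix)
  doc = REXML::Document.new(xml)
  # Take the namespace bound to the "w" prefix from the document itself.
  ns = { "w" => doc.root.namespace("w") }
  renamed = {}

  # Pass 1: rename the definitions and remember old ID => new ID.
  REXML::XPath.each(doc, "//w:style", ns) do |style|
    old_id = style.attributes["w:styleId"]
    next unless old_id
    renamed[old_id] = old_id + suffix
    style.attributes["w:styleId"] = renamed[old_id]
  end

  # Pass 2: point every known kind of reference at the new IDs.
  STYLE_REF_ELEMENTS.each do |name|
    REXML::XPath.each(doc, "//w:#{name}", ns) do |ref|
      old_id = ref.attributes["w:val"]
      ref.attributes["w:val"] = renamed[old_id] if renamed.key?(old_id)
    end
  end

  doc.to_s
end
```

Called as `rename_styles(File.read("second.xml"), "_doc2")` (a hypothetical file name), this would turn `Heading1` into `Heading1_doc2` in both the definition and its references. The display name in `w:name` is left alone; whether Word also needs that changed is something to check against real files.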


Daniel Calvelo

12/21/2005 9:28:00 PM

If your documents are properly structured using styles (which is rare)
and they share the same styles (and I mean the *same* styles), you can
try to use OpenOffice in remote command mode, convert the .doc into
.odt, parse the XML of both files, merge the XML, and rebuild an .odt
file, perhaps going through OOo again to get a .doc back. But you will
need to ensure that the styles are always converted into something
reliably identifiable.

FAO (the UN branch for food and agriculture) uses a template system
(thus forcing a set of styles) which is used to output RTF, which is
converted into XML for storage. Are your documents existing legacy ones
or is this a new setup? If you're building it all, then you might
seriously consider using OpenOffice all the way.
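
For the "parse the XML of both files and merge" step, a rough Ruby sketch might look like the following. It assumes both files have already been converted to .odt and that `content.xml` has been extracted from each archive (unzipping and re-zipping are not shown). It relies on the OpenDocument layout in which the document body lives in an `office:text` element. The method name `append_odt_body` is made up for illustration, and automatic styles and other parts of the package are ignored entirely, so this only works in the "same styles" case described above.

```ruby
require "rexml/document"

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Append the body elements of the second content.xml to the body of the
# first one and return the combined XML as a string.
def append_odt_body(first_xml, second_xml)
  first  = REXML::Document.new(first_xml)
  second = REXML::Document.new(second_xml)
  ns = { "office" => OFFICE_NS }

  dest = REXML::XPath.first(first,  "//office:text", ns)
  src  = REXML::XPath.first(second, "//office:text", ns)
  raise ArgumentError, "no office:text body found" unless dest && src

  # Copy each top-level body element (paragraphs, headings, tables, ...)
  # so the second document is left untouched.
  src.elements.each { |el| dest.add(el.deep_clone) }

  first.to_s
end
```

The result would still have to be written back into an .odt archive alongside the first document's other files before OpenOffice could open it.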

Dave Howell

12/23/2005 5:40:00 PM


On Dec 21, 2005, at 7:06, Denver Mike wrote:
> Thanks for your thoughts on this Graham. By "merge", I meant appending
> one Word document to the end of another, but to make things more
> complicated, I need to add text into the headings across the entire
> document.

Does it still need to be a Word document when you're done? An entirely
different approach would be to use some kind of Word file display
program and make PDFs of the files, then chain the PDFs together. Do
the headers by slapping a white block over the existing headers and
writing a new header over them.

Personally, my approach would be to abandon the project as just too
messy for words. :)



Daniel Calvelo

12/23/2005 6:33:00 PM

OpenOffice.org can do the .doc to pdf conversion. I like your idea very
much, Dave. Maybe PostScript would be easier to fiddle with ex-post.
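
The convert-then-chain idea could be driven from Ruby roughly as below. This is only a sketch under assumptions: it supposes a `soffice` binary on the PATH that accepts `--headless --convert-to` (as current LibreOffice builds do; the OpenOffice.org of this thread's era was driven differently), and it uses Ghostscript's `pdfwrite` device to chain the PDFs, which is a swapped-in tool that nobody in the thread mentioned. All method names are hypothetical.

```ruby
# Build the argv for converting one Word file to PDF with a headless
# office suite (assumes `soffice` supports --convert-to).
def doc_to_pdf_command(doc_path, out_dir)
  ["soffice", "--headless", "--convert-to", "pdf", "--outdir", out_dir, doc_path]
end

# Build the argv for chaining several PDFs into one with Ghostscript.
def concat_pdfs_command(pdf_paths, output_path)
  ["gs", "-q", "-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
   "-sOutputFile=#{output_path}", *pdf_paths]
end

# Convert each document, then chain the resulting PDFs in the given order.
def merge_docs_as_pdf(doc_paths, out_dir, output_path)
  pdfs = doc_paths.map do |doc|
    system(*doc_to_pdf_command(doc, out_dir)) or raise "conversion failed: #{doc}"
    File.join(out_dir, File.basename(doc, ".*") + ".pdf")
  end
  system(*concat_pdfs_command(pdfs, output_path)) or raise "concatenation failed"
  output_path
end
```

Passing the commands to `system` as separate arguments avoids shell quoting problems with file names that contain spaces.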

Dave Howell

12/23/2005 8:27:00 PM


On Dec 23, 2005, at 11:37, Daniel Calvelo wrote:

> OpenOffice.org can do the .doc to pdf conversion. I like your idea very
> much, Dave. Maybe PostScript would be easier to fiddle with ex-post.

Probably. If you have a program that lets you overlay one PDF page on
another, then your best bet is to output a PDF page with your header in
it. (I'd probably use TeX, or maybe script OSX's TextEdit program, and
my copy of full Acrobat 4 for the page overlay.) The other alternative
would be to create (or have somebody create for you) an .eps with the
white box and a line of text in a program like Freehand or Illustrator.
If you pop open the .eps file in a text editor, you'll find it not too
difficult to programmatically replace the text, although you won't
easily be able to duplicate the kerning and other textual adjustments.
Have OpenOffice print to a postscript file, then figure out what you
can use as a page marker in order to embed the .eps in that file on
each page so that it comes after (and thus covers) the original
headers, if any. Then feed the modified .ps file into a PDF distiller.

That's what I'd try, I think.
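
The PostScript route Dave outlines could be prototyped along these lines. This is a simplified sketch, not his exact recipe: instead of embedding an .eps it inserts a few raw PostScript operators, and it assumes the "page marker" turns out to be a line consisting solely of `showpage`, which has to be verified against what OpenOffice really emits. The coordinates assume a US Letter page in default user space, and both method names are hypothetical.

```ruby
# Insert +overlay_ps+ before every line that consists solely of "showpage",
# so the overlay is painted last on each page and covers what lies under it.
def overlay_each_page(ps_source, overlay_ps)
  ps_source.each_line.map do |line|
    line.strip == "showpage" ? "#{overlay_ps.chomp}\n#{line}" : line
  end.join
end

# A hypothetical overlay: a white box over the header area, then new text.
def header_overlay(text)
  # Backslashes and parentheses must be escaped inside a PostScript string.
  escaped = text.gsub(/([\\()])/) { "\\#{$1}" }
  <<~PS
    gsave
    1 setgray 36 750 540 30 rectfill
    0 setgray /Helvetica findfont 10 scalefont setfont
    40 760 moveto (#{escaped}) show
    grestore
  PS
end
```

Something like `overlay_each_page(File.read("merged.ps"), header_overlay("New header"))` would then produce the modified file to feed into the distiller; `merged.ps` is a placeholder name.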



Dominic Sisneros

1/13/2006 10:28:00 AM

AbiWord can be used from the command line. See
http://www.advogato.org/person/msevior/diary.html?start=65

That might make this possible.

On Dec 23, 2005, at 1:27 PM, Dave Howell wrote:

>
> On Dec 23, 2005, at 11:37, Daniel Calvelo wrote:
>
>> OpenOffice.org can do the .doc to pdf conversion. I like your idea
>> very
>> much, Dave. Maybe PostScript would be easier to fiddle with ex-post.
>
> Probably. If you have a program that lets you overlay one PDF page
> on another, then your best bet is to output a PDF page with your
> header in it. (I'd probably use TeX, or maybe script OSX's TextEdit
> program, and my copy of full Acrobat 4 for the page overlay.) The
> other alternative would be to create (or have somebody create for
> you) an .eps with the white box and a line of text in a program
> like Freehand or Illustrator. If you pop open the .eps file in a
> text editor, you'll find it not too difficult to programmatically
> replace the text, although you won't easily be able to duplicate
> the kerning and other textual adjustments. Have OpenOffice print to
> a postscript file, then figure out what you can use as a page
> marker in order to embed the .eps in that file on each page so that
> it comes after (and thus covers) the original headers, if any. Then
> feed the modified .ps file into a PDF distiller.
>
> That's what I'd try, I think.
>
>



hari

1/19/2006 10:50:00 AM

Hi guys,

I have a question and hope you can help.

I need to build a utility that merges two MS Word documents when run.
I should then be able to print the merged document, with options to
"remove header" and "remove footer", so that the document is printed
with the header/footer removed.

Please help.



--
Posted via http://www.ruby-....