Jon Wood
4/7/2006 4:10:00 PM
EdUarDo wrote:
> Hi all,
>
> Is there any gem or library that can extract text from a PDF file? Is there one for Word or OpenOffice files?
I don't know about PDFs, but there are several programs available that
can convert a Word file into HTML. You'll probably lose some formatting,
but you should then be able to parse the HTML, much as you would any
XML document, and extract the text content from it.
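
As a rough sketch of that last step, something like the following pulls the visible text out of HTML that has already been produced by a converter. It is only an illustration: html_to_text and ENTITIES are names made up here, it uses plain regexes rather than a real parser, and it assumes the HTML is simple enough for that to work. Messy converter output would want a proper HTML or XML parser instead.

```ruby
# Hypothetical helper, not part of any gem: strips tags from an HTML string.
# Only a handful of common entities are decoded; anything else is left as-is.
ENTITIES = {
  "&amp;"  => "&",
  "&lt;"   => "<",
  "&gt;"   => ">",
  "&quot;" => '"',
  "&nbsp;" => " "
}.freeze

def html_to_text(html)
  # Drop script and style blocks, whose contents are not visible text.
  text = html.gsub(/<(script|style)\b.*?<\/\1>/mi, " ")
  # Replace every remaining tag with a space so adjacent words stay separate.
  text = text.gsub(/<[^>]+>/, " ")
  # Decode the few entities listed above in a single pass.
  text = text.gsub(/&(?:amp|lt|gt|quot|nbsp);/) { |entity| ENTITIES[entity] }
  # Collapse runs of whitespace and trim the ends.
  text.gsub(/\s+/, " ").strip
end

puts html_to_text("<html><body><h1>Report</h1><p>Sales &amp; costs</p></body></html>")
```

In practice you would read the converted file with File.read and pass its contents to html_to_text.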
Jon