Asp Forum - Accessing PDF Metadata and Page Thumbnails

Ben Gribaudo

7/26/2007 4:34:00 PM

Hello,

I am putting together a PDF archive of our corporate newsletters. I'd
like to iterate through a directory of PDFs, read their metadata (title,
description, etc.) and use this info to dynamically generate an RHTML
index page. There are several Ruby PDF libraries out there, but they seem
inclined towards creating PDFs instead of reading them. Any
recommendations on a library to read PDF metadata?

It would be neat to not only read metadata but also to pull the PDF's
first page's thumbnail out as an image. This would allow dynamic
creation of an index page that looks like this:
http://www.reviveourhearts.com/difference/newsletter/newsletter_a...

Any thoughts?

Thanks,
Ben

1 Answer

Eugen Minciu

7/27/2007 11:45:00 AM

Excerpts from Ben Gribaudo's message of Thu Jul 26 19:33:32 +0300 2007:
> Hello,
>
> I am putting together a PDF archive of our corporate newsletters. I'd
> like to iterate through a directory of PDFs, read their metadata (title,
> description, etc.) and use this info to dynamically generate an RHTML
> index page. There are several Ruby PDF libraries out there, but they seem
> inclined towards creating PDFs instead of reading them. Any
> recommendations on a library to read PDF metadata?
>
> It would be neat to not only read metadata but also to pull the PDF's
> first page's thumbnail out as an image. This would allow dynamic
> creation of an index page that looks like this:
> http://www.reviveourhearts.com/difference/newsletter/newsletter_a...
>
> Any thoughts?
Have a look at http://extractor.rub... . You need libextractor
and its headers to compile it, though. Would that work for you?
>
> Thanks,
> Ben

--
Eugen Minciu.

Wasting valuable time since 1985.
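
For readers who want something concrete, here is a rough sketch of the
workflow Ben describes: walk a directory, read each PDF's title, and render
an index. It does not use the libextractor binding suggested above, since
its Ruby API is not shown in this thread. Instead it assumes the `pdfinfo`
and `pdftoppm` command-line tools (from Poppler) are installed and on the
PATH. The helper names (`parse_pdfinfo`, `read_metadata`, `make_thumbnail`,
`render_index`, `build_index`) and the 200-pixel thumbnail size are made up
for illustration.

```ruby
require "erb"
require "open3"

# Parse the "Key:   value" lines printed by `pdfinfo` into a Hash.
# Lines without a value (e.g. "Subject:") are skipped.
def parse_pdfinfo(text)
  text.each_line.with_object({}) do |line, info|
    key, value = line.chomp.split(/:\s*/, 2)
    info[key] = value if key && value && !value.empty?
  end
end

# Assumption: `pdfinfo` is installed. Returns {} if it is missing or fails.
def read_metadata(path)
  out, status = Open3.capture2("pdfinfo", path)
  status.success? ? parse_pdfinfo(out) : {}
rescue Errno::ENOENT
  {}
end

# Assumption: `pdftoppm` is installed. Renders page 1 only and writes
# "#{out_prefix}.png", scaled so the longer side is 200 pixels.
def make_thumbnail(path, out_prefix)
  system("pdftoppm", "-png", "-singlefile", "-f", "1", "-l", "1",
         "-scale-to", "200", path, out_prefix)
end

# A tiny ERB template standing in for the RHTML index page.
INDEX_TEMPLATE = ERB.new(<<~HTML, trim_mode: "-")
  <ul>
  <%- entries.each do |e| -%>
    <li><a href="<%= ERB::Util.h(e[:file]) %>"><%= ERB::Util.h(e[:title]) %></a></li>
  <%- end -%>
  </ul>
HTML

def render_index(entries)
  INDEX_TEMPLATE.result_with_hash(entries: entries)
end

# Collect one entry per PDF, falling back to the file name when the
# document has no Title field.
def build_index(dir)
  entries = Dir.glob(File.join(dir, "*.pdf")).sort.map do |path|
    info = read_metadata(path)
    { file: File.basename(path),
      title: info.fetch("Title", File.basename(path, ".pdf")) }
  end
  render_index(entries)
end
```

Calling `puts build_index("newsletters")` would print the list markup for
every PDF in a (hypothetical) `newsletters` directory. Keeping the parsing
in `parse_pdfinfo` separate from the shell-out makes it easy to swap in a
different metadata source, such as the libextractor binding, later on.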

comp.lang.ruby
