comp.lang.ruby

Re: [ANN] Metadata 0.2

Ilmari Heikkinen

9/14/2007 1:08:00 PM

On 9/13/07, Konrad Meyer <konrad@tylerc.org> wrote:
>
> Any chance this could be expanded to add FLAC and OGG support?
>
> Thanks!
> --
> Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...
>

Done! Along with a preliminary list of supported formats and some cleanup.
Thanks for the request!


tarball: http://dark.fhtr.org/repos...metadata-...
git: http://dark.fhtr.org/repos...


Changelog:

README:
* added flacinfo, wmainfo, mp4info and ogginfo to the list of
dependencies
* prelim list of supported formats
lib/metadata/extract.rb:
* .ps.gz support
* list archive contents
* remove null fields from output
* support for flac and ogg
* untested support for wma and m4a
* Audio.Bitrate now in kbps to match shared-filemetadata-spec
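
The two output changes above (dropping null fields, reporting bitrate in kbps) can be pictured with a small sketch. This is not the code from lib/metadata/extract.rb: the helper names compact_metadata and bps_to_kbps and the sample values are made up for illustration, and string keys are an assumption based on what mdh prints.

```ruby
# Hypothetical helpers, not taken from lib/metadata/extract.rb.

# Drop entries whose value is nil so they never reach the output.
def compact_metadata(md)
  md.reject { |_key, value| value.nil? }
end

# Convert bits per second to kilobits per second (1 kbps = 1000 bps).
def bps_to_kbps(bps)
  bps / 1000.0
end

# Made-up sample data; string keys are an assumption.
raw = {
  "Audio.Title"   => "Example",
  "Audio.Genre"   => nil,      # unknown, should disappear
  "Audio.Bitrate" => 128_000   # bits per second
}

clean = compact_metadata(raw)
clean["Audio.Bitrate"] = bps_to_kbps(clean["Audio.Bitrate"])
p clean
```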


Description
-----------

This package `Metadata' comes with a library called `metadata' and
a small program called `mdh'.

The library probes files for their metadata (e.g. jpeg dimensions
and camera make, mp3 artist, pdf word count) and returns the metadata
as a Hash.

Mdh can print out file metadata as YAML and package the metadata
with the file.
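
As a rough picture of the Hash-to-YAML step, here is a minimal sketch using Ruby's standard yaml library. The sample Hash is made up, and the string keys in "Namespace.Field" form are an assumption based on the output mdh prints; the real library may build its Hash differently.

```ruby
require 'yaml'

# Made-up sample; keys are assumed to be strings such as "Audio.Artist".
md = {
  "Audio.Artist"  => "Example Artist",
  "Audio.TrackNo" => 10,
  "File.Format"   => "audio/x-vorbis+ogg"
}

# Hash#to_yaml comes from the standard library once 'yaml' is required.
yaml = md.to_yaml
puts yaml
```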

This package has many dependencies since there is no single universal
metadata header format that all files use. Blame resource forks, filename
extensions, bags of bytes and mimetypes.


Usage
-----

# print out metadata header
mdh -p myfile.jpg

# create myfile.jpg.mdh, which consists of metadata header + myfile.jpg
mdh myfile.jpg

# print out metadata header from mdh file
mdh -e -p myfile.jpg.mdh

# strip out metadata header from mdh file and save it to myfile.jpg
mdh -e myfile.jpg.mdh

irb> Metadata.extract('myfile.jpg')
irb> Metadata.extract_text('myfile.jpg')
irb> Pathname.new("myfile.jpg").metadata
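
Since the result is a plain Hash, it can be used like any other. The sketch below assumes string keys like the ones mdh prints ("Audio.Artist" and so on); describe_track is a hypothetical helper, not part of the library. With the library installed, md would come from Metadata.extract instead of the literal Hash.

```ruby
# Hypothetical helper; expects a Hash shaped like the mdh output,
# with string keys (an assumption).
def describe_track(md)
  artist  = md.fetch("Audio.Artist", "Unknown artist")
  title   = md.fetch("Audio.Title", "Untitled")
  seconds = md.fetch("Audio.Duration", 0).round
  format("%s - %s (%d:%02d)", artist, title, seconds / 60, seconds % 60)
end

# Stand-in for the Hash that Metadata.extract would return.
md = {
  "Audio.Artist"   => "4T Thieves",
  "Audio.Title"    => "Mists of Time - 4T",
  "Audio.Duration" => 400.0
}

puts describe_track(md)
```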


List of supported formats
-------------------------

Audio:
Successfully tested with:
mp3, flac, ogg, wav
Should also work:
wma, m4a

Video:
What you manage to make mplayer play, which can be just about anything.
That said, title and author data, etc. are missing (do videos even have those?)
Successfully tested with:
wmv, mov, divx, xvid, flv, ogm, mpg

Images:
Should handle pretty much anything (apart from XCF and ORF.)
Successfully tested with:
jpeg, png, gif, nef, dng, crw, pef, psd

Documents:
Successfully tested with:
pdf, ppt, odp, sxi, ps, ps.gz, html, txt
Should work:
- OpenOffice docs work to some degree (personally, I'm using unoconv to
convert OO docs to temp PDFs for the text & dimensions extraction, so
those bits of data are missing.)
- MS Office docs to some degree (ppt at least, doc and xls should work too,
dimensions missing due to the above temp PDF -thing.)

Others:
Whatever extract spits out on the five or six bits of metadata I'm using
from it. Archive contents at least.

Requirements
------------

* Ruby 1.8

* Tons of metadata extraction programs and libs,
list of gems:
flacinfo-rb
wmainfo-rb
MP4info
list of debian packages:
dcraw
libimlib2-ruby
extract
libimage-exiftool-perl
poppler-utils
mplayer
html2text
imagemagick
unhtml
pstotext
antiword
catdoc
shared-mime-info
vorbis-tools

* You do want to install the latest versions of dcraw and
shared-mime-info to be able to handle camera raw images.
http://cybercom.net/~dcof...
http://freedesktop.org/wiki/Software/shared...

* Python + chardet library
http://chardet.feedp...
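
With this many external tools, a quick check of what is actually on the PATH can save some head-scratching. The sketch below is not part of the package. The executable names are my assumptions about what the Debian packages above install (for example exiftool from libimage-exiftool-perl and pdftotext from poppler-utils), so adjust the list to your system.

```ruby
# Return the full path of an executable found on the given PATH, or nil.
def which(cmd, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Assumed executable names; they are guesses derived from the package list.
TOOLS = %w[dcraw extract exiftool pdftotext mplayer html2text
           convert unhtml pstotext antiword catdoc ogginfo]

# List the tools that cannot be found on the given PATH.
def missing_tools(tools, path = ENV["PATH"].to_s)
  tools.reject { |tool| which(tool, path) }
end

missing = missing_tools(TOOLS)
puts missing.empty? ? "all tools found" : "missing: #{missing.join(', ')}"
```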

Install
-------

Decompress the archive and enter its top directory.
Then type:

($ su)
# ruby setup.rb

These simple steps install this program under the default
location of Ruby libraries. You can also install files into
your favorite directory by supplying setup.rb some options.
Try "ruby setup.rb --help".


License
-------

Ruby's


Ilmari Heikkinen <ilmari.heikkinen gmail com>
http://fhtr.bl...

Konrad Meyer

9/14/2007 2:04:00 PM

Quoth Ilmari Heikkinen:
> On 9/13/07, Konrad Meyer <konrad@tylerc.org> wrote:
> >
> > Any chance this could be expanded to add FLAC and OGG support?
> >
> > Thanks!
> > --
> > Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...
> >
>
> Done! Along with a preliminary list of supported formats and some cleanup.
> Thanks for the request!

Wow, thank you! That was fast.

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...

Konrad Meyer

9/14/2007 6:42:00 PM

Quoth Ilmari Heikkinen:
> On 9/13/07, Konrad Meyer <konrad@tylerc.org> wrote:
> >
> > Any chance this could be expanded to add FLAC and OGG support?
> >
> > Thanks!
> > --
> > Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...
> >
>
> Done! Along with a preliminary list of supported formats and some cleanup.
> Thanks for the request!

Hmm, am I not seeing it (just using 'mdh -p') or can metadata.rb extract
stuff like artist, title, album, track, and whatnot from ogg/flac?

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...

darren kirby

9/14/2007 9:04:00 PM

Hi Ilmari!

quoth the Ilmari Heikkinen:
> * added flacinfo, wmainfo, mp4info and ogginfo to the list of
> dependencies

Cool, you were able to use a couple of my libraries (be afraid people...).

> Video:
> What you manage to make mplayer play, which can be just about anything.
> Then again, missing title and author data, etc. (do videos even have those?)
> Successfully tested with:
> wmv, mov, divx, xvid, flv, ogm, mpg

Just wanted to mention that despite the name, wmainfo will parse anything
wrapped in an ASF audio/video container format[0], so, you could use it to
parse wmv movies as well if your user didn't have mplayer installed.

[0] http://en.wikipedia.org/wiki/Advanced_Syst...
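
The fallback described here could look roughly like the sketch below. It only decides which parser to try and does not call mplayer or wmainfo. The choose_backend helper, the symbols it returns and the extension list are all hypothetical; nothing here is taken from metadata or wmainfo-rb.

```ruby
# File extensions commonly used for ASF containers (illustrative list).
ASF_EXTENSIONS = %w[.wma .wmv .asf]

# Hypothetical dispatch: prefer mplayer when it is installed, otherwise
# fall back to wmainfo for anything that looks like an ASF file.
def choose_backend(filename, mplayer_available)
  return :mplayer if mplayer_available
  ext = File.extname(filename).downcase
  return :wmainfo if ASF_EXTENSIONS.include?(ext)
  nil # no suitable parser known
end

p choose_backend("clip.wmv", false)
p choose_backend("clip.mov", false)
```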

> Ilmari Heikkinen <ilmari.heikkinen gmail com>
> http://fhtr.bl...

Thanks for the code, and have a good one,
-d
--
darren kirby :: Part of the problem since 1976 :: http://badco...
"...the number of UNIX installations has grown to 10, with more expected..."
- Dennis Ritchie and Ken Thompson, June 1972

Ilmari Heikkinen

9/15/2007 3:16:00 AM

On 9/14/07, Konrad Meyer <konrad@tylerc.org> wrote:
> Quoth Ilmari Heikkinen:
> > On 9/13/07, Konrad Meyer <konrad@tylerc.org> wrote:
> > >
> > > Any chance this could be expanded to add FLAC and OGG support?
> > >
> > > Thanks!
> > > --
> > > Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...
> > >
> >
> > Done! Along with a preliminary list of supported formats and some cleanup.
> > Thanks for the request!
> > List of supported formats
> > -------------------------
> > Audio:
> > Successfully tested with:
> > mp3, flac, ogg, wav

> Hmm, am I not seeing it (just using 'mdh -p') or can metadata.rb extract
> stuff like artist, title, album, track, and whatnot from ogg/flac?

It should at least. If you're having trouble, lemme know

kig@manifold:~$ mdh -p downloads/Mists_of_Time-4T.ogg
---
Audio.Album: Favorite Things
Audio.TrackNo: 10
Audio.Samplerate: 44100
Audio.Bitrate: 128.0
Audio.Title: Mists of Time - 4T
Audio.Duration: 400.0
Audio.Comment: http://www...
Audio.ReleaseDate: 2002-01-01T00:00:00Z
File.Size: 5816848
Audio.Channels: 2
File.Modified: !timestamp 2007-09-14T13:56:51+0300
File.Format: audio/x-vorbis+ogg
Audio.Artist: 4T Thieves

kig@manifold:~$ mdh -p 05-Self-Saboteur\ \[feat.\ Kristy\ Thirsk\].flac
---
Audio.Album: Nuages du Monde
Audio.TrackNo: 5
Audio.Samplerate: 44100
Audio.Bitrate: 990331.947108105
Audio.Genre: Ambient Pop
Audio.Title: Self-Saboteur [feat. Kristy Thirsk]
Audio.Duration: 264.186666666667
Audio.ReleaseDate: 2006-01-01T00:00:00Z
Audio.VariableBitrate: true
File.Size: 32704062
Audio.Channels: 2
File.Modified: !timestamp 2006-11-17T10:46:28+0200
File.Format: audio/x-flac
Audio.Artist: Delerium

--
Ilmari Heikkinen <ilmari.heikkinen gmail com>
http://fhtr.bl...

Konrad Meyer

9/15/2007 6:37:00 AM

Quoth Ilmari Heikkinen:
> On 9/14/07, Konrad Meyer <konrad@tylerc.org> wrote:
> > Quoth Ilmari Heikkinen:
> > > On 9/13/07, Konrad Meyer <konrad@tylerc.org> wrote:
> > > >
> > > > Any chance this could be expanded to add FLAC and OGG support?
> > > >
> > > > Thanks!
> > > > --
> > > > Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...
> > > >
> > >
> > > Done! Along with a preliminary list of supported formats and some cleanup.
> > > Thanks for the request!
> > > List of supported formats
> > > -------------------------
> > > Audio:
> > > Successfully tested with:
> > > mp3, flac, ogg, wav
>
> > Hmm, am I not seeing it (just using 'mdh -p') or can metadata.rb extract
> > stuff like artist, title, album, track, and whatnot from ogg/flac?
>
> It should at least. If you're having trouble, lemme know

Yeah, I'm having some trouble. I have latest metadata (0.2).

$ mdh -p music/Wolfmother\ -\ Joker\ \&\ The\ Thief.flac
---
Doc.Created:
Doc.Subject:
Doc.Author:
Doc.Modified:
Doc.Title:
Doc.Language:
Doc.WordCount: 0
Doc.Description:
File.Content: ""
File.Software:
File.Size: 37505677
File.Modified: 2007-01-03T22:09:31-08:00
File.Format: audio/x-flac

When mplayer shows me that it is tagged:

$ mplayer music/Wolfmother\ -\ Joker\ \&\ The\ Thief.flac
...
Clip info:
Title: Joker & The Thief
Artist: Wolfmother
Album: Wolfmother
Genre: Rock

$ gem list --local | grep flacinfo
flacinfo-rb (0.4)

I have flacinfo-rb 0.4. Any ideas?
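
One way to make the symptom easier to see is to list which key namespaces came back. The sketch below uses a hand-copied subset of the output above as sample data; namespaces is a hypothetical helper, and string keys are an assumption. Only Doc and File show up and there is no Audio namespace at all, which may mean the audio extraction step did not run for this file, though that is a guess.

```ruby
# Hypothetical helper: list the distinct key prefixes ("Audio", "Doc", ...)
# found in a metadata Hash with "Namespace.Field" string keys.
def namespaces(md)
  md.keys.map { |key| key.split(".").first }.uniq.sort
end

# Subset of the fields printed above, copied by hand as sample data.
md = {
  "Doc.Title"     => nil,
  "Doc.WordCount" => 0,
  "File.Size"     => 37505677,
  "File.Format"   => "audio/x-flac"
}

found = namespaces(md)
puts "namespaces: #{found.join(', ')}"
puts "no Audio.* fields present" unless found.include?("Audio")
```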

--
Konrad Meyer <konrad@tylerc.org> http://konrad.sobertil...