comp.lang.ruby

mp3 file magic number identification

John Joyce

8/15/2007 10:17:00 PM

Does anybody know how to identify (validate) mp3 files (other audio
files would be interesting as well) by 'magic number'?
I never trust file extensions to be correct. It's too easy for users
to accidentally munge file names in a GUI or even for malicious users
to try bad things by simply changing file names.

Any library or code is welcome!
Daniel Berger said he'd even add it to Ptools or a similar library if
it gets posted on Ruby-Talk.

I did find this online as a purported mp3 magic number (in hex of
course),
49 44 33
but I'm not even going to bother using it since I don't know
definitively that all mp3's will have it, and I don't know where to
expect it in the file.

Thanks,
John Joyce

14 Answers

Stefan Mahlitz

8/15/2007 10:45:00 PM


John Joyce wrote:
> Does anybody know how to identify (validate) mp3 files (other audio
> files would be interesting as well) by 'magic number'?

> I did find this online as a purported mp3 magic number (in hex of course),
> 49 44 33
> but I'm not even going to bother using it since I don't know
> definitively that all mp3's will have it, and I don't know where to
> expect it in the file.

does this help: http://raa.ruby-lang.org/project/...

Stefan

Adam Shelly

8/16/2007 12:23:00 AM


On 8/15/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:
> Does anybody know how to identify (validate) mp3 files (other audio
> files would be interesting as well) by 'magic number'?

I know for sure that:
= wave files start with the characters 'RIFF' followed by 4 bytes
(filesize-8) followed by 'WAVE'.
= ogg vorbis files start with 'OggS' followed by 24 bytes then 0x01
and the string 'vorbis'
= MIDI files start with 'MThd'

and according to wikipedia (and verified with one file on my system)
= MP3 files should start with 0xFF FB or 0xFF FA.

-Adam
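
The signatures listed above can be tried out with a small table-driven check. This is only a sketch: the names `AUDIO_SIGNATURES`, `audio_kind`, and `audio_kind_of_file` are made up for illustration, the Ogg capture pattern is written `OggS` (capital O, as the Ogg container spells it), and offset 28 is simply 4 bytes of `OggS` plus the 24 bytes mentioned in the list. A match says the header looks right, not that the whole file is valid.

```ruby
# Signatures taken from the list above; offsets are bytes from the start of the file.
# Each entry is [label, [[offset, expected_bytes], ...]].
AUDIO_SIGNATURES = [
  [:wav,        [[0, "RIFF"], [8, "WAVE"]]],
  [:ogg_vorbis, [[0, "OggS"], [28, "\x01vorbis"]]],
  [:midi,       [[0, "MThd"]]]
].freeze

# Returns the label of the first signature whose parts all match, or nil.
def audio_kind(data)
  bytes = data.b # binary copy, so indexing is by byte rather than by character
  AUDIO_SIGNATURES.each do |label, parts|
    return label if parts.all? { |offset, magic| bytes[offset, magic.bytesize] == magic.b }
  end
  nil
end

# Hypothetical convenience wrapper: 64 bytes are enough for every check above.
def audio_kind_of_file(path)
  audio_kind(File.binread(path, 64) || "")
end
```

Usage would be something like `audio_kind_of_file("song.wav")`, which returns `:wav`, `:ogg_vorbis`, `:midi`, or `nil` when none of the three signatures match.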

John Joyce

8/16/2007 3:07:00 AM



On Aug 15, 2007, at 5:45 PM, Stefan Mahlitz wrote:

> John Joyce wrote:
>> Does anybody know how to identify (validate) mp3 files (other audio
>> files would be interesting as well) by 'magic number'?
>
>> I did find this online as a purported mp3 magic number (in hex of
>> course),
>> 49 44 33
>> but I'm not even going to bother using it since I don't know
>> definitively that all mp3's will have it, and I don't know where to
>> expect it in the file.
>
> does this help: http://raa.ruby-lang.org/project/...
>
> Stefan
>
Thanks Stefan, but I should have said that I was hoping for a pure Ruby
implementation. It's more portable that way.


John Joyce

8/16/2007 3:11:00 AM



On Aug 15, 2007, at 7:23 PM, Adam Shelly wrote:

> On 8/15/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:
>> Does anybody know how to identify (validate) mp3 files (other audio
>> files would be interesting as well) by 'magic number'?
>
> I know for sure that:
> = wave files start with the characters 'RIFF' followed by 4 bytes
> (filesize-8) followed by 'WAVE'.
> = ogg vorbis files start with 'OggS' followed by 24 bytes then 0x01
> and the string 'vorbis'
> = MIDI files start with 'MThd'
>
> and according to wikipedia (and verified with one file on my system)
> = MP3 files should start with 0xFF FB or 0xFF FA.
>
> -Adam
>
Thanks Adam, that's the kind of thing I'm looking for exactly.
If anyone can contribute more audio file magic numbers, please do!

I guess video/AV files should be next as well, primarily things
like .mov, .wmv, etc...
Oh, and I think we should find out if smaf is the same as midi.

These kinds of validation tools can be useful to us all these days.

Ben Bleything

8/16/2007 4:06:00 AM


On Thu, Aug 16, 2007, John Joyce wrote:
> Thanks Adam, that's the kind of thing I'm looking for exactly.
> If anyone can contribute more audio file magic numbers, please do!

If you're on a *nix system, you should have a "magic" file someplace
that describes the magic of every filetype that the "file" command can
understand.

If you're not, find someone who is and can send you the file :) You
might also look at the libmagic source or the filemagic source.

Ben

John Joyce

8/16/2007 6:15:00 AM


Thanks Adam, Ben, and others...
Found the magic number file in
/usr/share/file/magic
(on OS X, but likely in the same place on any *nix; I'm guessing it's
one of those files that is often used by people more sophisticated
than myself who write C for a living).
There is a LOT of stuff in there!!
Wish I had looked in there before!
So I've written a minimal bit of Ruby like a lazy person, basically
copying D. Berger's Ptools style by simply adding my mini methods to
the File class.

Though I'm going to need some testing... the magic file (not always
easy to read)
says this :
# MPEG 1.0 Layer 3
0 beshort&0xfffe =0xfffa \bMP3

I'm not 100% sure, but Adam said 0xFFFB or 0xFFFA, and the magic file
lists only FFFA or does it mean FFFE and/or FFFA ?


Ben Bleything

8/16/2007 6:31:00 AM


On Thu, Aug 16, 2007, John Joyce wrote:
> Though I'm going to need some testing... the magic file (not always
> easy to read)
> says this :
> # MPEG 1.0 Layer 3
> 0 beshort&0xfffe =0xfffa \bMP3
>
> I'm not 100% sure, but Adam said 0xFFFB or 0xFFFA, and the magic file
> lists only FFFA or does it mean FFFE and/or FFFA ?

Do "man magic" (or possibly man 5 magic, or man -s 5 magic), and it
should describe the format of the file. Basically, it's offset, type,
magic, message. Numeric types can be specified with &0xnnnn, where the
number is ANDed with the magic. I'm basically just quoting from the
manpage, though, so give it a gander.

Unix is cool :)

Ben
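
The "offset, type, magic, message" layout described above can be made concrete with a tiny parser. This is a rough sketch, not a magic(5) implementation: `MagicRule`, `parse_magic_line`, and `rule_matches?` are hypothetical names, and only the single shape quoted in this thread (a `beshort` type with an optional `&mask` and an `=value` test) is handled. The real format has many more types, operators, and continuation lines.

```ruby
# Assumption: input lines look like "0 beshort&0xfffe =0xfffa \bMP3".
MagicRule = Struct.new(:offset, :type, :mask, :value, :message)

# Splits one magic line into its four fields and separates the mask from the type.
def parse_magic_line(line)
  offset, type, test, message = line.strip.split(/\s+/, 4)
  type, mask = type.split("&", 2)
  MagicRule.new(Integer(offset), type, mask ? Integer(mask) : nil,
                Integer(test.delete_prefix("=")), message.to_s)
end

# Applies a parsed rule to a string of bytes read from the file.
def rule_matches?(rule, data)
  raise ArgumentError, "unsupported type #{rule.type}" unless rule.type == "beshort"

  raw = data.b[rule.offset, 2]
  return false if raw.nil? || raw.bytesize < 2

  value = raw.unpack1("n")         # "n" = 16-bit unsigned, big-endian
  value &= rule.mask if rule.mask  # AND the file's value with the mask first
  value == rule.value              # then compare with the magic
end
```

For example, `rule_matches?(parse_magic_line('0 beshort&0xfffe =0xfffa \bMP3'), File.binread(path, 2) || "")` would apply the MP3 line to the first two bytes of a file.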

John Joyce

8/16/2007 6:53:00 AM



On Aug 16, 2007, at 1:30 AM, Ben Bleything wrote:

> On Thu, Aug 16, 2007, John Joyce wrote:
>> Though I'm going to need some testing... the magic file (not always
>> easy to read)
>> says this :
>> # MPEG 1.0 Layer 3
>> 0 beshort&0xfffe =0xfffa \bMP3
>>
>> I'm not 100% sure, but Adam said 0xFFFB or 0xFFFA, and the magic file
>> lists only FFFA or does it mean FFFE and/or FFFA ?
>
> Do "man magic" (or possibly man 5 magic, or man -s 5 magic), and it
> should describe the format of the file. Basically, it's offset, type,
> magic, message. Numeric types can be specified with &0xnnnn, where
> the
> number is ANDed with the magic. I'm basically just quoting from the
> manpage, though, so give it a gander.
>
> Unix is cool :)
>
> Ben
>
Yeah, I read that already. Seemed simple. Many file descriptions are
readable, but the MP3 one is one of many that don't make sense to me.
"Numeric types can be specified with &0xnnnn, where the
number is ANDed with the magic."
Makes no sense to me at all. I'm not a C person really.


>> # MPEG 1.0 Layer 3
>> 0 beshort&0xfffe =0xfffa \bMP3
So what does the above mean??
I see hex numbers, but what is that '=' doing?
That's cryptic.
The other lines after that make sense. They all describe the second
byte and that it determines the bitrate.
so do I care about 0xfffe? or 0xfffa?
or both?
I'm hoping I'm doing this right.

Felipe Contreras

8/16/2007 9:58:00 AM


Hi,

On 8/16/07, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:
> Does anybody know how to identify (validate) mp3 files (other audio
> files would be interesting as well) by 'magic number'?
> I never trust file extensions to be correct. It's too easy for users
> to accidentally munge file names in a GUI or even for malicious users
> to try bad things by simply changing file names.
>
> Any library or code is welcome!
> Daniel Berger said he'd even add it to Ptools or a similar library if
> it gets posted on Ruby-Talk.
>
> I did find this online as a purported mp3 magic number (in hex of
> course),
> 49 44 33
> but I'm not even going to bother using it since I don't know
> definitively that all mp3's will have it, and I don't know where to
> expect it in the file.

http://en.wikipedia.or...

This explains it all:
http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestr...

So the first byte should be 0xFF, and the second byte & 0xFE should
equal 0xFA. That is only for Layer 3.

However, if the MP3 has an ID3v2 tag then it will start with "ID3"
(ID3v1 tags sit at the end of the file instead).

Best regards.

--
Felipe Contreras

Ben Bleything

8/16/2007 3:17:00 PM


On Thu, Aug 16, 2007, John Joyce wrote:
> Yeah, I read that already. Seemed simple. Many file descriptions are
> readable, but the MP3 one is one of many that don't make sense to me.
> "Numeric types can be specified with &0xnnnn, where the
> number is ANDed with the magic."
> Makes no sense to me at all. I'm not a C person really.

That's not a C thing, that's just general math.

> >># MPEG 1.0 Layer 3
> >>0 beshort&0xfffe =0xfffa \bMP3
> So what does the above mean??
> I see hex numbers, but what is that '=' doing?

Okay, from left to right:

0: that's the offset. It means the magic starts at byte 0

beshort&0xfffe: the magic is a big-endian short (2 bytes), and you should
take the value you get from the file and AND it with 0xfffe

=0xfffa: this is what you're looking for

\bMP3: this is what file will print if it matches this magic.
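
To see why both 0xFFFA and 0xFFFB satisfy this line while 0xFFFE does not, the arithmetic can be run directly. A minimal sketch, with `MASK`, `MAGIC`, and `magic_match?` as illustrative names only: ANDing with 0xfffe clears the lowest bit, so any value that differs from 0xfffa only in that bit passes the test.

```ruby
MASK  = 0xfffe # ANDed with the short read from the file; clears the lowest bit
MAGIC = 0xfffa # what the masked value has to equal

# True when a 16-bit value passes the "beshort&0xfffe =0xfffa" test.
def magic_match?(short)
  (short & MASK) == MAGIC
end

# Print the masked value and the verdict for a few candidate first-two-byte values.
[0xfffa, 0xfffb, 0xfffe, 0xfff2].each do |candidate|
  puts format("0x%04x & 0x%04x = 0x%04x  match: %s",
              candidate, MASK, candidate & MASK, magic_match?(candidate))
end
```

So 0xfffe is never a value to look for in the file; it is only the mask applied before the comparison with 0xfffa.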

> That's cryptic.

Sure, but it's all explained in the man page.

> The other lines after that make sense. They all describe the second
> byte and that it determines the bitrate.
> so do I care about 0xfffe? or 0xfffa?
> or both?

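Both: 0xfffe is the mask you AND the file's value with, and 0xfffa is
the result you're looking for. Together, that's the magic.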

> I'm hoping I'm doing this right.

If your script is correctly identifying MP3 files you're using as a
control, then you're probably doing it just fine :)

One thing to be careful of is that there are multiple definitions of
what an MP3 looks like (at least, there are in my magic file). For
instance, MP3s with an ID3v2 tag will start with "ID3" instead of the
magic described above.

Make sure you search through your whole magic file for any given type
before you commit to writing code for it. You might find exceptions or
easier cases.

Cheers,
Ben
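
One way to cope with the "ID3" case is to skip the tag and then apply the frame test to whatever follows. The sketch below leans on assumptions that come from the ID3v2 tag layout rather than from this thread: a 10-byte header ("ID3", two version bytes, one flags byte, four size bytes), a "synchsafe" size that uses 7 bits per byte and excludes the header, and an optional 10-byte footer signalled by flag bit 0x10. The method names are hypothetical, and files with junk between the tag and the first frame will not pass.

```ruby
ID3V2_HEADER_SIZE = 10

# Decodes a synchsafe integer: 7 useful bits per byte, most significant byte first.
def synchsafe_to_i(bytes)
  bytes.inject(0) { |acc, b| (acc << 7) | (b & 0x7f) }
end

# Returns the byte offset where audio data should begin (0 when there is no ID3v2 tag).
def audio_start_offset(io)
  io.rewind
  header = io.read(ID3V2_HEADER_SIZE)
  return 0 unless header && header.bytesize == ID3V2_HEADER_SIZE && header.start_with?("ID3")

  flags  = header.getbyte(5)
  size   = synchsafe_to_i(header.bytes[6, 4])
  footer = (flags & 0x10).zero? ? 0 : 10 # assumed footer flag, see lead-in
  ID3V2_HEADER_SIZE + size + footer
end

# True when the two bytes after any ID3v2 tag pass the beshort&0xfffe =0xfffa test.
def mp3_layer3_stream?(io)
  io.seek(audio_start_offset(io))
  raw = io.read(2)
  return false if raw.nil? || raw.bytesize < 2

  (raw.unpack1("n") & 0xfffe) == 0xfffa
end
```

With a real file this would be called as `File.open(path, "rb") { |f| mp3_layer3_stream?(f) }`; opening in binary mode matters so the bytes are not transcoded.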