
comp.lang.ruby

Newbie: working with binary files/extract png from a binary file

Jim

1/30/2008 11:13:00 AM

Hello,

I am reverse engineering a binary file that has an embedded PNG image
in it. I opened the file in a hex editor and found the PNG header
information, but how can I use Ruby to extract the PNG to its own file?

5 Answers

Joel VanderWerf

1/30/2008 6:38:00 PM


Jim wrote:
> Hello,
>
> I am reverse engineering a binary file that has an embedded PNG image
> in it. I opened the file in a hex editor and found the PNG header
> information, but how can I use Ruby to extract the PNG to its own file?

Do you know the absolute starting position of the embedded PNG?

You can get the data as a string like this:

  start_pos = 5   # or whatever
  end_pos = -1    # assume end of file
  png_data = File.open('file_with_png', "rb") { |f|
    f.read[start_pos..end_pos]
  }

Note that "rb" means open for read ("r") and treat as binary data ("b"),
which avoids munging "\r\n" on Windows.

If the end_pos has to be determined by reading a field, then this will
take a little tinkering. I assume PNG has a length field somewhere? If so,
you'll have to extract it (using String#unpack, perhaps) and keep only
that many bytes from the start of the png_data string.
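
For readers who want to try the slicing approach end to end, here is a minimal sketch that also writes the slice back out. It assumes Ruby 1.9.3 or later (for File.binread and File.binwrite); the helper name copy_byte_range and the file names are made up for illustration, and the demo builds its own throwaway input file rather than using a real container.

```ruby
require "tmpdir"

# Hypothetical helper: copy bytes start_pos..end_pos of src into dst.
def copy_byte_range(src, dst, start_pos, end_pos = -1)
  data = File.binread(src)                # whole file as a binary string
  slice = data[start_pos..end_pos] or
    raise ArgumentError, "start_pos is past the end of the file"
  File.binwrite(dst, slice)               # returns the number of bytes written
end

# Demo on a throwaway file: 5 bytes of junk followed by a payload.
Dir.mktmpdir do |dir|
  src = File.join(dir, "container.bin")
  dst = File.join(dir, "payload.bin")
  File.binwrite(src, "JUNK!" + "payload bytes")
  copy_byte_range(src, dst, 5)
  puts File.binread(dst)                  # prints "payload bytes"
end
```

Reading the whole file into memory is fine for small containers; for very large files, seeking to start_pos and reading only the needed bytes would be the gentler option.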

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Adam Shelly

1/30/2008 7:39:00 PM


On 1/30/08, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:
> Jim wrote:
> > Hello,
> >
> > I am reverse engineering a binary file that has an embedded PNG image
> > in it. I opened the file in a hex editor and found the PNG header
> > information, but how can I use Ruby to extract the PNG to its own file?
>
> Do you know the absolute starting position of the embedded PNG?

You can find it:

  fp = File.new("has_embedded_png.dat", "rb")
  m = /\211PNG/.match(fp.read)
  raise "no PNG" unless m
  fp.seek(m.begin(0))
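
A note for readers on Ruby 1.9 or later: strings and regexps there carry encodings, and a byte like \211 (0x89) in a literal can trip over that. One way to sidestep the issue is to search for the signature with String#index on binary strings. The sketch below assumes Ruby 2.0+ (for String#b); the constant and helper names are made up for illustration.

```ruby
# The full 8-byte PNG signature, as a binary (ASCII-8BIT) string.
PNG_SIGNATURE = "\x89PNG\r\n\x1a\n".b

# Hypothetical helper: byte offset of the first PNG signature, or nil.
def find_png_offset(data)
  data.b.index(PNG_SIGNATURE)
end

blob = "some header".b + PNG_SIGNATURE + "rest of image".b
p find_png_offset(blob)              # => 11
p find_png_offset("no image here")   # => nil
```

With a real file, data would come from File.binread, and the returned offset can be passed straight to seek.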

>
> If the end_pos has to be determined by reading a field, then this will
> take a little tinkering. I assume PNG has a length field somewhere? So
> you'll have to extract that (using String#unpack, perhaps) and chop off
> that many bytes from the start of the png_data string.
>
There's no total length field, just a series of 'chunks' each with
its own length.
Luckily, the 'IEND' chunk is always last, so you can just extract
chunks until you get to that one:


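  def extract_chunk(input, output)
    lenword = input.read(4)              # chunk length: 4 bytes, big-endian
    return nil if lenword.nil? || lenword.length < 4   # ran out of file
    length = lenword.unpack('N')[0]
    type = input.read(4)                 # chunk type, e.g. IHDR, IDAT, IEND
    data = length > 0 ? input.read(length) : ""
    crc = input.read(4)                  # CRC of type + data
    # 'N' is unsigned, so length is never negative; sanity-check the type
    # instead. Chunk types are four ASCII letters.
    return nil if type.nil? || type !~ /\A[A-Za-z]{4}\z/
    #return nil if validate_crc(type+data, crc)
    output.write lenword
    output.write type
    output.write data
    output.write crc
    return type
  end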

  def extract_png(input, output)
    hdr = input.read(8)
    raise "Not a PNG File" if hdr[0,4] != "\211PNG"
    raise "file not in binary mode" if hdr[4,4] != "\r\n\032\n"
    output.write(hdr)
    loop do
      chunk_type = extract_chunk(input, output)
      p chunk_type
      break if chunk_type.nil? || chunk_type == 'IEND'
    end
  end

  ofp = File.new("out.png", "wb")
  extract_png(fp, ofp)
  ofp.close   # flush the extracted image to disk
  fp.close
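
The commented-out validate_crc call above is left undefined. As a rough sketch of what a CRC-checking variant could look like on Ruby 2.4 or later, the code below uses Zlib.crc32 from the standard library, which computes the same CRC-32 that PNG stores over each chunk's type and data. Everything here (read_png, make_chunk, PNG_MAGIC) is a made-up name for illustration, not the code from this thread, and the demo stream is only structurally valid: it has an IHDR and an IEND chunk but no image data.

```ruby
require "stringio"
require "zlib"

PNG_MAGIC = "\x89PNG\r\n\x1a\n".b   # 8-byte signature as a binary string

# Hypothetical helper: read one PNG from io (positioned at the signature)
# and return its bytes, raising if the stream is truncated or corrupt.
def read_png(io)
  out = io.read(8).to_s.b
  raise "not a PNG signature" unless out == PNG_MAGIC
  loop do
    head = io.read(8).to_s                   # length (4 bytes) + type (4 bytes)
    raise "truncated chunk header" if head.bytesize < 8
    length, type = head.unpack("Na4")
    body = io.read(length + 4).to_s          # chunk data + 4-byte CRC
    raise "truncated #{type} chunk" if body.bytesize < length + 4
    data = body.byteslice(0, length)
    crc = body.byteslice(length, 4).unpack1("N")
    raise "bad CRC in #{type}" unless Zlib.crc32(type + data) == crc
    out << head << body
    return out if type == "IEND"             # IEND is always the last chunk
  end
end

# Build a chunk the way the spec lays it out: length, type, data, CRC.
def make_chunk(type, data = "".b)
  [data.bytesize].pack("N") + type.b + data +
    [Zlib.crc32(type.b + data)].pack("N")
end

tiny = PNG_MAGIC + make_chunk("IHDR", "\0".b * 13) + make_chunk("IEND")
blob = "prefix".b + tiny + "trailing junk".b
io = StringIO.new(blob)
io.seek(blob.index(PNG_MAGIC))
p read_png(io) == tiny   # => true
```

Because read_png only needs read, the same method works on a File opened with "rb" after seeking to the signature offset.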


-Adam

Jim

1/30/2008 10:47:00 PM


Thank you both for your information and effort.

Adam, your code worked out of the box. Tell me you didn't write that
off the top of your head?

fp = File.new("has_embedded_png.dat","rb")
# fp is a file pointer

m= /\211PNG/.match(fp.read)
# (fp.read) is a String of binary data. Using a regular expression to
# locate the PNG header

raise "no PNG" if !m
# exception if there is not a header match

fp.seek(m.begin(0))
# I understand seek, but I'm not sure what m.begin(0) is actually doing.

Docs say: Returns the offset of the start of the nth element of the
match array in the string.

I still don't get it.

Adam Shelly

1/31/2008 12:30:00 AM


On 1/30/08, Jim <jim.foltz@gmail.com> wrote:
> Thank you both for your information and effort.
>
> Adam, your code worked out of the box. Tell me you didn't write that
> off the top of your head?

I patched it together based on Wikipedia's PNG entry and parts of a
WAV file reader I wrote earlier. Both are chunk-based file formats.
It took a few failed tests before I got it right.
>
> m= /\211PNG/.match(fp.read)
> # (fp.read) is a String of binary data. Using a regular ex. to locate
> the PNG header

> fp.seek(m.begin(0))
> # I understand seek, Not sure what m.begin(0) is actually doing?
>
> Docs say: Returns the offset of the start of the nth element of the
> match array in the string.
>
> I still don't get it.
>
from the same docs:
"MatchData acts as an array. ... mtch[0] is equivalent to ... the
entire matched string. "

so m.begin(0) returns the offset of the start of the matched string
(in this case, the start of the png header). Since the string
contains the contents of the whole file, the string offset is the same
as the file offset we need.
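
A tiny illustration of MatchData#begin on an ordinary string may help; the sample text is made up. Index 0 refers to the whole match and index 1 to the first capture group. For data read in binary mode each character is one byte, so these offsets can be used directly as file positions.

```ruby
text = "xxxxHEADERyyyy"

m = /HEADER/.match(text)
p m[0]          # => "HEADER"
p m.begin(0)    # => 4
p m.end(0)      # => 10

# With a capture group, begin(1) is where that group starts.
m2 = /HEAD(ER)/.match(text)
p m2.begin(1)   # => 8
```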
-Adam

Jim

2/1/2008 11:21:00 AM


On Jan 30, 7:29 pm, Adam Shelly <adam.she...@gmail.com> wrote:
>
>
> from the same docs:
> "MatchData acts as an array. ... mtch[0] is equivalent to ... the
> entire matched string. "
>
> so m.begin(0) returns the offset of the start of the matched string
> (in this case, the start of the png header). Since the string
> contains the contents of the whole file, the string offset is the same
> as the file offset we need.
> -Adam

OK, it's really not that hard, is it?

input.read(N) reads N bytes from the stream. From the spec we know a PNG
has an 8-byte header (which the spec calls a signature). Check it. If
it's a PNG, read the chunks until IEND.

The only other thing is unpack("N"). From the docs, it reads a string
of 4 bytes as a big-endian unsigned integer and returns it in an array,
so [0] picks out the Fixnum.
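
A quick way to see unpack("N") in action, using made-up sample bytes (String#b assumes Ruby 2.0 or later); pack("N") is the inverse:

```ruby
lenword = "\x00\x00\x00\x0d".b           # length field of a 13-byte chunk
p lenword.unpack("N")                    # => [13]
p lenword.unpack("N")[0]                 # => 13
p [13].pack("N") == lenword              # => true
p "\x00\x00\x01\x00".b.unpack("N")[0]    # => 256
```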

Thanks again for the code. The file I'm extracting is a SketchUp 3d
model, btw.