Chris Parker
1/15/2006 2:08:00 AM
Here is a very simple example of the problem:
irb(main):001:0> i = 0
=> 0
irb(main):002:0> File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte| i+=1};print i}
794=> nil
irb(main):003:0> File.size("tcpdump/al2ak_contents.dat")
=> 3329
irb(main):004:0> File.open("tcpdump/al2ak_contents.dat"){|file|file.each_byte{|byte| };file.eof?}
=> true
i should be equal to the size of the file, but each_byte only yielded
794 bytes. The difference is definitely real: the file actually has 3329
bytes in it. That is what the OS reports, and something close to that is
what opening the file shows.
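A minimal sketch for narrowing this down, assuming the open mode might play a part (nothing above confirms that). It counts the bytes each_byte yields in the default mode and in explicit binary mode ("rb") and prints both next to File.size. The helper name count_bytes and the sample file are hypothetical stand-ins. The sample bytes are arbitrary; 0x1A and CR/LF are included only because text-mode I/O layers on some platforms treat them specially.

```ruby
require "tempfile"

# Hypothetical helper (not from the original post): count the bytes that
# each_byte yields when the file is opened with the given mode string.
def count_bytes(path, mode)
  count = 0
  File.open(path, mode) { |file| file.each_byte { |_byte| count += 1 } }
  count
end

# Stand-in data: a few bytes including 0x1A and a CR/LF pair. Substitute
# the path of the real capture file to run the same comparison on it.
sample = Tempfile.new("each_byte_sample")
sample.binmode
sample.write("abc\x1Adef\r\nghi".b)
sample.close

puts "File.size:          #{File.size(sample.path)}"
puts "each_byte, mode r:  #{count_bytes(sample.path, "r")}"
puts "each_byte, mode rb: #{count_bytes(sample.path, "rb")}"
```

If the two each_byte counts differ on the real file, the open mode is involved; if they match and are still short of File.size, the cause lies elsewhere.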
irb(main):019:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte|};file.pos}
=> 3329
This implies that some bytes are being ignored or that each_byte just
sets pos to file size at the end.
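One way to tell those two possibilities apart is to watch pos while each_byte runs. The sketch below is an illustration only: position_jumps is a hypothetical helper, and the sample file is a stand-in for the real capture. It records every point where pos advances by something other than one byte, including a final jump after the last yielded byte.

```ruby
require "tempfile"

# Hypothetical helper: walk the file with each_byte and return every place
# where file.pos advanced by something other than exactly one byte.
def position_jumps(path, mode = "rb")
  jumps = []
  File.open(path, mode) do |file|
    previous = 0
    file.each_byte do |_byte|
      current = file.pos
      jumps << [previous, current] if current != previous + 1
      previous = current
    end
    # A jump recorded here means pos moved after the last yielded byte.
    jumps << [previous, file.pos] if file.pos != previous
  end
  jumps
end

# Stand-in data: every byte value once. Pass the real file path, and the
# mode used in the session above, to check the actual capture.
sample = Tempfile.new("pos_sample")
sample.binmode
sample.write((0..255).to_a.pack("C*"))
sample.close

p position_jumps(sample.path)
```

An empty result means every byte was visited in order. A single entry at the end would suggest pos was moved to the file size after iteration stopped early.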
irb(main):028:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;}
=> "\r\016\f\021\017\022\023\024\023\022\017\030\030"
irb(main):029:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;file.eof}
=> true
irb(main):030:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read;file.eof;file.pos}
=> 1306
irb(main):031:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i);file.read.length}
=> 13
So we are at eof after reading i bytes, as shown above, but if I seek to
i I can read another 13 bytes before hitting eof again. But look at the
huge change in pos, which didn't go to 3329 this time. Let's try going to
i+13:
irb(main):021:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;}
=> ""
irb(main):022:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;file.eof;}
=> true
irb(main):023:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+13);file.read;file.eof;file.pos}
=> 1319
That sort of makes sense. If 1306 were actually the end of the file, then
adding 13 to it would produce exactly these results. Adding 14 to i
gives the same results as adding 13. Adding 15 to i has a different outcome:
irb(main):043:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read}
=> "\030\030#\"\"\"#''''''''''\001\t\010\010\t\n\t\v\t\t\v\016\v\r\v\016\021\016\016\016\016\021\023\r\r\016\r\r\023\030\021\017\017\017\017\021\030\026\027\024\024\024\027\026"
irb(main):044:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read;file.eof}
=> true
irb(main):045:0> File.open("tcpdump/al2ak_contents.dat"){|file|i=0;file.each_byte{|byte| i+=1};file.seek(i+15);file.read;file.eof;file.pos}
=> 1321
So adding 15 to i returns a new string of bytes and gets an eof again
and doesn't move the pos past 1306 + 15.
I am at a loss to explain what is going on here. Why can't each_byte
get to the true eof? And why can I seek to the true eof (I didn't show
it, but it worked) if each_byte can't make it there?
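A further diagnostic sketch, offered as a suggestion rather than an explanation: dump the raw byte values around each offset where reading stopped (794, 794+13 and so on in the session above) to see whether the same byte value sits at every stopping point. bytes_around is a hypothetical helper, and the sample file is a stand-in. File.binread is used so the bytes are read without any translation.

```ruby
require "tempfile"

# Hypothetical helper: return [offset, hex value] pairs for the bytes in a
# window around the given offset, read in binary mode.
def bytes_around(path, offset, radius = 4)
  start = [offset - radius, 0].max
  length = offset - start + radius + 1
  # binread returns nil when start is at or past the end of the file.
  data = File.binread(path, length, start).to_s
  data.bytes.each_with_index.map do |byte, index|
    [start + index, format("0x%02X", byte)]
  end
end

# Stand-in data; with the real capture, pass its path and an offset such
# as the count that each_byte reached.
sample = Tempfile.new("window_sample")
sample.binmode
sample.write("0123456789".b)
sample.close

bytes_around(sample.path, 5, 2).each do |position, hex|
  puts "#{position}: #{hex}"
end
```

Running it once per stopping offset on the real file and comparing the value printed at each offset would show whether one particular byte ends the read every time.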
Any help is appreciated. I'll try to answer any questions as well.
Sincerely,
Chris Parker