comp.lang.ruby

Re: Trouble with binary files?

Thomas Morgan

9/19/2003 7:22:00 PM

--- Heinz Werntges
<werntges@informatik.fh-wiesbaden.de> wrote:
> > files. Is there something else I have to do to be able
> > to work with binary data?

> Did you try IO#binmode ? Unix guys (like me) typically miss
> that when working in a Windows environment.

Nope, hadn't seen that. And that fixes it.

I don't particularly understand *why* though. Those
first 160 bytes were being read properly, what stops
it from getting the rest? Why 160? Is it going to
suddenly require something else to be done when I go
over to the 1.73mb file that the program is designed
to process? (I wouldn't *think* so, but then I didn't
expect this 160 thing either...)

Anyway, thanks for the help.

-Morgan.


16 Answers

Tim Hammerquist

9/19/2003 11:11:00 PM

<agemoagemo@yahoo.com> graced us by uttering:
> <werntges@informatik.fh-wiesbaden.de> wrote:
> > > files. Is there something else I have to do to be able to
> > > work with binary data?
>
> > Did you try IO#binmode ? Unix guys (like me) typically miss
> > that when working in a Windows environment.
>
> Nope, hadn't seen that. And that fixes it.
>
> I don't particularly understand *why* though. Those first 160
> bytes were being read properly, what stops it from getting the
> rest? Why 160? Is it going to suddenly require something else
> to be done when I go over to the 1.73mb file that the program
> is designed to process? (I wouldn't *think* so, but then I
> didn't expect this 160 thing either...)

You misunderstand the 160-byte barrier as being related to Ruby.
It's a Win32/DOS issue.

<story-mode>

Way back in PC-/MS-DOS days, it was decided that the
non-printable ASCII-26 (^Z) character would mark the end of a
textmode file. The difference between textmode and binmode of a
DOS file is important, though it need not ever have become an
issue.

The requirement for ^Z to terminate a textfile has since been
changed. However, (for backward compatibility?) when the ^Z *is*
encountered in a textmode file, DOS (and subsequently, Windows)
still set the EOF flag and stop reading.

</story-mode>

This becomes more of an issue when the default file open mode for
DOS/Win is in text mode, creating the need for a completely new
function call almost exclusively for DOS/Win platforms; in this
case, binmode(), which explicitly sets the file read mode to
binary, preventing the OS from stopping at the first ^Z (and from
changing line endings, blah, blah...).

As someone else mentioned above, this isn't an issue on Unix or
many other systems, since EOF on these OSes isn't determined by
file contents. I'm not sure if it's an issue for Macs, as they also
historically use different line endings. This also may have
changed with OS X; anyone know?

The moral of the story is:

Always call fh.binmode() before reading
any non-text file on non-Unix platforms.
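
A minimal sketch of that advice in Ruby. The helper name `read_raw` and the sample payload are made up for illustration; the payload puts a ^Z byte (0x1A) at offset 160 to mimic the symptom described earlier in the thread, which is an assumption about the original file, not something the thread confirms. Opening with `"rb"` and calling `IO#binmode` both request binary mode, so either one alone should be enough.

```ruby
require "tempfile"

# Hypothetical helper: read a whole file as raw bytes.
def read_raw(path)
  File.open(path, "rb") do |fh|  # "b" requests binary mode at open time
    fh.binmode                   # redundant after "rb"; shown because it is the call discussed here
    fh.read
  end
end

# Sample data (an assumption): 160 ordinary bytes, one ^Z, then 40 more bytes.
payload = ("A" * 160 + "\x1A" + "B" * 40).b

Tempfile.create("binmode_demo") do |tmp|
  tmp.binmode                 # write the sample without any newline translation
  tmp.write(payload)
  tmp.flush

  data = read_raw(tmp.path)
  puts data.bytesize          # 201: the read did not stop at the ^Z
  puts data.index("\x1A".b)   # 160: offset of the first ^Z byte
end
```

On Unix-like systems a plain text-mode read would return the same 201 bytes; on DOS/Windows a text-mode read may stop at the ^Z, which would match a file that seems to end after 160 bytes.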

HTH,
Tim Hammerquist
--
scanf() is evil.

Hal E. Fulton

9/19/2003 11:41:00 PM

Tim Hammerquist wrote:
> You misunderstand the 160-byte barrier as being related to Ruby.
> It's a Win32/DOS issue.
>
> <story-mode>
>
> Way back in PC-/MS-DOS days, it was decided that the
> non-printable ASCII-26 (^Z) character would mark the end of a
> textmode file. The difference between textmode and binmode of a
> DOS file is important, though it need not ever have become an
> issue.
>
> The requirement for ^Z to terminate a textfile has since been
> changed. However, (for backward compatibility?) when the ^Z *is*
> encountered in a textmode file, DOS (and subsequently, Windows)
> still set the EOF flag and stop reading.
>
> </story-mode>
>
> This becomes more of an issue when the default file open mode for
> DOS/Win is in text mode, creating the need for a completely new
> function call almost exclusively for DOS/Win platforms; in this
> case, binmode(), which explicitly sets the file read mode to
> binary, preventing the OS from stopping at the first ^Z (and from
> changing line endings, blah, blah...).
>
> As someone else mentioned above, this isn't an issue on Unix or
> many other systems, since EOF on these OSes isn't determined by
> file contents. I'm not sure if it's an issue for Macs, as they also
> historically use different line endings. This also may have
> changed with OS X; anyone know?
>
> The moral of the story is:
>
> Always call fh.binmode() before reading
> any non-text file on non-Unix platforms.

True, but let's be fair.

MSDOS stole many things from Unix, such as the notion of a
hierarchical directory structure and the use of < > | at the
shell level. (Many things were incompletely stolen, unfortunately.)

The binmode/textmode distinction came from Unix. At that time
Unix had an EOF character of control-D (which explains the ^D we
still type occasionally at the terminal).

So historically Unix's behavior with respect to ^D was the same as
DOS's with respect to ^Z. But Unix/Linux moved beyond that, and
DOS/Windows never did.

Hal


Steven Jenkins

9/19/2003 11:57:00 PM

Hal Fulton wrote:
> True, but let's be fair.
>
> MSDOS stole many things from Unix, such as the notion of a
> hierarchical directory structure and the use of < > | at the
> shell level. (Many things were incompletely stolen, unfortunately.)
>
> The binmode/textmode distinction came from Unix. At that time
> Unix had an EOF character of control-D (which explains the ^D we
> still type occasionally at the terminal).

No. Unix never distinguished between text and binary files. Unix did
(and does) interpret ASCII EOT (ctrl-d) as an end-of-input indicator for
terminal devices, but it never used any in-band character to mark the
end of a file. The EOT never got past the terminal driver, and was never
delivered to an application.

Steve


Hal E. Fulton

9/20/2003 2:10:00 AM

Steven Jenkins wrote:
> Hal Fulton wrote:
>
>> True, but let's be fair.
>>
>> MSDOS stole many things from Unix, such as the notion of a
>> hierarchical directory structure and the use of < > | at the
>> shell level. (Many things were incompletely stolen, unfortunately.)
>>
>> The binmode/textmode distinction came from Unix. At that time
>> Unix had an EOF character of control-D (which explains the ^D we
>> still type occasionally at the terminal).
>
>
> No. Unix never distinguished between text and binary files. Unix did
> (and does) interpret ASCII EOT (ctrl-d) as an end-of-input indicator for
> terminal devices, but it never used any in-band character to mark the
> end of a file. The EOT never got past the terminal driver, and was never
> delivered to an application.

If Unix never distinguished between text and binary files, what
was the binary mode flag for?

Hal



Shashank Date

9/20/2003 2:26:00 AM


"Hal Fulton" <hal9000@hypermetrics.com> wrote in message
> If Unix never distinguished between text and binary files, what
> was the binary mode flag for?

For CR-LF, maybe ... just a wild guess.


Daniel Kelley

9/20/2003 2:39:00 AM

>>>>> "Hal" == Hal Fulton <hal9000@hypermetrics.com> writes:

Hal> Steven Jenkins wrote:
>> Hal Fulton wrote:
>>> True, but let's be fair.
>>>
>>> MSDOS stole many things from Unix, such as the notion of a
>>> hierarchical directory structure and the use of < > | at the
>>> shell level. (Many things were incompletely stolen,
>>> unfortunately.)
>>>
>>> The binmode/textmode distinction came from Unix. At that time
>>> Unix had an EOF character of control-D (which explains the ^D
>>> we still type occasionally at the terminal).
>>
>>
>> No. Unix never distinguished between text and binary
>> files. Unix did (and does) interpret ASCII EOT (ctrl-d) as an
>> end-of-input indicator for terminal devices, but it never used
>> any in-band character to mark the end of a file. The EOT never
>> got past the terminal driver, and was never delivered to an
>> application.

Hal> If Unix never distinguished between text and binary files,
Hal> what was the binary mode flag for?

I recall that the whole ^Z terminator came from CP/M, where file
sizes were always multiples of 128 bytes (saving 7 bits in a size
field being important at the time), so a text file needed a special
character to mark the end of the text. MSDOS carried that "tradition"
on, to ease porting of CP/M applications to DOS, and, well, saving bits
was important at that time, or at least it *seemed* to be important!
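
A rough Ruby sketch of that record-padding convention, to show why a text reader needs the ^Z at all. The names `pad_to_records` and `text_from_records` are hypothetical, and the rule that a file ending exactly on a record boundary gets no ^Z is a simplifying assumption, not a claim about every CP/M program.

```ruby
RECORD_SIZE = 128          # record size mentioned above
CTRL_Z = "\x1A".b          # ASCII 26, the end-of-text marker

# Pad text with ^Z bytes up to the next multiple of RECORD_SIZE.
def pad_to_records(text)
  bytes = text.b
  remainder = bytes.bytesize % RECORD_SIZE
  return bytes if remainder.zero?   # assumption: no marker when the text fills the record exactly
  bytes + CTRL_Z * (RECORD_SIZE - remainder)
end

# Recover the text: everything before the first ^Z, or all of it if there is none.
def text_from_records(stored)
  bytes = stored.b
  cut = bytes.index(CTRL_Z)
  cut ? bytes.byteslice(0, cut) : bytes
end

stored = pad_to_records("hello\r\n")
puts stored.bytesize                     # 128
puts text_from_records(stored).inspect   # "hello\r\n"
```

The same cut-at-first-^Z rule is what truncates a binary file read in text mode: any 0x1A byte in the data looks like the end of the text.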

d.k.


--
Daniel Kelley - San Jose, CA
For email, replace the first dot in the domain with an at.

YANAGAWA Kazuhisa

9/20/2003 3:25:00 AM

In Message-Id: <3F6BB6FB.3020204@hypermetrics.com>
Hal Fulton <hal9000@hypermetrics.com> writes:

> If Unix never distinguished between text and binary files, what
> was the binary mode flag for?

For ANSI-C compliance. From fopen(3) of FreeBSD 4.8-RELEASE:

The mode string can also include the letter ``b'' either as a third
character or as a character between the characters in any of the
two-character strings described above. This is strictly for compatibility
with ISO/IEC 9899:1990 (``ISO C89'') and has no effect; the ``b'' is ignored.

I believe most Unix-like platforms take a similar position.
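
One way to check this from Ruby is to read the same file with and without the ``b'' in the mode string and compare the bytes. This is only a sketch: the helper name `bytes_with_mode` and the sample payload are invented for the example, and what the text-mode read returns on Windows depends on the platform and Ruby version.

```ruby
require "tempfile"

# Hypothetical helper: read a file with the given mode string, return raw bytes.
def bytes_with_mode(path, mode)
  File.open(path, mode) { |fh| fh.read.b }
end

# Sample data (an assumption): CR-LF line endings plus an embedded ^Z.
payload = "line one\r\nline two\r\n\x1Atrailing".b

Tempfile.create("mode_demo") do |tmp|
  tmp.binmode
  tmp.write(payload)
  tmp.flush

  binary = bytes_with_mode(tmp.path, "rb")
  text   = bytes_with_mode(tmp.path, "r")

  puts binary == payload   # binary mode returns the bytes as written
  puts text == binary      # expected true where "b" is a no-op; may be false on Windows
end
```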


--
kjana@dm4lab.to September 20, 2003
A man is known by the company he keeps.


Jim Weirich

9/20/2003 3:29:00 AM

On Fri, 2003-09-19 at 22:10, Hal Fulton wrote:

> If Unix never distinguished between text and binary files, what
> was the binary mode flag for?

Unix originally didn't have one. Only has it now for compatibility.

--
-- Jim Weirich jweirich@one.net http://onest...
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)

Hal E. Fulton

9/20/2003 5:00:00 AM

Jim Weirich wrote:
> On Fri, 2003-09-19 at 22:10, Hal Fulton wrote:
>
>
>>If Unix never distinguished between text and binary files, what
>>was the binary mode flag for?
>
>
> Unix originally didn't have one. Only has it now for compatibility.
>

I'll have to assume you're correct, as I can't prove my position.

But I definitely remember being led to believe that EOT was an
end-of-file marker. And I remember wondering how it worked for
binary files, did it store the length in the inode or what?

This was System III, around 1980 (out of date even then).

I'll have to dig into the old kernel to see how it actually
worked. I only have it in hardcopy, though.

Hal



Steven Jenkins

9/20/2003 5:44:00 AM

Hal Fulton wrote:
> If Unix never distinguished between text and binary files, what
> was the binary mode flag for?

The 'b' modifier was added to ANSI C to support non-Unix execution
environments that distinguish between text and binary files. It didn't
exist in Unix until ANSI C required it; since then, it's been a no-op.

http://www.lysator.liu.se/c/rat/d9....

Steve