
comp.lang.ruby

Why does IO.readlines() keep newlines?

Randy R

11/19/2007 7:15:00 PM

At the very least, the win32 implementation of Ruby's IO.readlines()
method keeps the newline character on each string in the array. Considering
that it is the newline that defines a "line," it would not be wholly
unreasonable to omit it from the returned array. I would have imagined
that it was implemented using String.split(), which omits the splitting
character. On a purely practical note, I'm sure the former is more popular
than the latter in the following:


out = File.open('file.txt', 'r'){|file| file.readlines.collect{|line| line.chomp}}
out = File.open('file.txt', 'r'){|file|
}


...in that rarely do people actually want newlines in their strings.
Interestingly enough, I discovered this behaviour from a bug in a
program which was hidden by another peculiar function, puts(). Can you
imagine my surprise that puts() not only appends a newline to a string
printed to stdout but, if a newline already exists, it doesn't bother
appending one! So, printing strings with puts() can hide whether strings
have a newline or not. Weird...
So, who thinks my suggested change is a good idea? How do I go about
popularizing my opinion?
Thank you...
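
A minimal sketch of the puts() behaviour described above. It assumes that
StringIO#puts follows the same newline rule as IO#puts, and uses a StringIO
only so that the written bytes can be inspected afterwards:

```ruby
require 'stringio'

# Capture what puts writes so the exact characters can be inspected.
out = StringIO.new
out.puts "no newline"      # puts appends "\n"
out.puts "has newline\n"   # already ends in "\n", so nothing is appended

written = out.string
p written                  # p shows the newlines explicitly as \n
```

Printing with p (or String#inspect) instead of puts is one way to see
whether a string really carries a trailing newline.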



Daniel Brumbaugh Keeney

11/19/2007 11:43:00 PM


I'm going to speculate that readlines does this because of operating
system differences in line endings.
For compatibility between most systems, it would have to remove line
feeds (\x0A) or carriage-return/line-feed combinations (\x0D\x0A).

I personally rather prefer the current behavior of readline. I don't
think the puts behavior matters, and it is certainly not worth changing.
I'm aware of their behavior and, if it matters, I code accordingly.

humbly,
Daniel Brumbaugh Keeney

Phrogz

11/20/2007 12:06:00 AM

On Nov 19, 12:14 pm, "Just Another Victim of the Ambient Morality"
<ihates...@hotmail.com> wrote:
....
> character. On a simply practical note, I'm sure the former is more popular
> than the latter in the following:
>
> out = File.open('file.txt', 'r'){|file| file.readlines.collect{|line|
> line.chomp}}
> out = File.open('file.txt', 'r'){|file|
> ...in that rarely do people actually want newlines in their strings.

FWIW, I never use readlines for this exact reason. I find its
preservation of line endings entirely annoying. I always
IO.read().split when I can.
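
A rough comparison of the idioms mentioned so far. The file name and its
contents are made up for the demonstration, and the temporary directory is
only there to keep the sketch self-contained:

```ruby
require 'tmpdir'

kept = chomped = split = nil

Dir.mktmpdir do |dir|
  # Hypothetical sample file, created only for this demonstration.
  path = File.join(dir, 'file.txt')
  File.write(path, "alpha\nbeta\n")

  kept    = File.readlines(path)               # line endings preserved
  chomped = File.readlines(path).map(&:chomp)  # line endings stripped
  split   = File.read(path).split("\n")        # the IO.read().split idiom
end

p kept
p chomped
p split
```

The two stripped forms are not always interchangeable: String#split drops
trailing empty fields by default, so blank lines at the end of a file
disappear with split("\n") but survive as empty strings with
readlines plus chomp. Ruby 2.4 and later also accept a chomp: true
keyword on readlines, which did not exist when this thread was written.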

As much as I'd personally like it changed, and know that such a change
would not affect any of my scripts, I'm concerned that such a change
must fall into the category of "not backwards compatible", and thus
unlikely to be effected without very strong support.


> How do I go about popularizing my opinion?

Discuss the issue here as you are doing. If you don't get a large
vocal outcry against the proposal, or are not swayed by any arguments
that come against it, file an RCR[1] (preferably with a source code
patch attached) and hope that Matz accepts your change into the core.

[1] http://rcr...

Xavier Noria

11/20/2007 8:53:00 AM

On Nov 20, 2007, at 12:43 AM, Daniel Brumbaugh Keeney wrote:

> I'm going to speculate that readlines does this because of operating
> system differences in line endings.
> For compatibility between most systems, it would have to remove line
> feeds (\x0A) or carriage-return/line-feed combinations (\x0D\x0A).

Indeed that's not the case.

On CRLF platforms the I/O layer handles newlines in text mode so that
the programmer *always* works with "\n"; no CRLF ever reaches Ruby code
on Windows. Nor do you need to print CRLFs by hand at the Ruby level. At
the Ruby level a newline is always == "\n" and always has length 1.

The string "\n" is the logical newline in Ruby meaning it is portable
and the I/O layer takes care of its actual representation on disk
according to the runtime platform. In Java for example this works in a
different way, "\n" is not portable, to write a portable newline in
Java you invoke some println().
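
A small sketch of that translation, assuming a scratch file in a temporary
directory (the file name is hypothetical). It writes "\n" in text mode,
then reads the file back once in text mode and once in binary mode:

```ruby
require 'tmpdir'

text = raw = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'newline_demo.txt')    # hypothetical scratch file

  File.open(path, 'w')  { |io| io.write("a\n") }   # text mode: logical newline
  text = File.open(path, 'r')  { |io| io.read }     # text mode: what Ruby code sees
  raw  = File.open(path, 'rb') { |io| io.read }     # binary mode: bytes on disk
end

p text        # the logical newline, on every platform
p raw.bytes   # [97, 10] on LF platforms; [97, 13, 10] is expected on Windows
```

Only the binary read is expected to differ between platforms; the text-mode
string is the same "a\n" everywhere.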

This article explains how newlines work in C-based languages. It is
Perl-based but in general it applies to Ruby except that in Ruby
there's no platform where "\n" == "\015". In Ruby "\n" == "\012"
everywhere and that simplifies things a bit. The I/O layer in MRI is
C's stdio instead of PerlIO, but the explained newline mangling in and
out is analogous:

http://www.onlamp.com/pub/a/onlamp/2006/08/17/understanding-new...

I am the author but that doesn't matter.

-- fxn






Robert Dober

11/20/2007 11:56:00 AM

On Nov 20, 2007 12:43 AM, Daniel Brumbaugh Keeney
<devi.webmaster@gmail.com> wrote:
<snip>
> I personally rather prefer the current behavior of readline.
But then you could do
readlines/(\n\r?)/,
as default behavior I find it most annoying too.

Robert
--
what do I think about Ruby?
http://ruby-smalltalk.blo...

Daniel Brumbaugh Keeney

11/20/2007 9:18:00 PM

On Nov 20, 2007 2:53 AM, Xavier Noria <fxn@hashref.com> wrote:
> On Nov 20, 2007, at 12:43 AM, Daniel Brumbaugh Keeney wrote:
>
> > I'm going to speculate that readlines does this because of operating
> > system differences in line endings.
> > For compatibility between most systems, it would have to remove line
> > feeds (\x0A) or carriage-return/line-feed combinations (\x0D\x0A).
>
> Indeed that's not the case.
>
> On CRLF platforms the I/O layer handles newlines in text mode so that
> the programmer *always* works with "\n"; no CRLF ever reaches Ruby code
> on Windows. Nor do you need to print CRLFs by hand at the Ruby level. At
> the Ruby level a newline is always == "\n" and always has length 1.
> -- fxn

Unfortunately, files created on one platform inevitably make their way
to another. When an IO with \r\n is read on a UNIX, it preserves the
carriage return.
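
To illustrate, here is a sketch that uses a StringIO as a stand-in for a
CRLF file opened on a UNIX system, on the assumption that no newline
translation happens in either case:

```ruby
require 'stringio'

crlf_data = "one\r\ntwo\r\n"      # contents of a file written on Windows

lines   = StringIO.new(crlf_data).readlines
split   = crlf_data.split("\n")
chomped = lines.map(&:chomp)

p lines     # ["one\r\n", "two\r\n"]  carriage returns preserved
p split     # ["one\r", "two\r"]      split("\n") leaves a stray "\r"
p chomped   # ["one", "two"]          chomp removes "\r\n" as a unit
```

With the default separator, String#chomp strips a trailing "\n", "\r\n",
or "\r", which makes readlines plus chomp the more forgiving choice for
files that travel between platforms.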

Daniel Brumbaugh Keeney

Xavier Noria

11/20/2007 9:45:00 PM

On Nov 20, 2007, at 10:17 PM, Daniel Brumbaugh Keeney wrote:

> Unfortunately, files created on one platform inevitably make their way
> to another. When an IO with \r\n is read on a UNIX, it preserves the
> carriage return.

Yes, that's covered in the article I mentioned as well:

http://www.onlamp.com/pub/a/onlamp/2006/08/17/understanding-newlines.h...

-- fxn

