comp.lang.ruby

Re: [rcr] String#split behaves odd

Peña, Botp

12/7/2004 5:16:00 AM

Yukihiro Matsumoto [mailto:matz@ruby-lang.org] wrote:
// on Tue, 7 Dec 2004 04:20:37 +0900, Simon Strandgaard
//|Maybe the return value of String#split is wrong.
//|If I invoke split on an empty string, then it
//|results in an empty Array (which I think is odd).
//
//Feeling odd is subjective. Could you tell me why you felt
//String#split is "wrong"?

imho, I think he meant

[] != [""]

I myself thought that String#split would always return an array with at
least one string, i.e. a minimum of [""]

//
// matz.

kind regards -botp


10 Answers

Yukihiro Matsumoto

12/7/2004 5:31:00 AM

Hi,

In message "Re: [rcr] String#split behaves odd"
on Tue, 7 Dec 2004 14:16:06 +0900, "Peña, Botp" <botp@delmonte-phil.com> writes:

|imho, I think he meant
|
|[] != [""]
|
|I myself thought that string#split would return an array of strings w a
|minimum element of [""]

I don't get it. [] is an array of strings with zero elements. ;-)

matz.



Simon Strandgaard

12/7/2004 7:21:00 AM

On Tue, 7 Dec 2004 14:31:27 +0900, Yukihiro Matsumoto
<matz@ruby-lang.org> wrote:
> In message "Re: [rcr] String#split behaves odd"
> on Tue, 7 Dec 2004 14:16:06 +0900, "Peña, Botp" <botp@delmonte-phil.com> writes:
>
> |imho, I think he meant
> |
> |[] != [""]
> |
> |I myself thought that string#split would return an array of strings w a
> |minimum element of [""]
>
> I don't get it. [] is an array of strings with zero elements. ;-)


In the past I had the impression that, as long as there are no newlines
in the string, split would always return an array with one string.

"a".split(/\n/) #=> ["a"]
"a\nb".split(/\n/) #=> ["a", "b"]

However, yesterday the string I was about to split happened to be empty,
and I had to add a special case (that only deals with the empty string).

I think many people wouldn't have to write special cases for the empty string
if split always returned an array with at least one String element.
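
A minimal sketch of the kind of special case described above, assuming a wrapper is acceptable. The name `split_at_least_one` is hypothetical and not part of Ruby's String API:

```ruby
# Hypothetical helper: wraps String#split so the result always
# contains at least one string.
def split_at_least_one(str, pattern)
  parts = str.split(pattern)
  parts.empty? ? [""] : parts
end

split_at_least_one("a\nb", /\n/)  # => ["a", "b"]
split_at_least_one("a", /\n/)     # => ["a"]
split_at_least_one("", /\n/)      # => [""]
```

Note that this also turns other empty results into [""], for example a string that consists only of delimiters.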


Maybe the title of this RCR should have been: change split result to
reduce special cases.


--
Simon Strandgaard



Zach Dennis

12/7/2004 7:45:00 AM

Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: [rcr] String#split behaves odd"
> on Tue, 7 Dec 2004 14:16:06 +0900, "Peña, Botp" <botp@delmonte-phil.com> writes:
>
> |imho, I think he meant
> |
> |[] != [""]
> |
> |I myself thought that string#split would return an array of strings w a
> |minimum element of [""]
>
> I don't get it. [] is an array of strings with zero elements. ;-)

There is misleading behavior though with the current implementation. For
example:

Example1: "aaaab".split( /a/ ) => [ "", "", "", "", "b" ]
Example2: "a".split( /a/ ) => []
Example3: "aaaa".split( /a/ ) => []

You would think all three cases would behave the same way, but the last
two examples behave very differently than the first. Should the behavior
not be consistent?
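
For comparison, here is a sketch of the same three examples with a negative limit passed as the second argument to String#split, which keeps trailing empty strings:

```ruby
# A negative limit tells split not to drop trailing empty strings.
"aaaab".split(/a/, -1)  # => ["", "", "", "", "b"]
"a".split(/a/, -1)      # => ["", ""]
"aaaa".split(/a/, -1)   # => ["", "", "", "", ""]
```

With the limit in place, each delimiter consistently produces one more element.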

Zach


Jim Weirich

12/7/2004 12:40:00 PM

On Tuesday 07 December 2004 12:31 am, Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: [rcr] String#split behaves odd"
> on Tue, 7 Dec 2004 14:16:06 +0900, "Peña, Botp" <botp@delmonte-phil.com> writes:
>
> |imho, I think he meant
> |
> |[] != [""]
> |
> |I myself thought that string#split would return an array of strings w a
> |minimum element of [""]
>
> I don't get it. [] is an array of strings with zero elements. ;-)

This is a tough call, IMHO. It all depends on your mental description of
split. If you think of split as constructing an array of elements found in a
string, separated by a delimiter, then returning [] makes sense because there
are no elements found. This progression makes a lot of sense ...

"a,b".split(',') => ['a', 'b'] # two elements found
"a".split(',') => ['a'] # one element found
"".split(',') => [] # zero elements found

However, if your mental model of split is that it starts with the original
string (well, a copy thereof) and breaks it apart wherever it finds a
delimiter, then this sequence makes sense...

"a,b".split(',') => ['a', 'b'] # Split between a and b
"a".split(',') => ['a'] # No delimiter found
"".split(',') => [''] # Again, no delimiter found

So when no delimiter is found, a list containing just the original string with
no splits makes sense in this model.
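
The two mental models can be sketched as two small methods. Both names are hypothetical, and `split_breaks` is only one possible reading of the second model; it is not how Ruby's String#split behaves:

```ruby
# Model 1: collect the elements found in the string.
# This is simply Ruby's default behavior.
def split_elements(str, sep)
  str.split(sep)
end

# Model 2 (hypothetical): start from the whole string and cut it at
# each delimiter, so a string without delimiters yields itself,
# even when it is empty.
def split_breaks(str, sep)
  return [""] if str.empty?
  str.split(sep, -1)
end

split_elements("", ",")   # => []
split_breaks("", ",")     # => [""]
split_breaks("a", ",")    # => ["a"]
split_breaks("a,b", ",")  # => ["a", "b"]
```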

I will confess to finding myself in the first camp. It took some
experimentation before I saw the viewpoint of the second camp.

--
-- Jim Weirich jim@weirichhouse.org http://onest...
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)


Simon Strandgaard

12/7/2004 8:50:00 PM

Ah, I had forgotten this:
Ruby drops trailing empty strings.

irb(main):001:0> "aaabbb".split(/b/)
=> ["aaa"]
irb(main):002:0> "aaabbbc".split(/b/)
=> ["aaa", "", "", "c"]
irb(main):003:0> "aaabbbcbb".split(/b/)
=> ["aaa", "", "", "c"]
irb(main):004:0> "aaabbbcbbc".split(/b/)
=> ["aaa", "", "", "c", "", "c"]
irb(main):005:0>

Why is it smart to remove trailing empty strings?

--
Simon Strandgaard


Ryan Davis

12/7/2004 10:55:00 PM

To weigh in, I think the behavior of split should necessarily be
compared to splits in other languages, as long as our split acts in a
consistent and well behaved way. For me, that roughly means that there
should be a 1:1 correlation between join and split. Namely, anything
split (with a constant string/pattern) should be able to be joined back
to the original using that constant string. This is not currently the
case in ruby:

irb(main):008:0> %w( aabb bbaa ).map do |s| s.split(/a/).join('a') == s; end
=> [true, false]



Ryan Davis

12/7/2004 10:57:00 PM


On Dec 7, 2004, at 2:55 PM, Ryan Davis wrote:

> To weigh in, I think the behavior of split should necessarily be
> compared to splits in other

argh... should _NOT_ be compared... not not... stupid brain.



T. Onoma

12/7/2004 11:49:00 PM

On Tuesday 07 December 2004 05:55 pm, Ryan Davis wrote:
| To weigh in, I think the behavior of split should necessarily be
| compared to splits in other languages, as long as our split acts in a
| consistent and well behaved way. For me, that roughly means that there
| should be a 1:1 correlation between join and split. Namely, anything
| split (with a constant string/pattern) should be able to be joined back
| to the original using that constant string. This is not currently the
| case in ruby:
|
| irb(main):008:0> %w( aabb bbaa ).map do |s| s.split(/a/).join('a') ==
| s; end
| => [true, false]

This is good thinking.

T.


Jim Weirich

12/8/2004 2:52:00 AM

On Tuesday 07 December 2004 05:55 pm, Ryan Davis wrote:
> To weigh in, I think the behavior of split should necessarily be
> compared to splits in other languages, as long as our split acts in a
> consistent and well behaved way. [...] Namely, anything
> split (with a constant string/pattern) should be able to be joined back
> to the original using that constant string. This is not currently the
> case in ruby:
>
> irb(main):008:0> %w( aabb bbaa ).map do |s| s.split(/a/).join('a') ==
> s; end
> => [true, false]

%w( aabb bbaa ).map do |s| s.split(/a/,-1).join('a') == s; end
# => [true, true]

Don't drop trailing splits.
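
A small sketch that checks the split/join round trip over a few more inputs, including the empty string that started the thread. `round_trips?` is a hypothetical helper name, and the check assumes a plain string delimiter:

```ruby
# Hypothetical helper: true when splitting on sep (keeping trailing
# empty strings) and joining with sep gives back the original string.
def round_trips?(str, sep)
  str.split(sep, -1).join(sep) == str
end

samples = ["", "a", "aaaa", "aabb", "bbaa"]
samples.all? { |s| round_trips?(s, "a") }  # => true
```

One caveat: a single space as the delimiter is treated specially by split (it splits on runs of whitespace), so the round trip is not expected to hold there.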

--
-- Jim Weirich jim@weirichhouse.org http://onest...
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)