Asp Forum - Question regarding design of the String Class

Michael W. Ryder

4/23/2007 2:00:00 AM

Was there a reason the string class was implemented with str[i]
returning the code of position i in str? The reason I ask this is that
in other languages str[i] returns the string starting at position i.
For example C uses t = strcpy(str[i]) and Business Basic uses S$=T$(I)
to copy a string from position i.
I can see no way to do this in Ruby other than using something like: t =
str[i,9999]. It seemed strange that copying ranges of strings uses the
same format as C (t =strncpy(str[i],n)) but not when copying the remainder.

18 Answers

Roland Crosby

4/23/2007 2:05:00 AM

On Apr 22, 2007, at 10:00 PM, Michael W. Ryder wrote:

> Was there a reason the string class was implemented with str[i]
> returning the code of position i in str? The reason I ask this is
> that in other languages str[i] returns the string starting at
> position i. For example C uses t = strcpy(str[i]) and Business
> Basic uses S$=T$(I) to copy a string from position i.
> I can see no way to do this in Ruby other than using something
> like: t = str[i,9999]. It seemed strange that copying ranges of
> strings uses the same format as C (t =strncpy(str[i],n)) but not
> when copying the remainder.

Try str[i,-1], or one of the myriad other ways to access ranges of a
string as defined in String#[]

Michael W. Ryder

4/23/2007 2:30:00 AM

Roland Crosby wrote:
> On Apr 22, 2007, at 10:00 PM, Michael W. Ryder wrote:
>
>> Was there a reason the string class was implemented with str[i]
>> returning the code of position i in str? The reason I ask this is
>> that in other languages str[i] returns the string starting at position
>> i. For example C uses t = strcpy(str[i]) and Business Basic uses
>> S$=T$(I) to copy a string from position i.
>> I can see no way to do this in Ruby other than using something like: t
>> = str[i,9999]. It seemed strange that copying ranges of strings uses
>> the same format as C (t =strncpy(str[i],n)) but not when copying the
>> remainder.
>
> Try str[i,-1], or one of the myriad other ways to access ranges of a
> string as defined in String#[]
>
If I enter:
a = "This is a test."
b = a[1, -1]
puts b
irb returns nil. Obviously this is not what I want. If instead of -1 I
use 9999 it returns "his is a test." which is what I was looking for. This
seems like a kludge and an inconsistency. As I pointed out, other
languages just use b = a[1] to get the remainder of the string instead
of 104. The string class already has methods like each_byte for
converting characters in a string to numbers, so why does it need
another shortcut for something that is probably very rarely used?

Roland Crosby

4/23/2007 2:46:00 AM

On Apr 22, 2007, at 10:35 PM, Michael W. Ryder wrote:
> Roland Crosby wrote:
>> On Apr 22, 2007, at 10:00 PM, Michael W. Ryder wrote:
>>> Was there a reason the string class was implemented with str[i]
>>> returning the code of position i in str? The reason I ask this
>>> is that in other languages str[i] returns the string starting at
>>> position i. For example C uses t = strcpy(str[i]) and Business
>>> Basic uses S$=T$(I) to copy a string from position i.
>>> I can see no way to do this in Ruby other than using something
>>> like: t = str[i,9999]. It seemed strange that copying ranges of
>>> strings uses the same format as C (t =strncpy(str[i],n)) but not
>>> when copying the remainder.
>> Try str[i,-1], or one of the myriad other ways to access ranges of
>> a string as defined in String#[]
> If I enter:
> a = "This is a test."
> b = a[1, -1]
> puts b
> irb returns nil. Obviously this is not what I want. If instead of
> -1 I use 9999 it returns "his is a test." which is what I was looking
> for. This seems like a kludge and an inconsistency. Like I
> pointed out other languages just use b = a[1] to get the remainder
> of the string instead of 104. The string class already has methods
> like each_byte for converting characters in a string to a number,
> so why does it need another shortcut for something that is probably
> very rarely used.

Sorry, I meant a[1..-1] rather than a[1,-1]. I don't know why Ruby
returns the character codes like that, but for what it's worth, I
believe Ruby 1.9 is going to switch to returning a single-character
string when you put one integer in String#[].
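
For readers following along, here is a minimal sketch of the forms discussed so far. It assumes Ruby 1.9 or later, where a single integer index yields a one-character string; on Ruby 1.8 the same expression gives the character code (104 for "h"). The variable names are only illustrative.

```ruby
a = "This is a test."

# Start index plus a negative length: String#[] returns nil.
with_negative_length = a[1, -1]

# Start index plus an oversized length: the result stops at the end of the string.
with_large_length = a[1, 9999]

# Range ending at -1 (the last character): everything from index 1 onward.
with_range = a[1..-1]

# Single integer index: a one-character string on Ruby 1.9+.
single = a[1]

p with_negative_length  # nil
p with_large_length     # "his is a test."
p with_range            # "his is a test."
p single                # "h"
```

The range form states the intent ("from index 1 to the last character") without relying on an arbitrarily large length.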

Daniel Martin

4/23/2007 3:55:00 AM

"Michael W. Ryder" <_mwryder@worldnet.att.net> writes:

> Was there a reason the string class was implemented with str[i]
> returning the code of position i in str? The reason I ask this is
> that in other languages str[i] returns the string starting at position
> i. For example C uses t = strcpy(str[i]) and Business Basic uses
> S$=T$(I) to copy a string from position i.

I can't comment on what "Business Basic" uses, but your C code is
completely wrong. In C, str[i] returns a char which, since C has
"char" as one of its integral types, is equivalent to returning the
character code.

The usual usage of strcpy to copy only from the second (index 1)
character onward is:

strcpy(dest, src + 1);

(And incidentally, using strcpy instead of strncpy is a practice that
often leads to security vulnerabilities)

In other words, ruby's behavior with str[i] matches the behavior of C
- it returns the character at that position, where "character" is
viewed simply as a number.
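
As a rough illustration of "character viewed as a number", the sketch below shows explicit ways to ask Ruby for that number. It assumes Ruby 1.9 or later, where String#ord and String#getbyte are available; on 1.8, s[1] already gives 104 directly. The variable names are hypothetical.

```ruby
s = "This is a test."

# Explicit ways to ask for the numeric code of the character at index 1.
code_via_ord  = s[1].ord             # code point of the one-character string
code_via_byte = s.getbyte(1)         # raw byte at offset 1
code_via_each = s.each_byte.to_a[1]  # second element of the byte sequence

# Going the other way: from a code back to a one-character string.
char = 104.chr

p code_via_ord   # 104
p code_via_byte  # 104
p code_via_each  # 104
p char           # "h"
```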

--
s=%q( Daniel Martin -- martin@snowplow.org
puts "s=%q(#{s})",s.map{|i|i}[1] )
puts "s=%q(#{s})",s.map{|i|i}[1]

Rick DeNatale

4/23/2007 1:23:00 PM

On 4/22/07, Daniel Martin <martin@snowplow.org> wrote:
> "Michael W. Ryder" <_mwryder@worldnet.att.net> writes:
>
> > Was there a reason the string class was implemented with str[i]
> > returning the code of position i in str? The reason I ask this is
> > that in other languages str[i] returns the string starting at position
> > i. For example C uses t = strcpy(str[i]) and Business Basic uses
> > S$=T$(I) to copy a string from position i.
>
> I can't comment on what "Business Basic" uses, but your C code is
> completely wrong. In C, str[i] returns a char which, since C has
> "char" as one of its integral types, is equivalent to returning the
> character code.

Daniel, your points are well taken, but if the rusty old neurons in
my brain which contain knowledge of C aren't mistaken, str[i] isn't a
function, and therefore doesn't 'return' anything.

C doesn't really have a string type. A string literal is really an
array of chars, although in almost all cases (i.e. other than when
it's used in a string initializer, or as the argument to sizeof), it's
interpreted as a pointer to the first character, due to the
relationship between arrays and pointers in C.

So if str is declared either as:

char str[];
or
char *str;

the expression str[i] is equivalent to *((str) + (i)), it's really a
pointer to a char, which because of the relationship between arrays
and pointers in c, can be interpreted as an array of chars.

And in Ruby the whole notion of pointers is meaningless.

The point here, of course, is that when learning Ruby, or any other
language, one needs to be aware that things one knows from other
languages often don't carry over without conceptual modification, if
at all.

If all languages did everything exactly the same way, there'd be no
need for so many of them.

To sum it up, let Ruby be Ruby, don't expect it to be Java, C++,
Visual Basic, or anything else.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Michael W. Ryder

4/23/2007 5:35:00 PM

Rick DeNatale wrote:
> On 4/22/07, Daniel Martin <martin@snowplow.org> wrote:
>> "Michael W. Ryder" <_mwryder@worldnet.att.net> writes:
>>
>> > Was there a reason the string class was implemented with str[i]
>> > returning the code of position i in str? The reason I ask this is
>> > that in other languages str[i] returns the string starting at position
>> > i. For example C uses t = strcpy(str[i]) and Business Basic uses
>> > S$=T$(I) to copy a string from position i.
>>
>> I can't comment on what "Business Basic" uses, but your C code is
>> completely wrong. In C, str[i] returns a char which, since C has
>> "char" as one of its integral types, is equivalent to returning the
>> character code.
>
> Daniel, your points are well taken, but if the rusty old neurons in
> my brain which contain knowledge of C aren't mistaken, str[i] isn't a
> function, and therefore doesn't 'return' anything.
>
> C doesn't really have a string type. A string literal is really an
> array of chars, although in almost all cases (i.e. other than when
> it's used in a string initializer, or as the argument to sizeof), it's
> interpreted as a pointer to the first character, due to the
> relationship between arrays and pointers in C.
>
> So if str is declared either as:
>
> char str[];
> or
> char *str;
>
> the expression str[i] is equivalent to *((str) + (i)), it's really a
> pointer to a char, which because of the relationship between arrays
> and pointers in c, can be interpreted as an array of chars.
>
> And in Ruby the whole notion of pointers is meaningless.
>
> The point here, of course, is that when learning Ruby, or any other
> language, one needs to be aware that things one knows from other
> languages often don't carry over without conceptual modification, if
> at all.
>
> If all languages did everything exactly the same way, there'd be no
> need for so many of them.
>
> To sum it up, let Ruby be Ruby, don't expect it to be Java, C++,
> Visual Basic, or anything else.
>

I guess my point was that str[i] behaves totally different from all the
other implementations of []. All of the others return a string. This
seems to be an inconsistency. If there is a valid reason for it I have
no problem, it just makes it harder to transfer over 25 years of
experience to a new language.
In Business Basic or C if I want the numeric value of a character in a
string I specify that. Likewise if I want to copy a string from an
arbitrary position I don't have to specify an ending character like
Ruby. I just find s = t[i, -1] to be much harder to understand in a
quick read than s = t[i]. Others may not have this problem.

Michael W. Ryder

4/23/2007 5:37:00 PM

Roland Crosby wrote:
> On Apr 22, 2007, at 10:35 PM, Michael W. Ryder wrote:
>> Roland Crosby wrote:
>>> On Apr 22, 2007, at 10:00 PM, Michael W. Ryder wrote:
>>>> Was there a reason the string class was implemented with str[i]
>>>> returning the code of position i in str? The reason I ask this is
>>>> that in other languages str[i] returns the string starting at
>>>> position i. For example C uses t = strcpy(str[i]) and Business Basic
>>>> uses S$=T$(I) to copy a string from position i.
>>>> I can see no way to do this in Ruby other than using something like:
>>>> t = str[i,9999]. It seemed strange that copying ranges of strings
>>>> uses the same format as C (t =strncpy(str[i],n)) but not when
>>>> copying the remainder.
>>> Try str[i,-1], or one of the myriad other ways to access ranges of a
>>> string as defined in String#[]
>> If I enter:
>> a = "This is a test."
>> b = a[1, -1]
>> puts b
>> irb returns nil. Obviously this is not what I want. If instead of -1
>> I use 9999 it returns "his is a test." which is what I was looking for.
>> This seems like a kludge and an inconsistency. Like I pointed out
>> other languages just use b = a[1] to get the remainder of the string
>> instead of 104. The string class already has methods like each_byte
>> for converting characters in a string to a number, so why does it need
>> another shortcut for something that is probably very rarely used.
>
> Sorry, I meant a[1..-1] rather than a[1,-1].

I figured that out right after I posted my reply. I had forgotten about
ranges as I have never programmed in a language that used them before.
The little differences can really get you, especially when you find so
many similarities.

> I don't know why Ruby
> returns the character codes like that, but for what it's worth, I
> believe Ruby 1.9 is going to switch to returning a single-character
> string when you put one integer in String#[].
>

Brian Candler

4/23/2007 7:37:00 PM

On Tue, Apr 24, 2007 at 02:40:09AM +0900, Michael W. Ryder wrote:
> I guess my point was that str[i] behaves totally different from all the
> other implementations of []. All of the others return a string.

You clearly know a lot of languages then :-)

As pointed out before, in C, str[i] is an expression whose value is an
integer for the character at position i, exactly as in Ruby.

In Perl, it doesn't do what you expect either:

$ perl -e '$a = "abcde"; print $a[2], "\n";'

$

(what this actually does is extract an element from the array @a, which I
have not initialised, and is completely unrelated to the scalar $a)

> In Business Basic or C if I want the numeric value of a character in a
> string I specify that. Likewise if I want to copy a string from an
> arbitrary position I don't have to specify an ending character like
> Ruby. I just find s = t[i, -1] to be much harder to understand in a
> quick read than s = t[i]. Others may not have this problem.

Personally I would be *very* surprised if str[i] returned all the characters
from 'i' to the end of the string. But then I don't program in Business
Basic.

I do program in C though. If I wanted the string from position i to the end
of the string, I would write str + i, or possibly &str[i]

In Perl you have to be explicit and call substr()

Brian.

Michael W. Ryder

4/23/2007 8:34:00 PM

Brian Candler wrote:
> On Tue, Apr 24, 2007 at 02:40:09AM +0900, Michael W. Ryder wrote:
>> I guess my point was that str[i] behaves totally different from all the
>> other implementations of []. All of the others return a string.
>
> You clearly know a lot of languages then :-)
>

I probably should have phrased that differently. What I meant was that all
of the other implementations of [] in Ruby for the String class return a
string, only str[i] returns a number.
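
To make the comparison concrete, here is a small sketch of the argument forms String#[] accepts. It assumes Ruby 1.9 or later, so the single-index form also yields a string; on Ruby 1.8 that one form gives the number 111 instead, which is the inconsistency described above. The sample string and variable names are made up for illustration.

```ruby
s = "hello world"

by_start_and_length = s[0, 5]    # start index and length
by_range            = s[6..-1]   # range of indices
by_regexp           = s[/w\w+/]  # first match of a regular expression
by_substring        = s["lo w"]  # the substring itself if present, else nil
by_single_index     = s[4]       # single integer index

p by_start_and_length  # "hello"
p by_range             # "world"
p by_regexp            # "world"
p by_substring         # "lo w"
p by_single_index      # "o"
```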

> As pointed out before, in C, str[i] is an expression whose value is an
> integer for the character at position i, exactly as in Ruby.
>
> In Perl, it doesn't do what you expect either:
>
> $ perl -e '$a = "abcde"; print $a[2], "\n";'
>
> $
>
> (what this actually does is extract an element from the array @a, which I
> have not initialised, and is completely unrelated to the scalar $a)
>
>> In Business Basic or C if I want the numeric value of a character in a
>> string I specify that. Likewise if I want to copy a string from an
>> arbitrary position I don't have to specify an ending character like
>> Ruby. I just find s = t[i, -1] to be much harder to understand in a
>> quick read than s = t[i]. Others may not have this problem.
>
> Personally I would be *very* surprised if str[i] returned all the characters
> from 'i' to the end of the string. But then I don't program in Business
> Basic.
>
Business Basic has been doing this for over the 25 years I have been
programming in it. For example if I enter: A$="abcdefg" and then say:
Print A$(3) it prints cdefg. Like Ruby, if I enter B$=A$(3,3) B$
contains cde. Other than the beginning number of the string they act
the same.
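
A sketch of how those two forms might map onto Ruby, based only on the behaviour described in this post. The helper bb_substr is hypothetical, not part of Ruby or of Business Basic; it simply shifts the 1-based start position to Ruby's 0-based index.

```ruby
# Hypothetical helper (not part of Ruby): mimics the 1-based A$(start) and
# A$(start, length) forms described above on top of Ruby's 0-based String#[].
def bb_substr(str, start, length = nil)
  from = start - 1  # convert the 1-based position to a 0-based index
  length ? str[from, length] : str[from..-1]
end

a = "abcdefg"
rest  = bb_substr(a, 3)     # like A$(3)
three = bb_substr(a, 3, 3)  # like A$(3,3)

puts rest   # cdefg
puts three  # cde
```

Without a helper, the same results come from a[2..-1] and a[2, 3].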

> I do program in C though. If I wanted the string from position i to the end
> of the string, I would write str + i, or possibly &str[i]
>

But you do not have to provide a length or ending position for the copy
which was part of my confusion. I specify a starting position and the
language copies the rest of the string. In Ruby just providing a
starting position gives me a numeric value.

> In Perl you have to be explicit and call substr()
>
> Brian.
>

Robert Dober

4/23/2007 9:03:00 PM

On 4/23/07, Michael W. Ryder <_mwryder@worldnet.att.net> wrote:
> Brian Candler wrote:
> > On Tue, Apr 24, 2007 at 02:40:09AM +0900, Michael W. Ryder wrote:
> >> I guess my point was that str[i] behaves totally different from all the
> >> other implementations of []. All of the others return a string.
> >
> > You clearly know a lot of languages then :-)
> >
>
> I probably should have phrased that differently. What I meant was that all
> of the other implementations of [] in Ruby for the String class return a
> string, only str[i] returns a number.
I copy that; you have made a somewhat valid point, one that has been
discussed before, and I do not like it either that "ab"[0] == ?a (instead
of "a"). But it is not a clear-cut error either.

The overloading (in human terms not computer science terms) of [] to
get elements and substrings of a string might not be the best choice
either. And that there is String#each_byte and not
String#each_character might hurt too.
But there are other tools around that make up for it.
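
One possible reading of "other tools", sketched under assumptions: String#scan with a regexp is one way to walk a string character by character, and later Ruby releases also provide String#each_char. The example assumes a Ruby version where each_byte and each_char return enumerators when called without a block.

```ruby
s = "ab"

bytes    = s.each_byte.to_a  # numeric byte values
chars    = s.each_char.to_a  # one-character strings (later Ruby releases)
via_scan = s.scan(/./)       # regexp-based alternative; /./ skips newlines

p bytes     # [97, 98]
p chars     # ["a", "b"]
p via_scan  # ["a", "b"]
```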
>
> > As pointed out before, in C, str[i] is an expression whose value is an
> > integer for the character at position i, exactly as in Ruby.
> >
> > In Perl, it doesn't do what you expect either:
> >
> > $ perl -e '$a = "abcde"; print $a[2], "\n";'
> >
> > $
> >
> > (what this actually does is extract an element from the array @a, which I
> > have not initialised, and is completely unrelated to the scalar $a)
> >
> >> In Business Basic or C if I want the numeric value of a character in a
> >> string I specify that. Likewise if I want to copy a string from an
> >> arbitrary position I don't have to specify an ending character like
> >> Ruby. I just find s = t[i, -1] to be much harder to understand in a
> >> quick read than s = t[i]. Others may not have this problem.
> >
I guess that the influence of *Basic and C* on Ruby is minimal. To
convince a Rubyist that a feature might be nice because it is present
in language X, I'd rather choose X from Python, Lisp, Smalltalk, Self,
Io or Lua (and I am leaving out some through laziness and ignorance).
> > Personally I would be *very* surprised if str[i] returned all the characters
> > from 'i' to the end of the string. But then I don't program in Business
> > Basic.
> >
> Business Basic has been doing this for over the 25 years I have been
> programming in it. For example if I enter: A$="abcdefg" and then say:
> Print A$(3) it prints cdefg. Like Ruby, if I enter B$=A$(3,3) B$
> contains cde. Other than the beginning number of the string they act
> the same.
>
> > I do program in C though. If I wanted the string from position i to the end
> > of the string, I would write str + i, or possibly &str[i]
> >
>
> But you do not have to provide a length or ending position for the copy
> which was part of my confusion. I specify a starting position and the
> language copies the rest of the string. In Ruby just providing a
> starting position gives me a numeric value.
Well, if you want to get the maximum from Ruby I'd advise you, sorry if
this sounds blunt, to take a break from comparing it so much with
other languages.
Paradigm shifts are tough. After that break you might still think that
"ab"[0] == ?a is not a good thing, but I am sure that you will be able
to get your point across much better.

Sorry if I started lecturing; I just thought it might help, after all ;).
I remember very well when I was lectured about duck typing. At first I
was angry, and I said lots of stupid things (they were very clever in
my Ada world, of course), but when I let go and looked at things as
they were, I really shifted into the paradigm of Ruby. Yes, I still
get bitten by duck typing, and no, I do not introduce type checking; I
just write better tests.

Welcome to Ruby.

Cheers
Robert

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

comp.lang.ruby
