comp.lang.ruby

surprise in sub

matt

4/11/2008 4:16:00 PM

irb(main):001:0> s = "\\\\"
=> "\\\\"
irb(main):002:0> s.length
=> 2
irb(main):003:0> s = "howdy".sub("howdy", s)
=> "\\"
irb(main):004:0> s.length
=> 1

So merely using a string as the second param of sub (the replacement
value) can cause that string to be altered.

Now, the documentation does "warn" that sequences \1, \2 etc. are valid
in the replacement string. This suggests that the replacement string is
processed before use; to be sure, it says nothing about "\\" explicitly,
but I do see of course that one must deal with "\\" in order to escape
the escaping. Furthermore, there's a "workaround", namely to write the
third line as follows:

s = "howdy".sub("howdy") {|x| s}

Still, I got seriously caught by this behavior and it was tricky to
track down. m.
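
For anyone who wants to reproduce this outside irb, here is a minimal sketch comparing the string form with the block form. The variable names via_string and via_block are made up for illustration; everything else is plain String#sub.

```ruby
# Sketch: sub interprets a replacement *string*, but uses a block's
# return value exactly as given.
s = "\\\\"                                # two backslash characters

via_string = "howdy".sub("howdy", s)      # replacement string is interpreted
via_block  = "howdy".sub("howdy") { s }   # block result is inserted as-is

puts s.length           # 2
puts via_string.length  # 1
puts via_block.length   # 2
puts via_block == s     # true
```

The block form is the safer choice whenever the replacement comes from data you do not control.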

--
matt neuburg, phd = matt@tidbits.com, http://www.tidbits...
Leopard - http://www.takecontrolbooks.com/leopard-custom...
AppleScript - http://www.amazon.com/gp/product/...
Read TidBITS! It's free and smart. http://www.t...
8 Answers

Munagala Ramanath

4/11/2008 6:12:00 PM

On Apr 11, 9:15 am, m...@tidbits.com (matt neuburg) wrote:
> irb(main):001:0> s = "\\\\"
> => "\\\\"
> irb(main):002:0> s.length
> => 2
> irb(main):003:0> s = "howdy".sub("howdy", s)
> => "\\"
> irb(main):004:0> s.length
> => 1
>
> So merely using a string as the second param of sub (the replacement
> value) can cause that string to be altered.
>
> Now, the documentation does "warn" that sequences \1, \2 etc. are valid
> in the replacement string. This suggests that the replacement string is
> processed before use; to be sure, it says nothing about "\\" explicitly,
> but I do see of course that one must deal with "\\" in order to escape
> the escaping. Furthermore, there's a "workaround", namely to write the
> third line as follows:
>
> s = "howdy".sub("howdy") {|x| s}
>
> Still, I got seriously caught by this behavior and it was tricky to
> track down. m.

s is changing because you assigned to it, not because of using it
as the second parameter of sub(). Try assigning the result to a
different variable like so:

ss = "howdy".sub("howdy", s)

matt

4/11/2008 10:46:00 PM

x17y19 <amberarrow@gmail.com> wrote:

> On Apr 11, 9:15 am, m...@tidbits.com (matt neuburg) wrote:
> > irb(main):001:0> s = "\\\\"
> > => "\\\\"
> > irb(main):002:0> s.length
> > => 2
> > irb(main):003:0> s = "howdy".sub("howdy", s)
> > => "\\"
> > irb(main):004:0> s.length
> > => 1
> >
> > So merely using a string as the second param of sub (the replacement
> > value) can cause that string to be altered.
> >
> > Now, the documentation does "warn" that sequences \1, \2 etc. are valid
> > in the replacement string. This suggests that the replacement string is
> > processed before use; to be sure, it says nothing about "\\" explicitly,
> > but I do see of course that one must deal with "\\" in order to escape
> > the escaping. Furthermore, there's a "workaround", namely to write the
> > third line as follows:
> >
> > s = "howdy".sub("howdy") {|x| s}
> >
> > Still, I got seriously caught by this behavior and it was tricky to
> > track down. m.
>
> s is changing because you assigned to it, not because of using it
> as the second parameter of sub(). Try assigning the result to a
> different variable like so:
>
> ss = "howdy".sub("howdy", s)

You're missing the point... m.

Arlen Cuss

4/12/2008 1:44:00 AM

Hi,

On Sat, Apr 12, 2008 at 2:20 AM, matt neuburg <matt@tidbits.com> wrote:

> irb(main):001:0> s = "\\\\"
> => "\\\\"
> irb(main):002:0> s.length
> => 2
> irb(main):003:0> s = "howdy".sub("howdy", s)
> => "\\"
> irb(main):004:0> s.length
> => 1
>

Yeah, escaping and escaping-of-escaping with substitution, and Strings being
used as a poor man's regexp, always catch me out. Thanks for the heads up
on this one.

Arlen

Peña, Botp

4/12/2008 4:14:00 AM

From: matt neuburg [mailto:matt@tidbits.com]
# You're missing the point... m.

i think i missed the point too :)

you did mention: "So merely using a string as the second param of sub
(the replacement value) can cause that string to be altered."...

kind regards -botp



Christopher Dicely

4/12/2008 2:26:00 PM

On Fri, Apr 11, 2008 at 9:20 AM, matt neuburg <matt@tidbits.com> wrote:
> irb(main):001:0> s = "\\\\"
> => "\\\\"
> irb(main):002:0> s.length
> => 2
> irb(main):003:0> s = "howdy".sub("howdy", s)
> => "\\"
> irb(main):004:0> s.length
> => 1
>
> So merely using a string as the second param of sub (the replacement
> value) can cause that string to be altered.

Nope, using the string (s) as the second parameter of sub did nothing to alter
it. This is clear if you use a different variable as the assignment target:


irb(main):001:0> s='\\\\'
=> "\\\\"
irb(main):002:0> s.length
=> 2
irb(main):003:0> foo = "howdy".sub("howdy",s)
=> "\\"
irb(main):004:0> s
=> "\\\\"
irb(main):005:0> s.length
=> 2
irb(main):006:0> foo
=> "\\"
irb(main):007:0> foo.length
=> 1

s isn't changed by being used as the second argument to sub. Instead, the
string sent as the second argument to sub is processed for escape sequences,
so that the substring '\\' occurring in that string is treated as a single
literal '\' when used in the replacement.

But it's not changed, as the above irb session shows. s is unmodified.
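
One way to make the "s is unmodified" point hard to argue with is to freeze s first. This is only a sketch: if sub tried to modify a frozen argument, Ruby would raise an error, and it does not.

```ruby
# Sketch: sub reads its replacement argument but never mutates it.
s = '\\\\'.freeze                 # single-quoted: two backslash characters

foo = "howdy".sub("howdy", s)     # no error, so s was not modified

puts s.length        # 2
puts foo.length      # 1
puts foo.equal?(s)   # false (foo is a new String object)
```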

Todd Benson

4/12/2008 4:04:00 PM

On Sat, Apr 12, 2008 at 9:25 AM, Christopher Dicely <cmdicely@gmail.com> wrote:
> On Fri, Apr 11, 2008 at 9:20 AM, matt neuburg <matt@tidbits.com> wrote:
> > irb(main):001:0> s = "\\\\"
> > => "\\\\"
> > irb(main):002:0> s.length
> > => 2
> > irb(main):003:0> s = "howdy".sub("howdy", s)
> > => "\\"
> > irb(main):004:0> s.length
> > => 1
> >
>
> > So merely using a string as the second param of sub (the replacement
> > value) can cause that string to be altered.
>
> Nope, using the string (s) as the second parameter of sub did nothing to alter
> it. This is clear if you use a different variable as the assignment target:
>
>
>
> irb(main):001:0> s='\\\\'
> => "\\\\"
> irb(main):002:0> s.length
> => 2
> irb(main):003:0> foo = "howdy".sub("howdy",s)
>
> => "\\"
> irb(main):004:0> s
> => "\\\\"
> irb(main):005:0> s.length
> => 2
> irb(main):006:0> foo
> => "\\"
> irb(main):007:0> foo.length
> => 1
>
> s isn't changed by being used as the second argument to sub. Instead, the
> string sent as the second argument to sub is processed for escape sequences,
> so that the substring '\\' occurring in that string is treated as a single
> literal '\' when used in the replacement.
>
> But it's not changed, as the above irb session shows. s is unmodified.

When I first read the post, I immediately wanted to strike out with a
"well, you're assigning" response.

I'm not sure, but I think the OP was referring to what you said;
namely, how the escaping happens before subbing.

Todd

matt

4/13/2008 1:31:00 PM

Christopher Dicely <cmdicely@gmail.com> wrote:

> On Fri, Apr 11, 2008 at 9:20 AM, matt neuburg <matt@tidbits.com> wrote:
> > irb(main):001:0> s = "\\\\"
> > => "\\\\"
> > irb(main):002:0> s.length
> > => 2
> > irb(main):003:0> s = "howdy".sub("howdy", s)
> > => "\\"
> > irb(main):004:0> s.length
> > => 1
> >
> > So merely using a string as the second param of sub (the replacement
> > value) can cause that string to be altered.
>
> Nope, using the string (s) as the second parameter of sub did nothing to alter
> it.

I didn't say that s was altered. I said that the string you provide as
the second param of sub might not be the string that gets substituted in
- as the example demonstrates. If you don't find this counterintuitive,
you don't; great. But some people might. Those are the people I'm trying
to help here. m.
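
If the string form has to be used with an arbitrary replacement, one possible approach is to double every backslash first. The helper name escape_replacement and the sample value below are hypothetical, not part of Ruby; this is a sketch, not a library API.

```ruby
# Hypothetical helper (not part of Ruby): double each backslash so that
# sub/gsub's own processing turns the pairs back into single backslashes.
def escape_replacement(str)
  # The block form is used here so that this replacement is itself taken literally.
  str.gsub("\\") { "\\\\" }
end

tricky = 'C:\temp\1'   # made-up sample; single quotes keep both backslashes

naive = "PATH".sub("PATH", tricky)                      # \1 is read as a group reference
safe  = "PATH".sub("PATH", escape_replacement(tricky))  # inserted exactly as given

puts naive == tricky   # false
puts safe == tricky    # true
```

The block form of sub makes such a helper unnecessary, which is probably why it is usually recommended instead.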

Peña, Botp

4/15/2008 3:10:00 AM

From: matt neuburg [mailto:matt@tidbits.com]
# I didn't say that s was altered. I said that the string you
# provide as the second param of sub might not be the string
# that gets substituted in - as the example demonstrates. If
# you don't find this counterintuitive, you don't; great. But
# some people might. Those are the people I'm trying
# to help here. m.

i think the confusion stems from the fact that the replacement string
gets processed (unescaped) twice:

1. once by the ruby parser, as with any string literal, for escape
   chars like \ and "

and

2. once by sub/gsub itself, for group references like \1 (and for \\)

note that this behaviour is present in other languages too.
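
the two stages can be seen separately in a small sketch like this one (the variable names are just for illustration):

```ruby
# Stage 1: the Ruby parser processes the literal. "\\" becomes one backslash.
literal = "<\\1>"
puts literal.length   # 4
puts literal          # <\1>

# Stage 2: gsub processes the replacement string and expands \1.
expanded = "hello".gsub(/([aeiou])/, literal)
puts expanded         # h<e>ll<o>

# To get a literal backslash followed by 1 in the output, the backslash must
# survive both stages, so it is written four times in the source.
kept = "hello".gsub(/([aeiou])/, "<\\\\1>")
puts kept             # h<\1>ll<\1>
```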

it's been a long time since i last used the string (2nd param) form. i
have gotten used to the block form, since it not only avoids the double
escaping issue/confusion but also gives access to the match vars $1, $`,
$& among others.

so, for example, this

irb(main):019:0> "hello".gsub(/([aeiou])/, "<\\1>")
=> "h<e>ll<o>"

now becomes this

irb(main):020:0> "hello".gsub(/([aeiou])/) {|s| "<#{s}>"}
=> "h<e>ll<o>"

or this

irb(main):026:0> "hello".gsub(/([aeiou])/) {"<#$1>"}
=> "h<e>ll<o>"

your choice though.

kind regards -botp