comp.lang.ruby

backslash sequences \1 \2 in regexes (backreferences)

7stud --

11/2/2007 1:47:00 AM

Is this behavior documented anywhere:

1)
puts "fred:smith".gsub(/(\w+):(\w+)/, '\2, \1')

--output:--
smith, fred

2)
puts "abc".gsub(/a(b)(c)/, "a\2\1")

--output:--
a

The double quotes surrounding the replacement string cause the backslash
sequences to stop working. With single quotes the backslash sequences
work. I can't find anything in pickaxe2 about that. My understanding
was that double quotes allowed for more substitutions than single
quotes. This appears to be a case where double quotes allow fewer
substitutions than single quotes.
--
Posted via http://www.ruby-....

5 Answers

Mike Stok

11/2/2007 2:32:00 AM



On 1-Nov-07, at 9:47 PM, 7stud -- wrote:

> Is this behavior documented anywhere:
>
> 1)
> puts "fred:smith".gsub(/(\w+):(\w+)/, '\2, \1')
>
> --output:--
> smith, fred
>
> 2)
> puts "abc".gsub(/a(b)(c)/, "a\2\1")
>
> --output:--
> a
>
> The double quotes surrounding the replacement string cause the backslash
> sequences to stop working. With single quotes the backslash sequences
> work. I can't find anything in pickaxe2 about that. My understanding
> was that double quotes allowed for more substitutions than single
> quotes. This appears to be a case where double quotes allow fewer
> substitutions than single quotes.
> --
> Posted via http://www.ruby-....


Inside double quotes, \1 and \2 are converted to single characters
before gsub ever sees the string.

ratdog:~ mike$ ruby -e 'puts "abc".gsub(/a(b)(c)/, "a\2\1")' | od -c
0000000 a 002 001 \n
0000004

ratdog:~ mike$ irb
irb(main):001:0> 'a\1\2'.length
=> 5
irb(main):002:0> "a\1\2".length
=> 3
irb(main):003:0> "a\2\1"
=> "a\002\001"

Inside the double quotes, \2 and \1 are each converted to a single
character.

Table 22.2 in The Basic Types says \nnn goes to Octal nnn, and here
you see 8 (not a valid octal digit) doesn't get treated the same way
as 1 and 2:

irb(main):004:0> "a\2\1\8"
=> "a\002\0018"
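
To make the octal rule concrete, here is a minimal sketch (the variable names are only for illustration) comparing how the two quote styles treat a backslash followed by digits:

```ruby
# In a double-quoted literal, a backslash followed by one to three octal
# digits stands for the single character with that code.
octal_a    = "\101"        # octal 101 is decimal 65
double_two = "\2"          # one control character, code 2

# In a single-quoted literal, the backslash and the digit are both kept.
single_two = '\2'

puts octal_a               # A
puts double_two.length     # 1
puts double_two.ord        # 2
puts single_two.length     # 2
p single_two.bytes         # [92, 50], a backslash followed by "2"
```

Only the single-quoted form still contains the backslash that gsub looks for.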

Hope this helps,

Mike

--

Mike Stok <mike@stok.ca>
http://www.stok...

The "`Stok' disclaimers" apply.





Phrogz

11/2/2007 3:55:00 AM


On Nov 1, 7:47 pm, bbxx789_0...@yahoo.com wrote:
> Is this behavior documented anywhere:

Yes. In many Ruby books, in at least one Ruby FAQ, and many, many
times on the ruby mailing list/forum/newsgroup.

7stud --

11/2/2007 5:00:00 AM


Mike Stok wrote:
>
> Table 22.2 in The Basic Types says \nnn goes to Octal nnn,
>

Ah. So, \1 and \2 are interpreted as octal character codes. I was
using the following puts statement to debug:

puts "abc".gsub(/a(b)(c)/, "a\2\1") + "<---"

--output:--
a<---


I should have been using:

p "abc".gsub(/a(b)(c)/, "a\2\1")

--output:--
"a\002\001"

Since ASCII codes 1 and 2 represent non-printable characters, I got
no visible output for them using puts.
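
A small sketch of another way to check what is really in the result, assuming you would rather look at the raw byte values than at an escaped string (the variable name is just for illustration):

```ruby
result = "abc".gsub(/a(b)(c)/, "a\2\1")

puts result          # the two control characters are written but not visible
p result.length      # 3
p result.bytes       # [97, 2, 1]: "a", then character codes 2 and 1
```

The bytes show that nothing was lost; the replacement simply contains two characters a terminal does not display.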

My question stemmed from this passage about gsub() in pickaxe2 on p.
613:

"If a string is used as the replacement, special variables from the
match (such as $& and $1) cannot be substituted into it, as the
substitution into the string occurs before the pattern match starts.
However, the sequences \1, \2 and so on may be used to interpolate
successive groups in the match."

That makes it sound like \1 and \2 can be freely used in the replacement
string. There is no mention of the fact that single quotes are required
to keep them from being interpreted as characters written in octal. That
description is very misleading.
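
For comparison, the special variables that passage mentions are usable when the replacement is given as a block instead of a string, because the block runs after each match. A minimal sketch (the variable name is just for illustration):

```ruby
# The block is called once per match, so $1 and $2 are already set
# when the interpolation inside it is evaluated.
swapped = "abc".gsub(/a(b)(c)/) { "a#{$2}#{$1}" }

puts swapped   # acb
```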


--
Posted via http://www.ruby-....

Morton Goldberg

11/2/2007 5:33:00 AM


On Nov 2, 2007, at 12:59 AM, 7stud -- wrote:

> Mike Stok wrote:
>>
>> Table 22.2 in The Basic Types says \nnn goes to Octal nnn,
>>
>
> Ah. So, \1 and \2 are interpreted as octal character codes. I was
> using the following puts statement to debug:
>
> puts "abc".gsub(/a(b)(c)/, "a\2\1") + "<---"
>
> --output:--
> a<---
>
>
> I should have been using:
>
> p "abc".gsub(/a(b)(c)/, "a\2\1")
>
> --output:--
> "a\002\001"
> Since the ascii codes 1 and 2 represent non-printable characters, I
> got
> no output for them using puts.
>
> My question stemmed from this passage about gsub() in pickaxe2 on p.
> 613:
>
> "If a string is used as the replacement, special variables from the
> match (such as $& and $1) cannot be substituted into it, as the
> substitution into the string occurs before the pattern match starts.
> However, the sequences \1, \2 and so on may be used to interpolate
> successive groups in the match."
>
> That makes it sound like \1 and \2 can be freely used in the
> replacement
> string. There is no mention of the fact that single quotes are
> required
> to keep them from being interpreted as chars written in octal. That
> description is very misleading

No, it's not. That single quotes are required has nothing to do with
gsub. It's something you should know from your understanding of how
the Ruby interpreter handles double-quoted strings. As Mike Stok said,
the string literal is converted to "a\002\001" long before gsub is
called.
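
One way to see that gsub only cares about the characters it receives, not about how the literal was written, is to build the replacement without any backslash escape at all. This is only a sketch; the names `backslash` and `replacement` are made up for the example:

```ruby
backslash   = 92.chr    # the backslash character, built from its ASCII code
replacement = "a" + backslash + "2" + backslash + "1"

puts replacement.length                   # 5
puts replacement == 'a\2\1'               # true
puts "abc".gsub(/a(b)(c)/, replacement)   # acb
```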

Regards, Morton



Wolfgang Nádasi-donner

11/2/2007 5:52:00 AM


Morton Goldberg wrote:
> On Nov 2, 2007, at 12:59 AM, 7stud -- wrote:
>
>> That makes it sound like \1 and \2 can be freely used in the
>> replacement
>> string. There is no mention of the fact that single quotes are
>> required
>> to keep them from being interpreted as chars written in octal. That
>> description is very misleading
>
> No, it's not, That single quotes are required has nothing to do with
> gsub. It's something you should know from your understanding of how
> the Ruby interpreter handles double quoted strings. As Mike Stok said
> the string literal is converted to "a\002\001" long before gsub is
> called.
>
> Regards, Morton

You should simply double the backslashes, which works with either kind
of quote:

irb(main):001:0> puts "fred:smith".gsub(/(\w+):(\w+)/, '\\2, \\1')
smith, fred
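
A short sketch of why the doubled backslash is the safe habit: both literal styles turn `\\` into one backslash, so gsub receives the same replacement either way (the variable names are only for illustration).

```ruby
pattern = /(\w+):(\w+)/

single = "fred:smith".gsub(pattern, '\\2, \\1')   # single-quoted literal
double = "fred:smith".gsub(pattern, "\\2, \\1")   # double-quoted literal

puts single              # smith, fred
puts double              # smith, fred
puts '\\2' == "\\2"      # true: both are a backslash followed by "2"
```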

Wolfgang Nádasi-Donner

--
Posted via http://www.ruby-....