comp.lang.ruby

regex backreference weirdness...

Kyle Schmitt

7/19/2007 10:24:00 PM

It all started with trying to convert some strings with underscores in
them, to camel case...and me thinking of regexes as sed regexes

It looks like gsub saves back references, but only after the whole
method exits, so it can't use what it found. Is this right?

Below is what I tried, which leads me up to the question of... what
would the _right_ way be of doing this?

irb>"camel_case".gsub(/_(.)/,$1.upcase)
NoMethodError: undefined method `upcase' for nil:NilClass
from (irb):1
#Which made me say Huh? It looks like a valid back reference to me...
#So I tried
irb>"camel_case"=~/_(.)/
irb>$1
=>"c"
#OK, really weird...
irb>"camel_case".gsub(/_(.)/,$1.upcase)
=>"camelCase"
#Right, I'll buy that since $1 is still hanging around
irb>"camel_face".gsub(/_(.)/,$1.upcase)
=>"camelCace"
irb>"camel_face".gsub(/_(.)/,$1.upcase)
=>"camelFace"
#Ha ha! gsub DOES save a backreference... so why isn't this working?! :(

6 Answers

Morton Goldberg

7/20/2007 12:20:00 AM

On Jul 19, 2007, at 6:23 PM, Kyle Schmitt wrote:

> Below is what I tried, which leads me up to the question of... what
> would the _right_ way be of doing this?
>
> irb>"camel_case".gsub(/_(.)/,$1.upcase)
> NoMethodError: undefined method `upcase' for nil:NilClass
> from (irb):1

I think, in this case, you will have to use a block. For example:

"camel_case".gsub(/_(.)/) { $1.upcase } # => "camelCase"

or

"camel_case".gsub(/_./) { |m| m[1, 1].upcase } # => "camelCase"

Regards, Morton

Mikel Lindsaar

7/20/2007 4:48:00 AM

Instead of reinventing the wheel, you could always use the camelcase
converter that Rails has and pull out what you need:

http://api.rubyonrails.com/classes/ActiveSupport/CoreExtensions/String/Inflec...

Regards

Mikel
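
For anyone who wants the idea without pulling in Rails, here is a minimal plain-Ruby sketch. The method name `camelize` and its `first_letter_upper` parameter are assumptions made for illustration; this is not ActiveSupport's implementation and it ignores the extra cases the Rails inflector handles.

```ruby
# Hypothetical helper, loosely modeled on the idea of a camel-case
# converter. Not the Rails implementation.
def camelize(str, first_letter_upper = true)
  # With first_letter_upper, also match the very first character (\A)
  # so that it gets upcased too.
  pattern = first_letter_upper ? /(?:\A|_)(.)/ : /_(.)/
  # The block runs once per match, so the captured character is available.
  str.gsub(pattern) { Regexp.last_match(1).upcase }
end

upper_camel = camelize("camel_case")         # => "CamelCase"
lower_camel = camelize("camel_case", false)  # => "camelCase"
```

`Regexp.last_match(1)` is the same value as `$1`, just spelled without the global-looking variable.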

Robert Klemme

7/20/2007 6:18:00 AM

2007/7/20, Kyle Schmitt <kyleaschmitt@gmail.com>:
> It all started with trying to convert some strings with underscores in
> them, to camel case...and me thinking of regexes as sed regexes
>
> It looks like gsub saves back references, but only after the whole
> method exits, so it can't use what it found. Is this right?
>
> Below is what I tried, which leads me up to the question of... what
> would the _right_ way be of doing this?
>
> irb>"camel_case".gsub(/_(.)/,$1.upcase)

You need to be aware that $1.upcase is evaluated *before* the method
call, so it can *never* do calculations based on match state. You
want the block form instead, where the block is invoked once per
match. For example, you can do

irb(main):005:0> "camel_case".gsub(/(?:\A|_)(.)/) {|m| $1.capitalize }
=> "CamelCase"
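
The evaluation-order point can be seen in isolation with a small sketch. The method names `eager_camelize` and `block_camelize` are hypothetical, introduced only for this illustration. Wrapping the calls in methods gives each one a fresh scope, so no `$1` is left over from an earlier match.

```ruby
# Hypothetical helper names, not from the thread.
def eager_camelize(str)
  # $1 is read here, while the argument list is being built, before gsub
  # has matched anything. In a fresh method scope it is nil.
  str.gsub(/_(.)/, $1.upcase)
rescue NoMethodError
  :no_match_data_yet
end

def block_camelize(str)
  # The block is called once per match, after $1 has been set for it.
  str.gsub(/_(.)/) { $1.upcase }
end

eager = eager_camelize("camel_case")  # => :no_match_data_yet
lazy  = block_camelize("camel_case")  # => "camelCase"
```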

> NoMethodError: undefined method `upcase' for nil:NilClass
> from (irb):1
> #Which made me say Huh? It looks like a valid back reference to me...

No. With the non-block form you need to use \1, \2, etc., as has been
mentioned already.

> #So I tried
> "camel_case"=~/_(.)/
> irb>$1
> =>"c"
> #OK, really weird...
> irb>"camel_case".gsub(/_(.)/,$1.upcase)
> =>"camelCase"
> #Right, I'll buy that since $1 is still hanging around
> irb>"camel_face".gsub(/_(.)/,$1.upcase)
> =>"camelCace"
> irb>"camel_face".gsub(/_(.)/,$1.upcase)
> =>"camelFace"
> #Ha ha! gsub DOES save a backreference... so why isn't this working?! :(

You're still working with the value of $1 left over from the previous
invocation. Proper backreferencing in the non-block form looks like this:

irb(main):010:0> "camel_case".gsub /[cde]/, '<\\&>'
=> "<c>am<e>l_<c>as<e>"
irb(main):011:0> "camel_case".gsub /c(.)/, '<\\1>'
=> "<a>mel_<a>se"
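
A short sketch of how quoting affects these replacement strings may help. The variable names are made up for the example, and the named group assumes Ruby 1.9 or later.

```ruby
word = "camel_case"

# Single quotes: \1 reaches gsub unchanged.
single = word.gsub(/c(.)/, '<\1>')                  # => "<a>mel_<a>se"

# Double quotes: the backslash has to be escaped, otherwise "\1" is
# read as an octal escape (the character with code 1).
doubled = word.gsub(/c(.)/, "<\\1>")                # => "<a>mel_<a>se"

# Named groups are referenced with \k<name> (Ruby 1.9+).
named = word.gsub(/c(?<after>.)/, '<\k<after>>')    # => "<a>mel_<a>se"

# \0 stands for the whole match, like \&.
whole = word.gsub(/[cde]/, '<\0>')                  # => "<c>am<e>l_<c>as<e>"
```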

Regards

robert

Peña, Botp

7/20/2007 7:52:00 AM

From: Kyle Schmitt [mailto:kyleaschmitt@gmail.com]
# #Ha ha! gsub DOES save a backreference... so why isn't this
# working?! :(

I think the behaviour is documented.

root@pc4all:~# qri string#gsub
------------------------------------------------------------ String#gsub
str.gsub(pattern, replacement) => new_str
str.gsub(pattern) {|match| block } => new_str
------------------------------------------------------------------------
Returns a copy of str with all occurrences of pattern replaced
with either replacement or the value of the block. The pattern
will typically be a Regexp; if it is a String then no regular
expression metacharacters will be interpreted (that is /\d/ will
match a digit, but '\d' will match a backslash followed by a 'd').

If a string is used as the replacement, special variables from the
match (such as $& and $1) cannot be substituted into it, as
substitution into the string occurs before the pattern match
starts. However, the sequences \1, \2, and so on may be used to
interpolate successive groups in the match.

In the block form, the current match string is passed in as a
parameter, and variables such as $1, $2, $`, $&, and $' will be
set appropriately. The value returned by the block will be
substituted for the match on each call.

The result inherits any tainting in the original string or any
supplied replacement string.

"hello".gsub(/[aeiou]/, '*') #=> "h*ll*"
"hello".gsub(/([aeiou])/, '<\1>') #=> "h<e>ll<o>"
"hello".gsub(/./) {|s| s[0].to_s + ' '} #=> "104 101 108 108 111 "

root@pc4all:~#
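
As a small illustration of the block-form paragraph in that documentation, the sketch below prints the match variables for one match. The variable names are assumptions for the example. Note that the last documented example appears to reflect Ruby 1.8, where `s[0]` returned a character code; on Ruby 1.9 and later `s[0]` is a one-character string, so `ord` is used here to get the same numbers.

```ruby
# Inside the block, $` is the text before the match, $& the match
# itself, and $' the text after it.
context = "a_b".gsub(/_/) { "[#{$`}|#{$&}|#{$'}]" }    # => "a[a|_|b]b"

# The block also receives the matched text as its parameter.
codes = "hello".gsub(/./) { |s| s.ord.to_s + ' ' }     # => "104 101 108 108 111 "
```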

kind regards -botp

Kyle Schmitt

7/20/2007 1:46:00 PM

I completely and utterly forgot about the block form of gsub.
Perfect. Thanks everyone!

But it does make me wonder: in the non-block form, I can see how to
use \1 inside a replacement string, but how would you go about calling
other methods on it, such as upcase in this case? Or is there no way?


As far as reinventing the wheel goes, it's important to know the hows
and whys, even if you don't end up implementing it yourself :)

--Kyle

Robert Klemme

7/20/2007 2:49:00 PM

On 20.07.2007 15:45, Kyle Schmitt wrote:
> But it does make me wonder, for the non block form, when you use the
> \1 variable, I can see how to use it inside of other strings, but how
> would you go about running other methods on it? In this case upcase.
> Or is there no way?

Precisely.

robert
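
To make that concrete, here is a minimal sketch of why calling a method on the replacement string does not help. The variable names are made up for the example. `'\1'.upcase` runs before gsub is called and acts on the two literal characters backslash and 1, which have no upper-case form.

```ruby
# The method call acts on the literal text \1, not on the captured
# character, so the replacement string is still '\1'.
dead_end = "camel_case".gsub(/_(.)/, '\1'.upcase)   # => "camelcase"

# The block form is the way to run code on each captured group.
works = "camel_case".gsub(/_(.)/) { Regexp.last_match(1).upcase }  # => "camelCase"
```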