Stefano Crocco
8/20/2008 8:15:00 PM
On Wednesday 20 August 2008, Nick Brown wrote:
> I was surprised to discover that the code
>
> astring.sub!(/hi/, 'bye')
>
> behaves subtly differently from
>
> astring = astring.sub(/hi/, 'bye')
>
> Intuitively, to me, these should be identical. Perhaps the documentation
> should make mention of this difference? A note about this unexpected
> behavior would have saved me a lot of frustration, and would likely do
> the same for many others new to Ruby.
>
> To be honest, I'm still trying to find out exactly why these do
> different things. The difference does not manifest itself with trivial
> cases in irb; rather it shows up when I'm getting a string from cgi,
> modifying it, then inserting it into a database. When using sub!, the
> database ends up containing the pre-sub'd value of astring, even though
> astring appears to contain the modified version when printed with a
> debug statement immediately preceding my database insert.
>
> I'm willing to accept the criticism that my intuition is perverse in
> some way, but when I started writing in Ruby I was really hoping it
> would be a language one could use without having to understand how the C
> underneath it all worked (defeating part of the purpose of "high level"
> languages).
>
> So what do you think? Would warnings in the documentation on exclamation
> functions be useful or pointless?
Unless I misunderstood you, you're asking why two different methods
(String#sub and String#sub!) work differently. The answer is simple: because
they're different. It's like asking why String#upcase and String#downcase work
differently.
The documentation does mention this difference:
ri String#sub gives:
------------------------------------------------------------- String#sub
str.sub(pattern, replacement) => new_str
str.sub(pattern) {|match| block } => new_str
------------------------------------------------------------------------
Returns a copy of _str_ with the _first_ occurrence of _pattern_
replaced with either _replacement_ or the value of the block. [...]
while ri String#sub! gives:
------------------------------------------------------------ String#sub!
str.sub!(pattern, replacement) => str or nil
str.sub!(pattern) {|match| block } => str or nil
------------------------------------------------------------------------
Performs the substitutions of +String#sub+ in place, returning
_str_, or +nil+ if no substitutions were performed.
You don't need to know about the C implementation of class String, of
String#sub or of String#sub! to understand how these methods work. The
documentation says that sub returns a copy of the string with the replacement
done, which means a different object, which has nothing to do with the
original. In the case of sub!, instead, the substitution is done in place,
that is, the receiver itself (str) is modified, not a copy of it.
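To make the "copy" versus "in place" distinction concrete, here is a minimal sketch that checks object identity with Object#equal?. The variable names are only for illustration, and the .dup calls are an assumption added so the example also runs where string literals are frozen.

```ruby
# .dup gives a mutable string even where string literals are frozen.
original = "hi there".dup
other_ref = original                             # a second name for the same object

copy = original.sub(/hi/, 'bye')                 # sub: builds a new string
copy_is_same_object = copy.equal?(original)      # false: copy is a different object
original_after_sub = original.dup                # snapshot: still "hi there"

result = original.sub!(/hi/, 'bye')              # sub!: modifies the receiver itself
result_is_same_object = result.equal?(original)  # true: sub! returned the receiver

puts original_after_sub   # hi there
puts copy                 # bye there
puts original             # bye there
puts other_ref            # bye there (same object as original)
```

Note that other_ref sees the change made by sub!, because it refers to the same object. A variable holding the result of sub would not be affected by later changes to the original.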
As for the claim that the difference doesn't show up in irb, that is not the
case. Look at this:
irb(main):001:0> str = "this is a test string"
=> "this is a test string"
irb(main):002:0> str1 = str.sub "h", "H"
=> "tHis is a test string"
irb(main):003:0> str
=> "this is a test string"
The lines above show that str is not changed by sub.
irb(main):004:0> str.sub "k", "K"
=> "this is a test string"
irb(main):005:0> str.sub! "k", "K"
=> nil
This shows the different behavior of the return value when there's nothing
to replace: sub returns a copy of the string without modifications, while
sub! returns nil.
irb(main):006:0> str.sub! "a", "A"
=> "this is A test string"
irb(main):007:0> str
=> "this is A test string"
irb(main):008:0>
Here you can see that sub!, unlike sub, changes the original string.
In short, here's the difference between sub and sub!:
* sub creates a new string which has the same contents as the original one
but is independent of it, then replaces the pattern with the replacement text
in the copy. The original is not altered in any way. It always returns the
copy, and you can see whether a replacement has been made by comparing the
original and the copy.
* sub! performs the replacement on the string itself, thus changing it.
Obviously, you can't compare the 'new' and the 'original' string to see
whether a replacement has been made (since there's no 'new string' and the
original has been changed), so you have to look at the return value: if it is
nil, nothing has been changed; if it is the string itself then a replacement
has been made.
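The return-value rule for sub! can be sketched as follows. The helper replace_first! is hypothetical (it is not part of Ruby), and the .dup calls are again only there so the strings are mutable. The last part shows what happens if you write astring = astring.sub!(...) and nothing matches.

```ruby
# Hypothetical helper: reports whether sub! actually changed the string.
def replace_first!(text, pattern, replacement)
  !text.sub!(pattern, replacement).nil?
end

greeting = "hi world".dup
changed = replace_first!(greeting, /hi/, 'bye')     # match: greeting is modified

untouched = "hello world".dup
unchanged = replace_first!(untouched, /hi/, 'bye')  # no match: string left as it was

# Pitfall: assigning the result of sub! back to the variable.
lost = "hello world".dup
lost = lost.sub!(/hi/, 'bye')   # no match, so lost is now nil

puts greeting        # bye world
puts changed         # true
puts untouched       # hello world
puts unchanged       # false
puts lost.inspect    # nil
```

So with sub! you test the return value and keep using the receiver, while with sub you keep using the return value.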
I hope this helps.
Stefano