comp.lang.ruby

escaping single quotes in a string with gsub

prubel

11/3/2004 6:12:00 PM

Hi,

I'm trying to take a string and escape a single quote if it is not
already escaped. My first thought was to look at the string and if I
see a quote without a backslash before it put the backslash there.

This has a problem when there is an escaped slash before the quote:
\\'. I believe the fix should be to look two characters back. If
anyone has a canned solution I'm all ears. Would look-behind be an
option here out of the box?

While I was experimenting I saw some behavior I don't understand and
am hoping someone can explain it to me:

prubel@cornet /tmp> cat /tmp/t.rb ; ruby /tmp/t.rb
2.times do
# replace not a slash followed by a quote with not a slash
# and an escaped quote.
puts("\\'Summer's Day".gsub(/([^\\\\])'/,"*#{$1}\\\\'*"))
puts $1
end
#end
\'Summe*\'*s Day
r
\'Summe*r\'*s Day
r
prubel /tmp> ruby --version
ruby 1.8.1 (2004-02-06) [i686-linux-gnu]


I'm confused as to why the output is different for the two
iterations. Why doesn't the r get placed in the first output?


thank you for your help,
Paul


5 Answers

James Gray

11/3/2004 6:46:00 PM


On Nov 3, 2004, at 12:12 PM, Paul Rubel wrote:

> Hi,
>
> I'm trying to take a string and escape a single quote if it is not
> already escaped. My first thought was to look at the string and if I
> see a quote without a backslash before it put the backslash there.

What about:

gsub(/(\\*)'/) { |m| $1.length % 2 == 0 ? $1 + "\\'" : m }
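
A minimal sketch of how the one-liner above behaves on a few inputs. The helper name escape_single_quotes is made up for illustration; only the gsub call comes from the post.

```ruby
# Hypothetical wrapper; the pattern and block logic are the ones suggested above.
def escape_single_quotes(str)
  # Capture the (possibly empty) run of backslashes directly before each quote.
  str.gsub(/(\\*)'/) do |match|
    # Even count: the backslashes pair up, so the quote is still unescaped.
    # Odd count: the last backslash already escapes the quote, so keep the match.
    $1.length.even? ? $1 + "\\'" : match
  end
end

plain   = "Summer's Day"      # contains: Summer's Day
escaped = "Summer\\'s Day"    # contains: Summer\'s Day
doubled = "Summer\\\\'s Day"  # contains: Summer\\'s Day

puts escape_single_quotes(plain)    # Summer\'s Day
puts escape_single_quotes(escaped)  # Summer\'s Day
puts escape_single_quotes(doubled)  # Summer\\\'s Day
```

Counting the backslashes instead of looking a fixed number of characters back is what lets this cope with one, two, three or more of them.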

> Would look-behind be an option here out of the box?

Surprisingly, I don't believe Ruby yet supports lookbehind.

> While I was experimenting I saw some behavior I don't understand and
> am hoping someone can explain it to me:
>
> prubel@cornet /tmp> cat /tmp/t.rb ; ruby /tmp/t.rb
> 2.times do
> # replace not a slash followed by a quote with not a slash
> # and an escaped quote.
> puts("\\'Summer's Day".gsub(/([^\\\\])'/,"*#{$1}\\\\'*"))

The above line is problematic for two reasons. First, when using the
replacement string version of gsub(), your string is interpolated
before the method is even called, let alone before any matches are
made, so $1 and friends are not set yet. Instead, try using a \1 in a
single-quoted string or \\1 in a double-quoted one to get the value
you're after.
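
As a rough sketch of that suggestion, here is the original call with a backreference swapped in for the interpolated $1. The pattern is simplified to [^\\] and the input drops the leading \' of the original example; both are assumptions made to keep the example short.

```ruby
# Both spellings produce the same two-character string: a backslash, then "1".
puts '\1' == "\\1"  # true

# gsub itself substitutes \1 for each match, so the captured "r" shows up
# on the very first call. The "\\\\" in the source is two backslashes in the
# string, which gsub emits as one literal backslash.
fixed = "Summer's Day".gsub(/([^\\])'/, "*\\1\\\\'*")
puts fixed  # Summe*r\'*s Day
```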

Second, I don't understand your pattern. [^\\\\] means ONE character
that is not a backslash and also not a backslash. It's identical to
[^\\]. I think you meant to say "not two backslashes", but that's a
little harder to express in a regex. And what if there are three
backslashes? See my solution above for a different approach.
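
A quick check of the claim that the two character classes are the same, using a handful of arbitrary sample strings (Ruby may warn about the duplicated range in verbose mode):

```ruby
doubled = /[^\\\\]/  # character class: anything except backslash (listed twice)
single  = /[^\\]/    # character class: anything except backslash

# Sample inputs chosen for illustration only.
samples = ["a", "'", "\\", "\\\\", "ab\\"]

# Compare the match position (or nil) that each pattern reports.
agree = samples.all? { |s| (doubled =~ s) == (single =~ s) }
puts agree  # true
```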

> puts $1
> end
> #end
> \'Summe*\'*s Day
> r
> \'Summe*r\'*s Day
> r
> prubel /tmp> ruby --version
> ruby 1.8.1 (2004-02-06) [i686-linux-gnu]
>
>
> I'm confused as to why the output is different for the two
> iterations. Why doesn't the r get placed in the first output?

Because $1 isn't set in time for the first replacement, but it is set
when the second string is built (set by the first match).
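
The timing can be reproduced in isolation. This sketch wraps the loop in a method (run_twice is a made-up name) so that $1 starts out nil, and again uses the simplified [^\\] pattern and an input without the leading \'.

```ruby
# Hypothetical reproduction of the two-iteration experiment.
def run_twice
  results = []
  2.times do
    # The replacement string is built before gsub is called, so "#{$1}"
    # sees whatever the previous match left behind: nil on the first pass
    # (interpolated as ""), "r" on the second.
    results << "Summer's Day".gsub(/([^\\])'/, "*#{$1}\\\\'*")
  end
  results
end

run_twice.each { |line| puts line }
# Summe*\'*s Day
# Summe*r\'*s Day
```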

Hope that helps.

James Edward Gray II



dblack

11/3/2004 7:11:00 PM


prubel

11/3/2004 8:03:00 PM


David,

David A. Black writes:
> Hi --
>
> On Thu, 4 Nov 2004, Paul Rubel wrote:

> I admit I get confused by escaping and stuff... but I can't quite
> picture the case you're describing. If a string contains a single
> quote:
>
> "abc'def"
>
> that's the same as:
>
> "abc\'def"
>
> So I don't think you'll actually see that backslash before the single
> quote when you scan the string.

Insightful. I suspect you're right and that I made things needlessly
complicated (at least at this point). When my code sees the string the
escaping should already have occurred.

>
> If you do see a slash -- i.e., if the string is:
>
> abc\'def
>
> then that would probably be generated with "abc\\'def", which would be
> equivalent to:
>
> "abc\\\'def"
>
> I'm afraid I didn't quite follow the Summer's Day example. Can you
> give another?


The context in which I saw the problem was the following:

The code takes in a name and a value and then evals them. If the
var_value has an unescaped single quote it would give an error that
the string was malformed.

to_eval = "#{var_name} = '#{var_value}'";
eval(to_eval, binding)

Looking at it now, I expect that a backslash in the var_value will
cause problems most of the time, as the string's contents get
interpolated a second time during the eval. Is there a better way to
set a value in a binding? The implementation has the option to set
values in a hash rather than in the binding, but I'd like to keep both
if possible.

thank you,
Paul
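
On the question of setting a value in a binding without building source text: one possible sketch, assuming a Ruby new enough to have Binding#local_variable_set (2.1 or later, so not the 1.8.1 used in this thread). The helper name set_in_binding and the sample values are invented for illustration.

```ruby
# Hypothetical helper: stores the value directly, so quotes and backslashes
# in var_value are never re-parsed as Ruby source.
def set_in_binding(target, var_name, var_value)
  target.local_variable_set(var_name.to_sym, var_value)
end

scope = binding
set_in_binding(scope, "title", "Summer's \\ Day")

# The variable lives in the binding; read it back the same way.
puts scope.local_variable_get(:title)  # Summer's \ Day
```

Because no string is evaluated, there is nothing to escape in the first place.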


prubel

11/3/2004 8:09:00 PM




James Edward Gray II writes:
> On Nov 3, 2004, at 12:12 PM, Paul Rubel wrote:
>
> > Hi,
> >
> > I'm trying to take a string and escape a single quote if it is not
> > already escaped. My first thought was to look at the string and if I
> > see a quote without a backslash before it put the backslash there.
>
> What about:
>
> gsub(/(\\*)'/) { |m| $1.length % 2 == 0 ? $1 + "\\'" : m }

That does look much better.

> > While I was experimenting I saw some behavior I don't understand and
> > am hoping someone can explain it to me:
> >
> > prubel@cornet /tmp> cat /tmp/t.rb ; ruby /tmp/t.rb
> > 2.times do
> > # replace not a slash followed by a quote with not a slash
> > # and an escaped quote.
> > puts("\\'Summer's Day".gsub(/([^\\\\])'/,"*#{$1}\\\\'*"))
>
> The above line is problematic for two reasons. First, when using the
> replacement string version of gsub(), your string is interpolated
> before the method is even called let alone before any matches are made
> so $1 and friends are not set. Instead, try using a \1 in a single
> quoted string or \\1 in a double to get the value you're after.

I should have known. Thanks for the explanation.

> Second, I don't understand your pattern. [^\\\\] means ONE character
> that is not a backslash and also not a backslash. It's identical to
> [^\\]. I think you meant to say "not two backslashes", but that's a
> little harder to express in a regex. And what if there are three
> backslashes? See my solution above for a different approach.

I meant to say "not a backslash", but the interpolation in the
replacement confused me. After reading your response and thinking a
bit, I believe my head has been wrapped around the issue.


> Hope that helps.

Very much.
thank you,
Paul


Florian Gross

11/3/2004 9:52:00 PM


James Edward Gray II wrote:

>> I'm trying to take a string and escape a single quote if it is not
>> already escaped. My first thought was to look at the string and if I
>> see a quote without a backslash before it put the backslash there.
>> Would look-behind be an option here out of the box?
>
> Surprisingly, I don't believe Ruby yet supports lookbehind.

However it does support look-ahead which is enough in this case:

"foo bar don't \\'".gsub(/((?!\\).(?:\\{2})*)'/, "\\1\\\\'")
# result: foo bar don\'t \'


And since the escape string is only a single character:

"foo bar don't \\'".gsub(/([^\\](?:\\{2})*)'/, "\\1\\\\'")
# result: foo bar don\'t \'


(Note that this is basically your Regexp, but with some of the filtering
logic moved from the block to the Regexp itself. The replacement string
looks disgusting. I think your solution is way clearer.)

Here is a sample with a multiple-width escape string:

"foo bar don't ESC'".gsub(/((?!ESC).{3}(?:(?:ESC){2})*)'/, "\\1ESC'")
# result: foo bar donESC't ESC'
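
For comparison, a small sketch that runs the block-based version from earlier in the thread next to the single-character regex version above. The method names are made up. One difference worth knowing: the regex version needs a non-backslash character in front of the quote, so a quote at the very start of the string is left alone.

```ruby
# Block-based approach from earlier in the thread (hypothetical method name).
def escape_with_block(str)
  str.gsub(/(\\*)'/) { |m| $1.length.even? ? $1 + "\\'" : m }
end

# Regex-only approach from this post (hypothetical method name).
def escape_with_regex(str)
  str.gsub(/([^\\](?:\\{2})*)'/, "\\1\\\\'")
end

sample = "foo bar don't \\'"     # contains: foo bar don't \'
puts escape_with_block(sample)   # foo bar don\'t \'
puts escape_with_regex(sample)   # foo bar don\'t \'

leading = "'tis"
puts escape_with_block(leading)  # \'tis
puts escape_with_regex(leading)  # 'tis
```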