Tim Pease
8/17/2007 2:54:00 PM
On 8/17/07, Phlip <phlip2005@gmail.com> wrote:
> Rubies:
>
> Someone didn't escape their & in their HTML correctly. Let's fix it.
>
> This regexp correctly does not escape &dude, because we only want to escape
> raw & markers:
>
> p "yo &dude".gsub(/&([^a-z])/i, '&amp;\1')
>
> That passed "yo &dude" thru unchanged. (I am aware "dude" has no ; on the
> end; we are leaving that optional, for whatever reason...)
>
> Now escape & followed by a non-alphabetic character:
>
> p "yo & dude".gsub(/&([^a-z])/i, '&amp;\1')
>
> That correctly provides: "yo &amp; dude"
>
> Now how to escape "yo && dude"? Note that the ([^a-z]) consumes the second
> &, leading to this incorrect output:
>
> "yo &amp;& dude"
>
> The only workaround I can think of is to run the Regexp twice:
>
> x = "yo && dude"
> 2.times{ x.gsub!(/&([^a-z])/i, '&amp;\1') }
> p x
>
> Can someone help my feeb Regexp skills and get a "yo &amp;&amp; dude" in one
> line?
>
str = "yo && dude"
str.gsub!( %r/&(?=[^a-z])/i, '&amp;')
p str
=> "yo &amp;&amp; dude"
The regular expression trick here is the (?=re) construct, called a
"zero-width positive lookahead". It requires that re match at that
position, but it does not consume any of the string; so the gsub!
replaces only the & itself, and the character after it stays available
for the next match.
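
For comparison, here is a small sketch that puts the consuming pattern and the lookahead pattern side by side. The constant names and the escape_bare_ampersands helper are made up for illustration and are not from the thread. The helper uses a negative lookahead, (?![a-z]), which is an assumption about what you might want: unlike (?=[^a-z]), it also matches an & at the very end of the string, where there is no following character to look at.

```ruby
# Hypothetical names, chosen for this sketch only.
CONSUMING = /&([^a-z])/i    # original pattern: eats the character after the &
LOOKAHEAD = /&(?=[^a-z])/i  # lookahead pattern: only peeks at that character

# One pass with the consuming pattern misses the second &,
# because the first match already swallowed it.
single_pass = "yo && dude".gsub(CONSUMING, '&amp;\1')

# One pass with the lookahead escapes both.
with_lookahead = "yo && dude".gsub(LOOKAHEAD, '&amp;')

# Assumed variant: a negative lookahead, "not followed by a letter",
# which also covers an & that is the last character of the string.
def escape_bare_ampersands(text)
  text.gsub(/&(?![a-z])/i, '&amp;')
end

p single_pass
p with_lookahead
p escape_bare_ampersands("yo &")
```

Either lookahead still leaves "yo &dude" alone, since the & there is followed by a letter.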
Blessings,
TwP