comp.lang.ruby

Regexp help - Negative lookahead before across word boundaries

Gavin Kistner

2/18/2005 9:40:00 PM

Given a string like this:
"this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

I want to match all the global identifiers which are not 'this', and I
'need' to do so without consuming any other characters.

This regexp:
/[^.]\b(?!this)[a-zA-Z_]\w*\b/
works, but it consumes the preceding character.

I thought this regexp would work:
/(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i
but now I realize why it doesn't: the position just after the
period satisfies both the negative lookahead and the word boundary.


Help?
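
To make the failure concrete, here is a small sketch (an editorial illustration, not part of the original post) that runs both patterns over the sample string. The constant names `ATTEMPT_CONSUMING` and `ATTEMPT_LOOKAHEAD` are made up for the example.

```ruby
s = "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

# First attempt: [^.] is a real character match, so each result
# carries the character that preceded the identifier.
ATTEMPT_CONSUMING = /[^.]\b(?!this)[a-zA-Z_]\w*\b/
p s.scan(ATTEMPT_CONSUMING)
# [" foo", " jim", " yow"]

# Second attempt: (?!\.) looks AHEAD, so it only checks that the next
# character is not a period. At the position right after a period the
# next character is a letter, so property names slip through.
ATTEMPT_LOOKAHEAD = /(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i
p s.scan(ATTEMPT_LOOKAHEAD)
# ["position", "x", "foo", "bar", "whee", "jim", "jam", "yow", "jorgle"]
```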

2 Answers

Robert Klemme

2/18/2005 10:58:00 PM

"Phrogz" <gavin@refinery.com> wrote in message
news:1108762779.477156.48390@f14g2000cwb.googlegroups.com...
> Given a string like this:
> "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
>
> I want to match all the global identifiers which are not 'this', and I
> 'need' to do so without consuming any other characters.
>
> This regexp:
> /[^.]\b(?!this)[a-zA-Z_]\w*\b/
> works, but it consumes the preceding character.
>
> I thought this regexp would work:
> /(?!\.)\b(?!this)[a-zA-Z_]\w*\b/i
> but now I realize why it doesn't. (Because the position after the
> period satisfies the negative lookahead and the word boundary.)
>
>
> Help?

That's a tough one. I think you need negative lookbehind, something that
the standard Ruby regexp engine does not have. I think Oniguruma will suit
you better.
http://raa.ruby-lang.org/project/...
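
For readers on a newer Ruby: since 1.9 the built-in engine (Oniguruma, later Onigmo) supports negative lookbehind, so the idea can be sketched as a single zero-width pattern. This is only a sketch. The constant name `GLOBAL_IDENT` and the `global.` prefix are made up here, and `(?!this\b)` is a small deviation from the original `(?!this)` so that identifiers that merely start with "this" (such as `thistle`) are not rejected.

```ruby
s = "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"

# (?<!\.)    negative lookbehind: not preceded by a period (zero-width)
# (?!this\b) negative lookahead: the identifier is not exactly "this"
GLOBAL_IDENT = /(?<!\.)\b(?!this\b)[a-zA-Z_]\w*\b/

p s.scan(GLOBAL_IDENT)
# ["foo", "jim", "yow"]

# Only the identifier itself is consumed, so gsub can rewrite each
# match in place without touching the surrounding characters.
puts s.gsub(GLOBAL_IDENT) { |id| "global.#{id}" }
# this.position.x = global.foo.bar.whee * global.jim.jam - global.yow / this.jorgle
```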

However, you can do it with the standard engine if you allow for more
processing steps:

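>> s = "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
=> "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
>> s.scan(/[\w.]+/).reject{|m| /^this(\.|$)/ =~ m}.map{|m| m.split('.')[0]}
=> ["foo", "jim", "yow"]
>> s.scan(/[\w.]+/).reject{|m| /^this(\.|$)/ =~ m}.map{|m| /^\w+/.match(m)[0]}
=> ["foo", "jim", "yow"]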

Kind regards

robert

William James

2/19/2005 2:19:00 AM

Phrogz wrote:
> Given a string like this:
> "this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
>
> I want to match all the global identifiers which are not 'this', and I
> 'need' to do so without consuming any other characters.
>
> This regexp:
> /[^.]\b(?!this)[a-zA-Z_]\w*\b/
> works, but it consumes the preceding character.
>

s="bar this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
p s.scan( /(?:^|[^.])\b(?!this)([a-zA-Z_]\w*)\b/ ).flatten

produces

["bar", "foo", "jim", "yow"]
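
The same capture trick can also be used for in-place replacement on an engine without lookbehind: capture the consumed prefix character and put it back. The following is a sketch only. The names `PREFIXED_IDENT` and `prefix_globals` and the `global.` prefix are hypothetical, and `(?!this\b)` is used instead of `(?!this)` so that names like `thistle` still count.

```ruby
# Group 1 holds the consumed prefix (empty at the start of a line,
# otherwise one non-period character); group 2 holds the identifier.
PREFIXED_IDENT = /(^|[^.])\b(?!this\b)([a-zA-Z_]\w*)\b/

# Re-emit the prefix so that only the identifier is effectively changed.
def prefix_globals(str, prefix = "global.")
  str.gsub(PREFIXED_IDENT) { "#{$1}#{prefix}#{$2}" }
end

s = "bar this.position.x = foo.bar.whee * jim.jam - yow / this.jorgle"
puts prefix_globals(s)
# global.bar this.position.x = global.foo.bar.whee * global.jim.jam - global.yow / this.jorgle
```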