comp.lang.ruby

if VAR =~ one of a list?

healyzh

6/6/2006 11:02:00 PM

I'm in the process of learning Ruby, and just finished my first script that
is intended to do actual work. In doing so I came across an interesting
question.

How would I efficiently write the following statement?

if my_variable =~ /^some/i || my_variable =~ /^soome/i ||
   my_variable =~ /^save/i || my_variable =~ /^Catch/i ||
   my_variable =~ /^BLARG/i || my_variable =~ /^safe/i ||
   my_variable =~ /^blarg/i || my_variable =~ /^wrest/i ||
   my_variable =~ /^template/

Zane

14 Answers

Tim Hunter

6/6/2006 11:39:00 PM

healyzh@aracnet.com wrote:
> I'm in the process of learning Ruby, and just finished my first script that
> is intended to do actual work. In doing so I came across an interesting
> question.
>
> How would I efficiently write the following statement?
>
> if my_variable =~ /^some/i || my_variable =~ /^soome/i || my_variable
> =~ /^save/i || my_variable =~ /^Catch/i || my_variable =~ /^BLARG/i ||
> my_variable =~ /^safe/i || my_variable =~ /^blarg/i || my_variable =~
> /^wrest/i ||my_variable =~ /^template/
>
> Zane
>
Untested, but I believe this will work. What's more interesting is _why_
it works.

case my_variable
when /^some/i, /^soome/i, /^save/i  # ...plus all your other regexps
  # do whatever
else
  # do something else
end
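
As for why it works: `case` compares each `when` value against the subject with `===`, and `Regexp#===` returns true when the pattern matches. A minimal sketch of that, assuming the patterns from the original question; the names `PATTERNS`, `matches_any?` and `classify` are made up for illustration:

```ruby
# Patterns taken from the original question (note that /^template/ has no /i flag).
PATTERNS = [/^some/i, /^soome/i, /^save/i, /^catch/i, /^blarg/i,
            /^safe/i, /^wrest/i, /^template/].freeze

# Spells out what `when` does internally: it calls pattern === subject.
def matches_any?(text)
  PATTERNS.any? { |re| re === text }
end

# The same test written as a case expression; the splat expands the array
# into a comma-separated list of `when` values.
def classify(text)
  case text
  when *PATTERNS then :known
  else :unknown
  end
end
```

Keeping the patterns in an array means the list can grow without touching the condition itself.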

Jeff Schwab

6/7/2006 3:47:00 AM

healyzh@aracnet.com wrote:
> I'm in the process of learning Ruby, and just finished my first script that
> is intended to do actual work. In doing so I came across an interesting
> question.
>
> How would I efficiently write the following statement?
>
> if my_variable =~ /^some/i || my_variable =~ /^soome/i || my_variable
> =~ /^save/i || my_variable =~ /^Catch/i || my_variable =~ /^BLARG/i ||
> my_variable =~ /^safe/i || my_variable =~ /^blarg/i || my_variable =~
> /^wrest/i ||my_variable =~ /^template/

if my_variable =~
/^(?:some|soome|save|catch|blarg|safe|wrest|template)/i
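
If the word list lives in an array, the same alternation can be built with `Regexp.union` instead of being typed by hand. This is only a sketch: `WORDS`, `PREFIX_RE` and `known_prefix?` are hypothetical names, and unlike the original condition it treats `template` case-insensitively as well.

```ruby
WORDS = %w[some soome save catch blarg safe wrest template].freeze

# Regexp.union escapes each string and joins them with "|". We interpolate
# its .source rather than the Regexp object itself, because an interpolated
# Regexp carries its own flags and would switch /i off inside the group.
PREFIX_RE = /^(?:#{Regexp.union(WORDS).source})/i

def known_prefix?(text)
  PREFIX_RE.match?(text) # Regexp#match? needs Ruby 2.4 or later
end
```

On older Ruby versions `!!(text =~ PREFIX_RE)` should give the same boolean.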

Bit Twiddler

6/7/2006 5:55:00 PM

> if my_variable =~
> /^(?:some|soome|save|catch|blarg|safe|wrest|template)/i

Why is the "?:" required?

Thanks,
BT


Jeff Schwab

6/8/2006 1:16:00 AM

Bit Twiddler wrote:
>> if my_variable =~
>> /^(?:some|soome|save|catch|blarg|safe|wrest|template)/i
>
> Why is the "?:" required?

Because the OP asked that the statement be written "efficiently." Text
between parentheses ordinarily is captured, i.e. stored in a special
variable, after each match. The ?: tells the regular expression parser
that the parentheses are being used only for grouping, and thereby
avoids the overhead of capturing text.
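
The difference is easy to see on a shortened version of the pattern. A small sketch; the variable names are illustrative only:

```ruby
capturing     = /^(some|save)/i
non_capturing = /^(?:some|save)/i

with_group    = capturing.match("Saved")
without_group = non_capturing.match("Saved")

# Plain parentheses store the text they matched as group 1.
captured = with_group.captures       # ["Save"]

# (?: ... ) only groups, so nothing is stored beyond the overall match.
not_captured = without_group.captures # []
whole_match  = without_group[0]       # "Save"
```

Both patterns accept exactly the same strings; only the bookkeeping after the match differs.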

Bit Twiddler

6/8/2006 1:53:00 PM

"Jeffrey Schwab" <jeff@schwabcenter.com> wrote in message
news:vTKhg.15710$Qg.11779@tornado.southeast.rr.com...
> Because the OP asked that the statement be written "efficiently." Text
> between parentheses ordinarily is captured, i.e. stored in a special
> variable, after each match. The ?: tells the regular expression parser
> that the parentheses are being used only for grouping, and thereby avoids
> the overhead of capturing text.

Ahh... Thanks!!


Robert Klemme

6/8/2006 2:37:00 PM

Bit Twiddler wrote:
> "Jeffrey Schwab" <jeff@schwabcenter.com> wrote in message
> news:vTKhg.15710$Qg.11779@tornado.southeast.rr.com...
>> Because the OP asked that the statement be written "efficiently." Text
>> between parentheses ordinarily is captured, i.e. stored in a special
>> variable, after each match. The ?: tells the regular expression parser
>> that the parentheses are being used only for grouping, and thereby avoids
>> the overhead of capturing text.

When talking about efficiency then the pattern can be made even better
(manual, probably incomplete optimization):

/^(?:s(?:o(?:me|ome)|a(?:ve|fe))|catch|blarg|wrest|template)/i

I.e. build a Trie and convert that into a regexp.
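
A rough sketch of how that conversion could be automated, assuming a nested-hash trie. `build_trie` and `trie_to_source` are hypothetical helpers written for this illustration, not part of any library:

```ruby
# Build a nested-hash trie; the :end key marks the end of a word.
def build_trie(words)
  words.each_with_object({}) do |word, trie|
    node = trie
    word.each_char { |ch| node = (node[ch] ||= {}) }
    node[:end] = true
  end
end

# Turn a trie node into regexp source text, one alternative per child.
def trie_to_source(node)
  branches = node.reject { |key, _| key == :end }
                 .map { |ch, child| Regexp.escape(ch) + trie_to_source(child) }
  return "" if branches.empty?

  source = branches.join("|")
  # Group when there are several alternatives, or when a word also ends
  # here and the remainder therefore has to be optional.
  source = "(?:#{source})" if branches.size > 1 || node[:end]
  source += "?" if node[:end]
  source
end

words  = %w[some soome save catch blarg safe wrest template]
source = trie_to_source(build_trie(words))
regexp = Regexp.new("^" + source, Regexp::IGNORECASE)
```

Because Ruby hashes keep insertion order, the generated source for this word list comes out the same as the hand-written pattern above.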

Cheers

robert

Jeff Schwab

6/12/2006 3:03:00 PM

Robert Klemme wrote:
> When talking about efficiency then the pattern can be made even better
> (manual, probably incomplete optimization):
>
> /^(?:s(?:o(?:me|ome)|a(?:ve|fe))|catch|blarg|wrest|template)/i

True, since Ruby has only an NFA (rather than a DFA) regex engine. I
find this equivalent code a little bit more readable, though:

/^(?:s(?:oo?m|a[vf])e|catch|blarg|wrest|template)/i
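
One way to gain some confidence that the two patterns really accept the same strings is to compare them over many inputs. This is a brute-force sanity check rather than a proof; the constant and method names are made up, and the alphabet only covers the letters of the `s...` branch, which is where the patterns differ.

```ruby
TRIE_RE    = /^(?:s(?:o(?:me|ome)|a(?:ve|fe))|catch|blarg|wrest|template)/i
COMPACT_RE = /^(?:s(?:oo?m|a[vf])e|catch|blarg|wrest|template)/i

ALPHABET = %w[s o m e a v f].freeze

# Every string over ALPHABET with length 0..max_len.
def all_strings(max_len)
  (0..max_len).flat_map { |n| ALPHABET.repeated_permutation(n).map(&:join) }
end

# Inputs on which the two regexps give different answers.
def disagreements(re_a, re_b, inputs)
  inputs.reject { |s| re_a.match?(s) == re_b.match?(s) }
end

differences = disagreements(TRIE_RE, COMPACT_RE, all_strings(5))
```

An empty `differences` array means the two patterns agreed on every generated string.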

Robert Klemme

6/12/2006 7:52:00 PM

Jeffrey Schwab wrote:
> True, since Ruby has only an NFA (rather than a DFA) regex engine. I
> find this equivalent code a little bit more readable, though:
>
> /^(?:s(?:oo?m|a[vf])e|catch|blarg|wrest|template)/i

Probably. But readable != efficient. There's definitely more
backtracking in this RX than in the other one.

Cheers

robert

Jeff Schwab

6/12/2006 11:58:00 PM

Robert Klemme wrote:
> Probably. But readable != efficient. There's definitely more
> backtracking in this RX than in the other one.

Where? I don't see it. If anything, it looks like there is less
potential backtracking, since the e after the first alternation is not
duplicated.

Robert Klemme

6/13/2006 11:26:00 AM

Jeffrey Schwab wrote:
> Where? I don't see it. If anything, it looks like there is less
> potential backtracking, since the e after the first alternation is not
> duplicated.

Every "?", "*", "+" and "|" can cause backtracking. The plain tree
approach (my RX) immediately fails on the first character of every
branch if it doesn't match while IMHO your RX needs to test more
characters before it can fail. That's where the backtracking occurs.

Do you know the regexp coach? It makes some aspects of RX more visible
- unfortunately I could not find a way that highlights backtracking
but the tree view is quite informative.

http://www.weitz.de/re...

If you test both RX against "soomt" on the "Step" tab you'll notice that
you need 16 clicks on "next" for my RX and 18 for yours until the match
finally fails.
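
Ruby does not expose step counts like that, but a rough timing comparison on the same failing input is possible with the standard `benchmark` library. Treat this as a sketch only: the constant and method names are invented here, and the numbers depend heavily on the Ruby version and its regex engine, so they may not show the difference the step counts suggest.

```ruby
require "benchmark"

TRIE_RE    = /^(?:s(?:o(?:me|ome)|a(?:ve|fe))|catch|blarg|wrest|template)/i
COMPACT_RE = /^(?:s(?:oo?m|a[vf])e|catch|blarg|wrest|template)/i

# Seconds spent attempting the same (failing) match `iterations` times.
def time_failed_matches(regexp, input, iterations)
  Benchmark.realtime { iterations.times { regexp.match?(input) } }
end

timings = {
  trie:    time_failed_matches(TRIE_RE, "soomt", 100_000),
  compact: time_failed_matches(COMPACT_RE, "soomt", 100_000)
}

timings.each { |name, seconds| puts format("%-8s %.4fs", name, seconds) }
```

Running it several times and comparing the spread is more informative than any single measurement.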

Kind regards

robert