[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Timeout and Exponetial Regexes

eden li

9/28/2006 8:11:00 AM

Side-stepping the debate about running exponential regular expressions
first place, I'd like to catch a long running regex using
Timeout.timeout():

$ ruby -v && irb -r timeout
ruby 1.8.4 (2005-12-24) [i486-linux]
irb(main):001:0> Timeout.timeout 5 do ("a"*300)+"b" =~ /^(a*)*$/ end

Except that it appears the mechanism the Timeout module uses can't seem
to do this. After looking at the source of Timeout.timeout(), it seems
that the regex is blocking all threads and disallowing the Timeout
mechanism from detecting the runaway block. Indeed, the following
never returns:

irb(main):002:0> t = Thread.new do ("a"*300)+"b" =~ /^(a*)*$/ end

A few questions:
- Why do regexes seem to stop up all threads? Is this fixed in ruby
1.9?
- Is there a way to offload this processing into another thread
without resorting to DRb?
- Is there a better way to detect this sort of thing?

4 Answers

MonkeeSage

9/28/2006 5:51:00 PM

0

eden li wrote:
> irb(main):001:0> Timeout.timeout 5 do ("a"*300)+"b" =~ /^(a*)*$/ end

Hmm...is that a valid expression? Capture "a" zero or more times, and
then repeat that capture zero or more times?! What exactly does that
mean?

perl5 runs it but finds no match:

("a" x 300) . "b" =~ /^(a*)*$/

python spits out an error:

import re
s = ("a"*300)+"b"
re.search(r'^(a*)*$', s)

# ...
# raise error, v # invalid expression
# sre_constants.error: nothing to repeat

And ruby 1.9. runs it but finds no match (same as perl).

So I think that the expresion is broken and ruby 1.8 has a bug.

Regards,
Jordan

Ara.T.Howard

9/28/2006 6:17:00 PM

0

eden li

10/5/2006 8:47:00 AM

0

It's a perfectly valid regex that happens to run in exponential time if
the regex engine that tries to run it isn't smart about backtracking
(it appears, in this case, Perl and Ruby 1.9 are, Ruby 1.8 is not, and
Python is smart enough to reject it because it knows its regex engine
can't handle it).

However, the point of my post is not the regex. What I was trying to
find out was why Ruby seems to block threads whenever a regex is
running. I just gave an example that would fit into one line.

I run a batch of regexes on a batch of input on a regular basis, and I
want some way to catch an exponential regex if I happen to accidentally
write one. Right now it appears there's no way to do this in Ruby 1.8.

... or is there?

MonkeeSage wrote:
> eden li wrote:
> > irb(main):001:0> Timeout.timeout 5 do ("a"*300)+"b" =~ /^(a*)*$/ end
>
> Hmm...is that a valid expression? Capture "a" zero or more times, and
> then repeat that capture zero or more times?! What exactly does that
> mean?
>
> perl5 runs it but finds no match:
>
> ("a" x 300) . "b" =~ /^(a*)*$/
>
> python spits out an error:
>
> import re
> s = ("a"*300)+"b"
> re.search(r'^(a*)*$', s)
>
> # ...
> # raise error, v # invalid expression
> # sre_constants.error: nothing to repeat
>
> And ruby 1.9. runs it but finds no match (same as perl).
>
> So I think that the expresion is broken and ruby 1.8 has a bug.
>
> Regards,
> Jordan


Rick DeNatale

10/5/2006 4:17:00 PM

0

On 10/5/06, eden <eden.li@gmail.com> wrote:

> However, the point of my post is not the regex. What I was trying to
> find out was why Ruby seems to block threads whenever a regex is
> running.

I'm not entirely sure, but I suspect that it's because of the use of
green threads, rather than native threads.

Native threads can get scheduled whenever there's a hardware interrupt.

Thread switching for green threads only happens when certain internal
functions are called e.g. rb_thread_schedule. I suspect that the
regex engine isn't calling this, or if it does it doesn't call it very
often.

If this is the case it's not really blocking threads explicitly, but
the effect is the same.

Just a theory, I haven't slogged through the code thoroughly enough to prove it.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...