comp.lang.ruby

Anchored Regexp get stalled or hung

rnicz

11/18/2004 9:17:00 PM

Dear rubyists!

I dare to post this question in spite of the fact that lately there have
been many posts about false RE bug reports.

I've tried to build the following regexp

a=/^-{150}/

and it turns out that this expression hangs the Ruby interpreter.
I've checked that the re{m,n} expression allows, by design, as many as
32766 repetitions. An expression such as

/-{32766}/

works fine, and

/-{32767}/

produces the error message:

'too big quantifier in {,}: /-{32767})/'

But when the regexp is anchored at the front, you cannot specify more
than 127 repetitions:

/^-{127}/

is ok, but

/^-{128}/

hangs interpreter. Is it a bug or not?

--
Best regards
RNicz
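
For anyone who wants to probe the boundary described above, here is a minimal sketch. It assumes a Ruby whose regexp engine does not hang on these patterns (for example, one built with Oniguruma); on the affected 1.8 builds the anchored cases from 128 upward would presumably never return. The helper name exact_run_matches? is made up for illustration and is not from the original post.

```ruby
# Probe fixed-count quantifiers around the 127/128 boundary.
# Assumption: the running Ruby's regexp engine handles these patterns
# (on the legacy engine the anchored cases >= 128 reportedly hang).
def exact_run_matches?(count, anchored: true)
  source = "#{anchored ? '^' : ''}-{#{count}}"
  re = Regexp.new(source)
  subject = "-" * count
  # Exactly `count` dashes must match; one dash fewer must not.
  !(re =~ subject).nil? && (re =~ subject[0...-1]).nil?
end

[126, 127, 128, 129, 150].each do |n|
  anchored_ok = exact_run_matches?(n)
  plain_ok    = exact_run_matches?(n, anchored: false)
  puts "count=#{n} anchored=#{anchored_ok} unanchored=#{plain_ok}"
end
```

On a working engine every line should report true for both the anchored and the unanchored form.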





6 Answers

Joel VanderWerf

11/18/2004 9:21:00 PM

rnicz wrote:
> /^-{128}/
>
> hangs interpreter. Is it a bug or not?

Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
something that Oniguruma fixes.


Simon Strandgaard

11/18/2004 9:35:00 PM

On Thursday 18 November 2004 22:20, Joel VanderWerf wrote:
> rnicz wrote:
> > /^-{128}/
> >
> > hangs interpreter. Is it a bug or not?
>
> Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
> something that Oniguruma fixes.


I can reproduce this problem in 1.8.1. I guess this is a bug in the GNU
engine.


This is showing off, but my own regexp engine can deal with /^-{128}/

bash-2.05b$ irb
irb(main):001:0> re = NewRegexp.new("^-{128}")
=> #<NewRegexp:0x403a50f0 @source="^-{128}", @scanner=#<Scanner:0x403a50b4
@root=#<Root:0x403a4e5c @number_of_captures=2,
@node=#<ScannerHierarchy::Capture:0x403a4de4
@succ=#<ScannerHierarchy::Anchor:0x403a4d80 @anchor_type=:line_begin,
@succ=#<ScannerHierarchy::BeginMatch:0x403a4d44
@succ=#<ScannerHierarchy::RepeatGreedy:0x403a4d6c @max=128,
@succ=#<ScannerHierarchy::Capture:0x403a4df8
@succ=#<ScannerHierarchy::Last:0x403a4e34>, @register=1>, @min=128,
@index=nil, @pattern=#<ScannerHierarchy::Inside:0x403a4d1c @succ=EndPattern,
@set=#<RangeSet:0x403a4f24 @codepoints=[45]>>>>>, @register=0>,
@parser=+-Sequence
+-Anchor line_begin
+-Repeat greedy{128,128}
+-Inside set="-">>>
irb(main):002:0> re.match(("-"*128)+"x")
=> #<NewMatchData:0x403fb068 @captures=[],
@matched_string="--------------------------------------------------------------------------------------------------------------------------------",
@positions=[[0, 128]], @post_match="x",
@string="--------------------------------------------------------------------------------------------------------------------------------x",
@match_array=["--------------------------------------------------------------------------------------------------------------------------------"],
@pre_match="", @length=128, @offset=0>
irb(main):003:0> re.match(("-"*127)+"x")
=> nil
irb(main):004:0> puts re.tree
+-Sequence
  +-Anchor line_begin
  +-Repeat greedy{128,128}
    +-Inside set="-"
=> nil
irb(main):005:0>


(sorry for showing off)

--
Simon Strandgaard
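
The same two checks can be sketched with Ruby's built-in Regexp and MatchData, assuming an engine that does not hang on the anchored pattern. The helper match_summary is hypothetical and only collects the fields that the NewMatchData output above displays.

```ruby
# Mirror the irb session above using the built-in Regexp class.
# Assumption: the running Ruby's engine copes with /^-{128}/.
def match_summary(source, subject)
  m = Regexp.new(source).match(subject)
  return nil if m.nil?
  {
    matched_length: m[0].length, # length of the matched text
    positions: m.offset(0),      # [start, end] of the whole match
    pre_match: m.pre_match,      # text before the match
    post_match: m.post_match     # text after the match
  }
end

p match_summary("^-{128}", ("-" * 128) + "x")
p match_summary("^-{128}", ("-" * 127) + "x")
```

The first call reports a 128-character match at positions [0, 128] with "x" left over; the second returns nil because only 127 dashes are available.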


Jamis Buck

11/18/2004 9:42:00 PM

Joel VanderWerf wrote:
> rnicz wrote:
>
>> /^-{128}/
>>
>> hangs interpreter. Is it a bug or not?
>
>
> Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
> something that Oniguruma fixes.

I'm using 1.8.2 with Oniguruma, and it does not hang. I'm guessing it's
something with the legacy regexp engine.

- Jamis

--
Jamis Buck
jgb3@email.byu.edu
http://www.jamisbuck...


rnicz

11/18/2004 10:39:00 PM

Joel VanderWerf wrote:
>
> Maybe it's something that Oniguruma fixes.

How could it happen that I didn't know about Oniguruma? Thank you very much!
--
RNicz


ts

11/19/2004 11:31:00 AM

>>>>> "r" == rnicz <rnicz@fibernet.pl> writes:

r> I've tried to make following regexp

r> a=/^-{150}/

try this

uln% diff -u regex.c~ regex.c
--- regex.c~ 2004-10-27 04:46:51.000000000 +0200
+++ regex.c 2004-11-19 12:25:39.000000000 +0100
@@ -1011,8 +1011,8 @@
 {
   int mcnt;
   int max = 0;
-  char *p = start;
-  char *pend = end;
+  unsigned char *p = start;
+  unsigned char *pend = end;
   char *must = 0;
 
   if (start == NULL) return 0;
uln%


uln% ./ruby -e 'a = "-" * 152; /^-{150}/ =~ a; p $&.size'
150
uln%




Guy Decoux
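
The post does not say why switching to unsigned char helps. One plausible reading, offered here only as an assumption, is that a count byte of 128 or more was being read through a signed char and therefore came out negative, which would fit the 127/128 boundary reported earlier. The sketch below just demonstrates that reinterpretation in Ruby; the helper names as_signed_byte and matched_size are made up for illustration.

```ruby
# Reinterpret an unsigned 8-bit value as a signed 8-bit value.
# Assumption: this mirrors what a C compiler with signed plain `char`
# does when it reads such a byte; it is not taken from regex.c itself.
def as_signed_byte(n)
  [n].pack("C").unpack("c").first
end

[127, 128, 150].each do |n|
  puts "unsigned=#{n} signed=#{as_signed_byte(n)}"
end

# The same check as the one-liner above, on an engine that handles it.
def matched_size(re, str)
  m = re.match(str)
  m && m[0].size
end

puts matched_size(/^-{150}/, "-" * 152)
```

127 is the largest value that survives the reinterpretation unchanged; 128 becomes -128 and 150 becomes -106.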


rnicz

11/19/2004 9:08:00 PM

That's great: the first answer after 4 minutes, and a patch within 12 hours.

Thank you rubyists!

--
RNicz