comp.lang.ruby

#scan with or'd (`|`) subexpressions.

T. Onoma

11/11/2004 2:30:00 PM

Does the new Ruby regexp engine do this?

irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
=> [["1", "2", nil, nil], [nil, nil, "3", "4"]]
irb(main):002:0>

Why would all the subexpressions be listed when there is an `|` (or) used? I
expected:

=> [["1", "2"], ["3", "4"]]

T.
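For reference, a minimal sketch of the behavior described in the question. `scan` keeps one slot per capture group in the whole pattern, so groups belonging to the alternative that did not match come back as `nil`. Assuming only the shape of the result matters, dropping the `nil`s afterwards yields the expected arrays, at the cost of no longer knowing which alternative matched. The variable names are illustrative only.

```ruby
# One slot per capture group in the whole pattern; groups from the
# alternative that did not participate in a match are nil.
raw = '1234'.scan(/(1)(2)|(3)(4)/)

# Dropping the nils gives the shape asked for in the question, but it
# discards the information about which alternative produced each match.
compacted = raw.map(&:compact)

p raw
p compacted
```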


Yukihiro Matsumoto

11/11/2004 4:52:00 PM

Hi,

In message "Re: #scan with or'd (`|`) subexpressions."
on Thu, 11 Nov 2004 23:29:58 +0900, "trans. (T. Onoma)" <transami@runbox.com> writes:

|Does the new Ruby regexp engine do this?
|
| irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| irb(main):002:0>
|
|Why would all the subexpressions be listed when there is an `|` (or) used? I
|expected:
|
| => [["1", "2"], ["3", "4"]]

You will never know which subexpression is matched, if you get your
expected result. Is there any reason /(1|3)(2|4)/ is not sufficient?

matz.
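A small sketch comparing the two patterns from this reply. On the input `'1234'` they yield the same pairs, but they are not equivalent in general: the merged form also accepts mixed pairs such as `'14'`, which the original alternation rejects. The variable names are illustrative only.

```ruby
separate = /(1)(2)|(3)(4)/   # pattern from the question
merged   = /(1|3)(2|4)/      # pattern suggested in the reply

# On '1234' both describe the same two pairs.
from_separate = '1234'.scan(separate).map(&:compact)
from_merged   = '1234'.scan(merged)

# On '14' they differ: the merged form pairs any first digit with any
# second digit, while the alternation only accepts "12" or "34".
mixed_separate = '14'.scan(separate)
mixed_merged   = '14'.scan(merged)

p from_separate, from_merged, mixed_separate, mixed_merged
```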


Peter

11/11/2004 5:00:00 PM

T. Onoma

11/11/2004 5:32:00 PM

Hi Matz,

On Thursday 11 November 2004 11:52 am, Yukihiro Matsumoto wrote:
| Hi,
|
| In message "Re: #scan with or'd (`|`) subexpressions."
| on Thu, 11 Nov 2004 23:29:58 +0900, "trans. (T. Onoma)" <transami@runbox.com> writes:
|
| |Does the new Ruby regexp engine do this?
| |
| | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| | irb(main):002:0>
| |
| |Why would all the subexpressions be listed when there is an `|` (or) used?
| | I expected:
| |
| | => [["1", "2"], ["3", "4"]]
|
| You will never know which subexpression is matched, if you get your
| expected result.

Actually, trying to figure out which subexpression matched is _exactly_ my
problem. I have dozens of regexps of the form:

(#{spre})(#{start})(#{spost})(.*?)(#{epre})(#{end})(#{epost})

All of these are in an array (r) and strung together:

re = Regexp.new( r.join('|') )

Then

m = []
str.scan( re ) { m << $~ }

How do I know which array index (r[?]) produced the match? How does the
current behavior allow me to figure out which alternative matched?
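One possible way to answer this with the nil-padded result, sketched under two assumptions that are not stated in the thread: every pattern in `r` has the same number of capture groups, and at least one group of the matching alternative always participates in the match. The position of the first non-nil capture, divided by the group count, then gives the index into `r`. The two short patterns below are stand-ins for the real seven-group ones, and `groups_per_pattern` is a hypothetical name.

```ruby
# Stand-ins for the real patterns; each has exactly two capture groups.
groups_per_pattern = 2
r  = ['(1)(2)', '(3)(4)']
re = Regexp.new(r.join('|'))

found = []
'1234'.scan(re) do
  m = $~
  # Groups of alternatives that did not match are nil, so the first
  # non-nil capture belongs to the alternative that did match.
  first = m.captures.index { |c| !c.nil? }
  idx   = first / groups_per_pattern
  # Keep the index into r together with that alternative's own captures.
  found << [idx, m.captures[idx * groups_per_pattern, groups_per_pattern]]
end

p found
```

With the seven-group patterns from the post, the group count would be 7, provided none of the interpolated fragments adds capture groups of its own.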

| Is there any reason /(1|3)(2|4)/ is not sufficient?

Hmm... well, with a good bit of refactoring I might be able to do it this way,
although some of my regexps use zero-width lookahead and I suspect they
might be a problem here.

Thanks,
T.


Mark Hubbart

11/11/2004 7:12:00 PM

On Fri, 12 Nov 2004 01:52:00 +0900, Yukihiro Matsumoto
<matz@ruby-lang.org> wrote:
> Hi,
>
> In message "Re: #scan with or'd (`|`) subexpressions."
> on Thu, 11 Nov 2004 23:29:58 +0900, "trans. (T. Onoma)" <transami@runbox.com> writes:
>
> |Does the new Ruby regexp engine do this?
> |
> | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
> | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
> | irb(main):002:0>
> |
> |Why would all the subexpressions be listed when there is an `|` (or) used? I
> |expected:
> |
> | => [["1", "2"], ["3", "4"]]
>
> You will never know which subexpression is matched, if you get your
> expected result. Is there any reason /(1|3)(2|4)/ is not sufficient?

This reminds me... when will Ruby support named subexpressions?
Oniguruma fully supports them now, but there doesn't appear to be a
way to access them from Ruby code.

thanks,
Mark
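For readers on later Ruby versions: named groups written as `(?<name>...)` became available from Ruby 1.9 onward, after this thread was written. A minimal sketch of how they could address the original question, assuming each alternative is given its own group names (`a1`, `a2`, `b1`, `b2` are made up for illustration): the names that carry a value identify the alternative that matched.

```ruby
# Each alternative gets its own (hypothetical) group names.
re = /(?<a1>1)(?<a2>2)|(?<b1>3)(?<b2>4)/

named = []
'1234'.scan(re) do
  # named_captures maps every group name to its capture or nil;
  # keep only the groups that actually took part in this match.
  named << $~.named_captures.reject { |_, value| value.nil? }
end

p named
```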