T. Onoma
11/11/2004 5:32:00 PM
Hi Matz,
On Thursday 11 November 2004 11:52 am, Yukihiro Matsumoto wrote:
| Hi,
|
| In message "Re: #scan with or'd (`|`) subexpressions."
|
| on Thu, 11 Nov 2004 23:29:58 +0900, "trans. (T. Onoma)"
| <transami@runbox.com> writes:
| |Does the new Ruby regexp engine do this?
| |
| | irb(main):001:0> '1234'.scan(/(1)(2)|(3)(4)/)
| | => [["1", "2", nil, nil], [nil, nil, "3", "4"]]
| | irb(main):002:0>
| |
| |Why would all the subexpressions be listed when there is an `|` (or) used?
| | I expected:
| |
| | => [["1", "2"], ["3", "4"]]
|
| You will never know which subexpression is matched, if you get your
| expected result.
Actually, trying to figure out which subexpression matched is _exactly_ my
problem. I have dozens of regexps of the form:
(#{spre})(#{start})(#{spost})(.*?)(#{epre})(#{end})(#{epost})
All of these are in an array (r) and strung together:
re = Regexp.new( r.join('|') )
Then
m = []
str.scan( re ) { m << $~ }
How do I know which array index (r[?]) produced the match? How does the
current behavior let me figure out which alternative matched?
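
For what it's worth, here is a minimal sketch of one way the nil padding could
answer that question, assuming every alternative has the same number of
capturing groups and none of the interpolated pieces adds groups of its own.
The two-group patterns and the name alternative_index are stand-ins I made up
for illustration, not my real patterns:

```ruby
# Hypothetical stand-ins: each entry of r has exactly `groups` captures.
groups = 2
r = ['(1)(2)', '(3)(4)']
re = Regexp.new(r.join('|'))

# Groups belonging to alternatives that did not match come back as nil,
# so the position of the first non-nil capture identifies the alternative.
alternative_index = lambda do |m|
  first = m.captures.index { |c| !c.nil? }
  first && first / groups
end

hits = []
'1234'.scan(re) { hits << alternative_index.call($~) }
# hits now holds, for each match, the index into r that produced it
```

A group that took part in the match but matched the empty string comes back
as "" rather than nil, so empty captures should not confuse this.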
| Is there any reason /(1|3)(2|4)/ is not sufficient?
Hmm... well, with a good bit of refactoring I might be able to do it this way,
although some of my regexps use zero-width lookahead, and I suspect those
might be a problem here.
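
One more thing to watch with the merged form, sketched here on the toy
patterns from this thread only: it gives the compact result on '1234', but it
also accepts cross-combinations that the or'd form rejects. The variable names
are just for illustration:

```ruby
alt    = /(1)(2)|(3)(4)/   # the or'd form from my original post
merged = /(1|3)(2|4)/      # the suggested merged form

same   = '1234'.scan(merged)  # compact pairs, no nil padding
cross  = '1432'.scan(merged)  # "14" and "32" match here as well
strict = '1432'.scan(alt)     # the or'd form finds nothing in this string
```

So the two are only interchangeable when the pieces can be mixed freely.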
Thanks,
T.