comp.lang.ruby

Re: regexp question - look for parentheses then remove them

Felix Windt

10/8/2007 12:54:00 PM

> -----Original Message-----
> From: Jesús Gabriel y Galán [mailto:jgabrielygalan@gmail.com]
> Sent: Monday, October 08, 2007 5:25 AM
> To: ruby-talk ML
> Subject: Re: regexp question - look for parentheses then remove them
>
> On 10/8/07, Max Williams <toastkid.williams@gmail.com> wrote:
> > I'm struggling with a regular expression problem, can anyone help?
> >
> > I want to take a string, look for anything in parentheses, and if i find
> > anything, put it into an array, minus the parentheses.
> >
> > currently i'm doing this:
> >
> > parentheses = /\(.*\)/
> > array = string.scan(parentheses)
> >
> > This gives me eg
> >
> > "3 * (1 + 2)" => ["(1 + 2)"]
> >
> > - but is there an easy way to strip the parentheses off before putting
> > it into the array?
> >
> > eg
> > "3 * (1 + 2)" => ["1 + 2"]
> >
> > In addition, if i have nested parentheses inside the outer parentheses,
> > i want to keep them, eg
> >
> > "3 * (1 + (4 / 2))" => ["1 + (4 / 2)"]
> >
> > can anyone show me how to do this?
>
> x = "3 * (1 + 2)".match(/\((.*)\)/)
> x.captures
> => ["1 + 2"]
> x = "3 * (2 + (1 + 3))".match(/\((.*)\)/)
> x.captures
> => ["2 + (1 + 3)"]
>
> Hope this helps,
>
> Jesus.
>

That can fail if you have more than one bracket pair on the lowest level:

irb(main):002:0> "3 * (2 + (1 + 3)) + (1 * 4)".match(/\((.*)\)/).to_a
=> ["(2 + (1 + 3)) + (1 * 4)", "2 + (1 + 3)) + (1 * 4"]
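
As a side note, switching to a lazy quantifier does not rescue the one-line regexp either. The sketch below is an illustration rather than part of the original posts, and the variable names are made up. It runs both variants on the same input:

```ruby
# Sketch only: compares greedy and lazy quantifiers on the same input.
input = "3 * (2 + (1 + 3)) + (1 * 4)"

greedy = input.scan(/\((.*)\)/)   # .* runs on to the last ")" in the string
lazy   = input.scan(/\((.*?)\)/)  # .*? stops at the first ")" it meets

p greedy # => [["2 + (1 + 3)) + (1 * 4"]]
p lazy   # => [["2 + (1 + 3"], ["1 * 4"]]
```

The greedy form swallows both top-level groups, and the lazy form cuts the nested group short, so neither produces ["2 + (1 + 3)", "1 * 4"].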

I'm not sure you can match more complex examples using a regular expression
- you may be able to pull something off with lookaheads, but I think it'd be
easier to just parse the string manually and count opened brackets:

$ cat para.rb
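def para(str)
  depth = 0      # number of parentheses currently open
  matches = []
  current = ""
  # each_char keeps whitespace inside the groups (needs Ruby 1.8.7 or later)
  str.each_char do |char|
    if char == ")"
      depth -= 1
      if depth == 0
        # closed an outermost group: store it and start a new one
        matches << current
        current = ""
      else
        current << char
      end
    elsif char == "("
      depth += 1
      # keep nested opening parens, drop the outermost one
      current << char if depth > 1
    elsif depth > 0
      current << char
    end
  end
  matches
end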


$ irb
irb(main):001:0> require 'para'
=> true
irb(main):002:0> para("1+2")
=> []
irb(main):003:0> para("(1+2)")
=> ["1+2"]
irb(main):004:0> para("(1+2)*3")
=> ["1+2"]
irb(main):005:0> para("((1+2)*3)")
=> ["(1+2)*3"]
irb(main):006:0> para("((1+2)*3)+(5*6)")
=> ["(1+2)*3", "5*6"]
irb(main):007:0> para("((1+2)*3)+(5*6*(1-3*(1-4)))")
=> ["(1+2)*3", "5*6*(1-3*(1-4))"]
irb(main):008:0>

There are probably far more elegant ways.

HTH,

Felix
3 Answers

Jesús Gabriel y Galán

10/8/2007 1:04:00 PM

On 10/8/07, Felix Windt <fwmailinglists@gmail.com> wrote:
> > From: Jesús Gabriel y Galán [mailto:jgabrielygalan@gmail.com]
> > On 10/8/07, Max Williams <toastkid.williams@gmail.com> wrote:
> > > "3 * (1 + 2)" => ["1 + 2"]
> > > "3 * (1 + (4 / 2))" => ["1 + (4 / 2)"]
> > >
> > > can anyone show me how to do this?
> >
> > x = "3 * (1 + 2)".match(/\((.*)\)/)
> > x.captures
> > => ["1 + 2"]
> > x = "3 * (2 + (1 + 3))".match(/\((.*)\)/)
> > x.captures
> > => ["2 + (1 + 3)"]

> That can fail if you have more than one bracket pair on the lowest level:
>
> irb(main):002:0> "3 * (2 + (1 + 3)) + (1 * 4)".match(/\((.*)\)/).to_a
> => ["(2 + (1 + 3)) + (1 * 4)", "2 + (1 + 3)) + (1 * 4"]

True, what would be the expected result for this?

["2 + (1 + 3)", "1 * 4"] ???

I agree that for complex cases a regexp is not the solution. A
solution like yours, counting parens (or using a stack), should be
the preferred way.
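
A minimal sketch of the stack variant, for comparison with the counter version. The method name `top_level_groups` is hypothetical and does not come from the thread. It also assumes that a stray ")" should be ignored and that an unclosed "(" yields no group:

```ruby
# Hypothetical helper: returns the contents of each outermost
# parenthesised group, keeping whitespace and nested parens intact.
def top_level_groups(str)
  stack = []    # indices of "(" characters that are still open
  groups = []
  str.each_char.with_index do |char, i|
    if char == "("
      stack.push(i)
    elsif char == ")"
      next if stack.empty?   # assumption: ignore a stray ")"
      start = stack.pop
      # an empty stack means an outermost group was just closed
      groups << str[(start + 1)...i] if stack.empty?
    end
  end
  groups
end

p top_level_groups("3 * (2 + (1 + 3)) + (1 * 4)")
# => ["2 + (1 + 3)", "1 * 4"]
```

Storing the index of each opening paren, instead of building the string character by character, lets the group be sliced out of the original string in one step.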

Cheers,

Jesus.

Jesús Gabriel y Galán

10/9/2007 7:11:00 AM

On 10/9/07, Michael Bevilacqua-Linn <michael.bevilacqualinn@gmail.com> wrote:
> [snip]
>
> ??
> >
> > I agree that for complex cases a regexp is not the solution. A
> > solution like yours counting parens (or with a stack) should be
> > preferred way.

> Yep, parsing arbitrarily nested parentheses is the classic example
> of something that can't be done with a regex. (Well, assuming you
> actually care about the nested parens.)

I've read that the .NET regex engine has some constructs to recognize
balanced constructs like parens:

http://puzzleware.net/blogs/archive/2005/08/...

Interesting !!

Jesus.

Wolfgang Nádasi-Donner

10/9/2007 7:31:00 AM

Jesús Gabriel y Galán wrote:
> I've read that the .NET regex engine has some constructs to recognize
> balanced constructs like parens...

It's possible in Ruby 1.9, or in Ruby 1.8 with the Oniguruma library, too:

module Matchelements
  # Builds a regexp source string that matches text with balanced brackets.
  def bal(lpar='(', rpar=')')
    raise RegexpError,
      "wrong length of left bracket '#{lpar}' in bal" unless lpar.length == 1
    raise RegexpError,
      "wrong length of right bracket '#{rpar}' in bal" unless rpar.length == 1
    raise RegexpError,
      "identical left and right bracket '#{lpar}' in bal" if lpar.eql?(rpar)
    # Escape characters that are special inside a character class.
    lclass, rclass = lpar, rpar
    lclass = '\\' + lclass if lclass.match(/[\-\[\]]/)
    rclass = '\\' + rclass if rclass.match(/[\-\[\]]/)
    # \g<bal> calls the named group recursively for each nested pair.
    return "(?<bal>" +
      "[^#{lclass}#{rclass}]*?" +
      "(?:\\#{lpar}\\g<bal>\\#{rpar}" +
      "[^#{lclass}#{rclass}]*?" +
      ")*?" +
      ")"
  end
end
include Matchelements

result = "3 * (2 + (1 + 3)) + (1 * 4)".scan(/\(#{bal()}\)/)

p result # => [["2 + (1 + 3)"], ["1 * 4"]]
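
For the default round brackets, the same idea can probably be written as a single regexp literal. This is a sketch, assuming Ruby 1.9 or later, where `\g<name>` subexpression calls are available. The constant name `BALANCED` is made up for illustration:

```ruby
# Sketch: \g<bal> re-enters the named group for every nested pair.
BALANCED = /\((?<bal>(?:[^()]|\(\g<bal>\))*)\)/

result = "3 * (2 + (1 + 3)) + (1 * 4)".scan(BALANCED)
p result # => [["2 + (1 + 3)"], ["1 * 4"]]
```

Because the pattern contains a named group, scan returns one single-element array per match, just as with the bal() helper.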

Wolfgang Nádasi-Donner