comp.lang.ruby

regex extension to handle matching parens?

ivowel

1/23/2009 10:33:00 PM

Dear Experts: I am very new to Ruby, having literally just read the
Ruby book.

I want to write a program that does basic LaTeX parsing, so I need to
match each closing '}' to its opening '{'. (Yes, I understand that LaTeX
has very messy syntax, so this will only work for certain LaTeX docs.)
Does a gem exist that makes closing-paren matching fairly painless?
For example,

sample = ' \caption{my table \label{table-label} example: $\sqrt{2+\sqrt{2}}$} more here {}'

I want my "\caption" matcher program to detect the closing brace and
give me everything between the opener and the closer (i.e., "my table
\label{table-label} example: $\sqrt{2+\sqrt{2}}$"). Possible?

I searched this mailing list first, but I only found discussions from
years back about this issue. I understand that this is not, strictly
speaking, a regular expression. I come from a Perl background, where
there are now regex extension libraries that let the built-in regex
engine match balanced parens (Regexp::Common::balanced and
Text::Balanced). I was hoping I could find a similar gem for Ruby.

Help appreciated.

Sincerely,

/iaw

5 Answers

Rob Biedenharn

1/24/2009 12:15:00 AM


I think that you need to look at what Oniguruma might be able to do.
http://oniguruma.ruby...

I believe I've seen it demonstrated that balanced open/close pairs can
be found with this regular expression engine. It might be ugly,
but then you probably expected that.

-Rob



Rob Biedenharn

1/24/2009 12:24:00 AM


Ah, specifically, the "Back reference with nest level" section of
http://oniguruma.ruby...oniguruma/files/Synta...
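For readers who want to try this on the \caption example, here is a minimal sketch. It assumes Ruby 1.9 or later, where the Oniguruma-derived engine is the built-in one, and it uses the engine's subexpression call syntax, \g<name>, rather than the nest-level back reference named above. The constant name CAPTION and the group name arg are made up for illustration.

```ruby
# Assumption: Ruby 1.9+ (Oniguruma-derived engine built in).
# \g<arg> calls the group named "arg" recursively, so each nested
# {...} inside the argument is matched as a balanced unit.
CAPTION = /\\caption(?<arg>\{(?:[^{}]|\g<arg>)*\})/

# Single quotes keep the backslashes literal.
sample = ' \caption{my table \label{table-label} example: $\sqrt{2+\sqrt{2}}$} more here {}'

if (m = CAPTION.match(sample))
  inner = m[:arg][1..-2]   # drop the outer braces
  puts inner
end
```

The pattern does not treat escaped braces such as \{ specially, so it only suits documents where every brace is a grouping brace.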




ivowel

1/24/2009 1:00:00 AM



Thank you, Rob. Great reference; now I know that it can be done.
Alas, this doc is a little over my head. Could someone who has used
this construct please show me how I would apply it to my simple
example?

sample = ' \caption{my table \label{table-label} example: $\sqrt{2+\sqrt{2}}$} more here {}'


Accomplishing this is actually not ugly at all in Perl:

use Regexp::Common;
my $matchingarg = qr/($RE{balanced}{-parens=>'{ }'})/;
/\\caption$matchingarg/;
print "The \\caption argument is $1\n";

Of course, Perl is ugly in many other respects, but here it does
nicely.

regards, /iaw


William James

1/24/2009 2:28:00 AM



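sample = " \\caption{my table \\label{table-label}
example: $\\sqrt{2+\\sqrt{2}}$} more here {}"


# Return the balanced span that starts at the first character of str.
# That character must be an opening fence: one of ( { [ <
def bal_fences str
  left = str[0,1]
  # Look up the opener together with its closer, e.g. "{}" for "{".
  fences = /[#{Regexp.escape "(){}[]<>"[ /#{Regexp.escape left}./ ]}]/
  accum = "" ; count = 0
  # /m lets . match newlines, so line breaks inside the span are kept.
  str.scan( /.*?#{fences}/m ){|s|
    count += if s[-1,1] == left ; 1 else -1 end
    accum << s
    break if 0 == count
  }
  accum
end


p bal_fences( sample[ /caption(.*)/m, 1 ] )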
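
The same counting idea can also be written as an explicit character loop that returns only the text between the braces. This is a sketch, not code from the thread: braced_argument is a hypothetical helper name, and it assumes every brace in the input is a grouping brace (escaped braces such as \{ are not handled).

```ruby
# Hypothetical helper, not from the thread: returns the text between the
# braces that directly follow `command`, or nil if the command is absent
# or its braces never balance.
def braced_argument(text, command)
  start = text.index(command + "{")
  return nil unless start
  open_at = start + command.length   # index of the opening brace
  depth = 0
  open_at.upto(text.length - 1) do |i|
    case text[i, 1]
    when "{" then depth += 1
    when "}"
      depth -= 1
      # Depth back at zero: this is the closer that matches open_at.
      return text[(open_at + 1)...i] if depth.zero?
    end
  end
  nil
end

sample = ' \caption{my table \label{table-label} example: $\sqrt{2+\sqrt{2}}$} more here {}'
puts braced_argument(sample, '\caption')
```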

Robert Klemme

1/24/2009 4:35:00 PM


On 24.01.2009 02:00, ivowel wrote:
> accomplishing this is actually not ugly at all in perl:
>
> use Regexp::Common;
> my $matchingarg = qr/($RE{balanced}{-parens=>'{ }'})/;
> /\\caption$matchingarg/;
> print "The \\caption argument is $1\n";
>
> of course, perl is ugly in many other respects, but here, it does
> nicely.

Ugliness often means bad maintainability... I'd probably use a
different approach which also works with simpler regular expressions:

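# untested
Node = Struct.new :parent, :children

input = 'a {b (c) d} e'  # sample text; substitute your own String

current = root = Node.new nil, []
# The capture group makes split return the brackets as tokens too.
tokens = input.split(/([\[\](){}])/)

tokens.each do |token|
  case token
  when '(', '{', '['
    # Opener: create a child node, link it into the tree, descend.
    node = Node.new current, []
    current.children << node
    current = node
  when ')', '}', ']'
    # Closer: climb back up; root has no parent to return to.
    current.parent or raise "unmatched #{token}"
    current = current.parent
  else
    current.children << token unless token.empty?
  end
end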

In other words: build a rudimentary context-free parser. It depends,
of course, on what you want to do.
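
A recursive variant of the same parser idea, sketched with StringScanner from Ruby's standard library. It is not code from the thread: parse_groups is a hypothetical helper name, and it handles only curly braces, returning one nested Array per {...} group.

```ruby
require 'strscan'

# Hypothetical helper: parse text into nested arrays, one Array per
# {...} group. Raises ArgumentError on unbalanced braces.
def parse_groups(scanner, depth = 0)
  items = []
  until scanner.eos?
    if scanner.scan(/\{/)
      # Opener: everything up to the matching closer becomes a sub-array.
      items << parse_groups(scanner, depth + 1)
    elsif scanner.scan(/\}/)
      raise ArgumentError, "unmatched }" if depth.zero?
      return items
    else
      # Plain text up to the next brace.
      items << scanner.scan(/[^{}]+/)
    end
  end
  raise ArgumentError, "unmatched {" unless depth.zero?
  items
end

p parse_groups(StringScanner.new('a{b{c}d}e{}'))
# prints ["a", ["b", ["c"], "d"], "e", []]
```

Recursion replaces the explicit parent links: the call stack remembers which group is currently open.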

Cheers

robert

--
remember.guy do |as, often| as.you_can - without end