comp.lang.ruby

Parser as an alternative to RegExen

S. Robert James

2/22/2007 2:12:00 AM

I'm parsing a large file, currently using compound regexen:

PREAMBLE = 'AA'
USERID = '\d{8}'
USER_HELLO = "#{PREAMBLE}(#{USERID})"

Is there a simple way to do this using a parser such as ANTLR? I've
never used one before, so if it requires a learning curve, I'll stick
to my regexen.

But if there is a cleaner way to do this, I'd certainly like to.

7 Answers

James Gray

2/22/2007 3:26:00 AM

On Feb 21, 2007, at 8:15 PM, S. Robert James wrote:

> I'm parsing a large file, currently using compound regexen:
>
> PREAMBLE = 'AA'
> USERID = '\d{8}'
> USER_HELLO = "#{PREAMBLE}(#{USERID})"
>
> Is there a simple way to do this using a parser such as ANTLR? I've
> never used one before, so if it requires a learning curve, I'll stick
> to my regexen.

I really don't think there's any value in going all the way to a
parser generator here. This job looks to be squarely in the Regexp
domain, so there's no reason to feel bad about using them.

James Edward Gray II

Logan Capaldo

2/22/2007 3:50:00 AM

On Thu, Feb 22, 2007 at 12:26:12PM +0900, James Edward Gray II wrote:
> On Feb 21, 2007, at 8:15 PM, S. Robert James wrote:
>
> >I'm parsing a large file, currently using compound regexen:
> >
> >PREAMBLE = 'AA'
> >USERID = '\d{8}'
> >USER_HELLO = "#{PREAMBLE}(#{USERID})"
> >
> >Is there a simple way to do this using a parser such as ANTLR? I've
> >never used one before, so if it requires a learning curve, I'll stick
> >to my regexen.
>
> I really don't think there's any value in going all the way to a
> parser generator here. This job looks to be squarely in the Regexp
> domain, so there's no reason to feel bad about using them.
>
Agreed.

OTOH, parsers sure are fun to write! (Especially recursive-descent ones
for simple grammars.)

If you do decide to go with a parser generator, check out Dhaka,
http://dhaka.ruby...

> James Edward Gray II

Robert Klemme

2/22/2007 7:43:00 AM

On 22.02.2007 04:26, James Edward Gray II wrote:
> On Feb 21, 2007, at 8:15 PM, S. Robert James wrote:
>
>> I'm parsing a large file, currently using compound regexen:
>>
>> PREAMBLE = 'AA'
>> USERID = '\d{8}'
>> USER_HELLO = "#{PREAMBLE}(#{USERID})"
>>
>> Is there a simple way to do this using a parser such as ANTLR? I've
>> never used one before, so if it requires a learning curve, I'll stick
>> to my regexen.
>
> I really don't think there's any value in going all the way to a parser
> generator here. This job looks to be squarely in the Regexp domain, so
> there's no reason to feel bad about using them.

Agreed. Also, in Ruby, Regexp objects can be combined into larger
expressions because Regexp#to_s retains each sub-expression's own
options:

irb(main):001:0> PREAMBLE = /AA/
=> /AA/
irb(main):002:0> USERID = /\d{8}/
=> /\d{8}/
irb(main):003:0> USER_HELLO = /#{PREAMBLE}(#{USERID})/
=> /(?-mix:AA)((?-mix:\d{8}))/

That way you can make sure that all sub-expressions are valid, and you
can mix options if you need to (for example, a case-insensitive
preamble).
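
A minimal sketch of that option mixing, assuming a case-insensitive preamble. The `/i` flag, the `\A`/`\z` anchors, and the `user_id` helper are illustrative assumptions, not part of the original post:

```ruby
# Each sub-pattern keeps its own options when interpolated into a larger Regexp.
PREAMBLE   = /aa/i     # case-insensitive preamble (illustrative choice)
USERID     = /\d{8}/   # exactly eight digits
USER_HELLO = /\A#{PREAMBLE}(#{USERID})\z/

# Hypothetical helper: returns the user id from a record,
# or nil when the record does not match.
def user_id(record)
  m = USER_HELLO.match(record)
  m && m[1]
end

p user_id("AA12345678")  # "12345678"
p user_id("aa12345678")  # "12345678" (only the preamble ignores case)
p user_id("AB12345678")  # nil
```

The interpolated sub-patterns become non-capturing groups, so the explicit parentheses around USERID are still capture group 1.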

Kind regards

robert

Ola Bini

2/22/2007 8:13:00 AM

S. Robert James wrote:
> I'm parsing a large file, currently using compound regexen:
>
> PREAMBLE = 'AA'
> USERID = '\d{8}'
> USER_HELLO = "#{PREAMBLE}(#{USERID})"
>
> Is there a simple way to do this using a parser such as ANTLR? I've
> never used one before, so if it requires a learning curve, I'll stick
> to my regexen.
>
> But if there is a cleaner way to do this, I'd certainly like to.
>


As other people have mentioned, there is no problem with using Regexps
for this. BUT, another approach which I find really nice is to use Ragel.
Ragel is a generator for finite state machines which recently got a
backend for Ruby (so far it's only in version control).

The regexps would look almost the same, but the speed would increase
greatly.

--
Ola Bini (http://ola-bini.bl...)
JvYAML, RbYAML, JRuby and Jatha contributor
System Developer, Karolinska Institutet (http:/...)
OLogix Consulting (http://www....)

"Yields falsehood when quined" yields falsehood when quined.


David Vallner

2/22/2007 10:34:00 AM

On Thu, 22 Feb 2007 03:15:09 +0100, S. Robert James
<srobertjames@gmail.com> wrote:

> I'm parsing a large file, currently using compound regexen:
>
> PREAMBLE = 'AA'
> USERID = '\d{8}'
> USER_HELLO = "#{PREAMBLE}(#{USERID})"
>
> Is there a simple way to do this using a parser such as ANTLR? I've
> never used one before, so if it requires a learning curve, I'll stick
> to my regexen.
>
> But if there is a cleaner way to do this, I'd certainly like to.
>
>

One instance where I'd be thinking of picking up parser-fu would be if
the data contains recursively nested structures of some sort. Either the
regexes, or the ancillary code juggling them gets hairy anyway, losing
you the simplicity, and you still have to work your way through the
nesting levels manually, which an AST parser would do for you.
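
To make the nesting point concrete, here is a rough sketch of a hand-written recursive-descent parser. The parenthesised group format, the `NestedParser` class, and its method names are all hypothetical and invented for illustration; the original post does not describe the real file format. Regexes still match the individual tokens, while recursion tracks the nesting levels:

```ruby
require 'strscan'

# Hypothetical nested format (an assumption, not from the original post):
#   group := '(' item* ')'
#   item  := group | hello
#   hello := 'AA' followed by eight digits
class NestedParser
  HELLO = /AA(\d{8})/

  def initialize(text)
    @s = StringScanner.new(text)
  end

  # Parses one top-level group and returns nested arrays of user ids.
  def parse
    tree = group
    skip_space
    raise ArgumentError, "trailing input at #{@s.pos}" unless @s.eos?
    tree
  end

  private

  def skip_space
    @s.skip(/\s+/)
  end

  def group
    skip_space
    @s.scan(/\(/) or raise ArgumentError, "expected '(' at #{@s.pos}"
    items = []
    loop do
      skip_space
      break if @s.scan(/\)/)
      raise ArgumentError, "unclosed group" if @s.eos?
      items << item
    end
    items
  end

  def item
    if @s.check(/\(/)
      group            # recursion handles arbitrary nesting depth
    elsif @s.scan(HELLO)
      @s[1]            # the captured eight-digit user id
    else
      raise ArgumentError, "unexpected input at #{@s.pos}"
    end
  end
end

p NestedParser.new("(AA12345678 (AA00000001 AA00000002) ())").parse
# => ["12345678", ["00000001", "00000002"], []]
```

For flat records like the ones in the original question, none of this is needed; it only starts to pay off once groups can contain other groups.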

David Vallner

El Castor

1/22/2013 8:57:00 PM

On Tue, 22 Jan 2013 07:36:55 -0500, Josh <user@verizon.net> wrote:

>On 1/22/2013 1:26 AM, El Castor wrote:
>> On Mon, 21 Jan 2013 14:25:53 -0800 (PST), Josh Rosenbluth
>> <jrosenbluth@comcast.net> wrote:
>>>
>>> Once again, you refuse to address the argument and instead change the
>>> subject.
>>
>> The fact that the government misappropriated the SS Trust Fund does
>> not turn FICA taxes into income taxes. Argue that it does, but I
>> vehemently disagree.
>
>At least, we have identified the precise area of disagreement.
>
>My argument: 1) FICA taxes financed non-FICA spending, 2) assume the
>Trust Fund is a myth and thus the money owed to it will not be paid back
>(instead, FICA taxes will be raised or SS benefits cut), and thus 3)
>FICA taxes are treated the same as income taxes.
>
>Your counter argument: ???
>
>> If you wish instead to argue that the $2.7 trillion trust fund is a
>> solemn obligation of the federal government, that would seem like a
>> fair argument to make, but then we are forced to confront the
>> political reality of where does the $2.7 trillion come from? And just
>> as important, what should we do to insure that another trillion will
>> not be misappropriated?
>
>I do acknowledge that if we pay back the Trust Fund (i.e., the Trust
>Fund is real - with the money coming from a combination of
>publicly-held debt, increased non-FICA taxes, and reduced non-FICA
>spending), then your 47% statement holds.
>
>But recall, I said your two statements (Trust Fund is a myth and 47% pay
>no income taxes) can't both be true. In this case, the second statement
>is true, the first is false. Up above at the top of this post, the
>first is true and the second is false.

The trust fund is a myth and the bottom 46.4% pay no federal income
tax. Believe what you wish.

Josh Rosenbluth

1/22/2013 9:55:00 PM

On Jan 22, 3:56 pm, El Castor <DrE...@justuschickens.com> wrote:
> On Tue, 22 Jan 2013 07:36:55 -0500, Josh <u...@verizon.net> wrote:
> >On 1/22/2013 1:26 AM, El Castor wrote:
> >> On Mon, 21 Jan 2013 14:25:53 -0800 (PST), Josh Rosenbluth
> >> <jrosenbl...@comcast.net> wrote:
>
> >>> Once again, you refuse to address the argument and instead change the
> >>> subject.
>
> >> The fact that the government misappropriated the SS Trust Fund does
> >> not turn FICA taxes into income taxes. Argue that it does, but I
> >> vehemently disagree.
>
> >At least, we have identified the precise area of disagreement.
>
> >My argument:  1) FICA taxes financed non-FICA spending, 2) assume the
> >Trust Fund is a myth and thus the money owed to it will not be paid back
> >(instead, FICA taxes will be raised or SS benefits cut), and thus 3)
> >FICA taxes are treated the same as income taxes.
>
> >Your counter argument: ???

???

> >> If you wish instead to argue that the $2.7 trillion trust fund is a
> >> solemn obligation of the federal government, that would seem like a
> >> fair argument to make, but then we are forced to confront the
> >> political reality of where does the $2.7 trillion come from? And just
> >> as important, what should we do to insure that another trillion will
> >> not be misappropriated?
>
> >I do acknowledge that if we pay back the Trust Fund (i.e., the Trust
> >Fund is real - with the money coming from a combination of
> >publicly-held debt, increased non-FICA taxes, and reduced non-FICA
> >spending), then your 47% statement holds.
>
> >But recall, I said your two statements (Trust Fund is a myth and 47% pay
> >no income taxes) can't both be true.  In this case, the second statement
> >is true, the first is false.  Up above at the top of this post, the
> >first is true and the second is false.
>
> The trust fund is a myth and the bottom 46.4% pay no federal income
> tax. Believe what you wish.

That's a conclusion, not an argument.