comp.lang.ruby

DRY fanatics?

Giles Bowkett

10/22/2006 1:03:00 AM

Anybody know a way to make this DRYer?

when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/

a literal regex with a subpattern repeated three times

I could probably split on the ', but it seems that might have unwanted
side effects.
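
To make the concern about splitting concrete, here is a minimal sketch. The sample line and the helper name `split_on_quote` are assumptions for illustration, not taken from the real data. It shows that splitting on the quote character leaves the separators in the result and performs no validation of the line's shape.

```ruby
# Hypothetical sample line in the shape the regex above expects.
LINE = "a, 'b', 'c', 'd'"

# Hypothetical helper: split a line on the single-quote character.
# The pieces between fields (", ") come back as array elements too,
# and a malformed line is split just as happily as a valid one.
def split_on_quote(line)
  line.split("'")
end

p split_on_quote(LINE)
# => ["a, ", "b", ", ", "c", ", ", "d"]
```

The fields would still need to be picked out of that array and the leading element trimmed, which is the kind of side effect the regex avoids.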

--
Giles Bowkett
http://www.gilesg...

14 Answers

Joel VanderWerf

10/22/2006 1:55:00 AM

Giles Bowkett wrote:
> Anybody know a way to make this DRYer?
>
> when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
> a literal regex with a subpattern repeated three times
>
> I could probably split on the ', but it seems that might have unwanted
> side effects.

This doesn't help much, unless the 3 comes from a variable, but ...

case "a, 'b', 'c', 'd'"
when /^([A-Za-z0-9,]+)((?:, '[^']+'){3})/
  p $1, $2.scan(/'([^']+)'/)
# when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
#   p $1, $2, $3, $4
end
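
For the case where the 3 does come from a variable, the same idea might be sketched as follows. The helper name `parse_fields` and its `count` parameter are hypothetical; the pattern is the one above with the repeat count interpolated and the start anchored with `\A`.

```ruby
# Hypothetical helper: +count+ is the number of quoted fields expected.
# Returns [prefix, [field, ...]] or nil when the line does not match.
def parse_fields(line, count)
  # The repeat count is interpolated into the quantifier: {3}, {4}, ...
  pattern = /\A([A-Za-z0-9,]+)((?:, '[^']+'){#{count}})/
  m = pattern.match(line)
  return nil unless m
  # Second pass: pull the individual fields out of the repeated part.
  [m[1], m[2].scan(/'([^']+)'/).flatten]
end

p parse_fields("a, 'b', 'c', 'd'", 3)
# => ["a", ["b", "c", "d"]]
p parse_fields("a, 'b', 'c', 'd'", 4)
# => nil
```

The second `scan` pass is needed because the outer group captures the whole repeated run as one string.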


--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Eero Saynatkari

10/22/2006 2:01:00 AM

On 2006.10.22 10:02, Giles Bowkett wrote:
> Anybody know a way to make this DRYer?
>
> when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/

\1, \2 etc. may be used to refer to previous numbered groups.

I wish the same could be applied to the egregious overuse of
the term DRY by the Rails club.

Ken Bloom

10/22/2006 2:18:00 AM

On Sun, 22 Oct 2006 10:02:37 +0900, Giles Bowkett wrote:

> Anybody know a way to make this DRYer?
>
> when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
> a literal regex with a subpattern repeated three times
>
> I could probably split on the ', but it seems that might have unwanted
> side effects.
>

That's fine. I see no reason to make it more obfuscated. A couple tips
though:

* Use .+? instead of [^']+
.+? does a non-greedy match, which is what you're really trying to say
with the [^']+

* If you want to match the *same* text three times, for example
"a, '1', '1', '1'" but not "a, '1', '2', '3'", then you should use a
backsubstitution in the match, using \2 twice, instead of the second
two groups, so the pattern becomes
/^([A-Za-z0-9,]+), '([^']+)', '\2', '\2'/
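
A small sketch of how that backreference pattern behaves on the two example strings. The constant `SAME` and the helper `same_three?` are names made up for illustration.

```ruby
# The backreference pattern from above: \2 must repeat the exact text
# that group 2 captured, not merely something of the same shape.
SAME = /^([A-Za-z0-9,]+), '([^']+)', '\2', '\2'/

# Hypothetical helper: true when all three quoted fields are identical.
def same_three?(line)
  !!(SAME =~ line)
end

p same_three?("a, '1', '1', '1'")  # => true
p same_three?("a, '1', '2', '3'")  # => false
```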

--Ken

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu...
I've added a signing subkey to my GPG key. Please update your keyring.

James Britt

10/22/2006 3:16:00 AM

Eero Saynatkari wrote:
> On 2006.10.22 10:02, Giles Bowkett wrote:
>
>>Anybody know a way to make this DRYer?
>>
>> when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
>
> \1, \2 etc. may be used to refer to previous numbered groups.
>
> I wish the same could be applied to the egregious overuse of
> the term DRY by the Rails club.

Indeed. Irony at its best.




--
James Britt

"Trying to port the desktop metaphor to the Web is like working
on how to fuel your car with hay because that is what horses eat."
- Dare Obasanjo

Robert Klemme

10/22/2006 1:12:00 PM

Eero Saynatkari wrote:
> On 2006.10.22 10:02, Giles Bowkett wrote:
>> Anybody know a way to make this DRYer?
>>
>> when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
> \1, \2 etc. may be used to refer to previous numbered groups.

Which does not help in this case because that makes it match the same
stuff again:

irb(main):031:0> %w{aaa abc bcd}.map {|s| /(\w)\1\1/ =~ s}
=> [0, nil, nil]

Other suggestions

when /^([A-Za-z0-9,]+)((?:, '([^']+)'){3})/
# and then apply a second match to group 2

when /^([A-Za-z0-9,]+)#{", '([^']+)'" * 3}/o

irb(main):032:0> /^([A-Za-z0-9,]+)#{", '([^']+)'" * 3}/o
=> /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
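
The difference between the two suggestions can be sketched like this. The names `FIELD`, `build_pattern`, `captures_built` and `captures_quantified` are hypothetical. Building the pattern by string repetition yields one capture group per field, while a capture group inside a quantified group keeps only its last iteration, which is why the first suggestion needs a second match.

```ruby
# One quoted field, as a string fragment to be repeated.
FIELD = ", '([^']+)'"

# Hypothetical helper: build the full pattern for +count+ fields.
def build_pattern(count)
  Regexp.new("^([A-Za-z0-9,]+)" + FIELD * count)
end

# Captures from the pattern built by repetition: one group per field.
def captures_built(line, count)
  m = build_pattern(count).match(line)
  m && m.captures
end

# Captures from a quantified group: the inner group is overwritten on
# each iteration, so only the last field survives.
def captures_quantified(line)
  m = /^([A-Za-z0-9,]+)(?:, '([^']+)'){3}/.match(line)
  m && m.captures
end

p captures_built("a, 'b', 'c', 'd'", 3)   # => ["a", "b", "c", "d"]
p captures_quantified("a, 'b', 'c', 'd'") # => ["a", "d"]
```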

Kind regards

robert

Eero Saynatkari

10/22/2006 11:33:00 PM

On 2006.10.22 11:00, Eero Saynatkari wrote:
> On 2006.10.22 10:02, Giles Bowkett wrote:
> > Anybody know a way to make this DRYer?
> >
> > when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
> \1, \2 etc. may be used to refer to previous numbered groups.
>
> I wish the same could be applied to the egregious overuse of
> the term DRY by the Rails club.

Giles, if the above came across snide--it was meant to. I do,
however, apologise that you had to bear the mighty brunt of
it. I have grown somewhat tired of the sheer number of DRY
this and DRY that I have seen recently and you are most likely
not behind all of it at least :)

Kev Jackson

10/23/2006 2:36:00 AM

> > I wish the same could be applied to the egregious overuse of
> > the term DRY by the Rails club.

I'm actually grateful that the Rails people have pushed the idea of
Don't Repeat Yourself. It was one of those "good idea, wish I'd
thought of it" moments that I had when starting with Ruby last year.

Not sure about the acronym, but the concept (although about as
basic as it gets) is worthwhile.

Kev

Jacob Fugal

10/23/2006 3:05:00 PM

On 10/21/06, Ken Bloom <kbloom@gmail.com> wrote:
> On Sun, 22 Oct 2006 10:02:37 +0900, Giles Bowkett wrote:
> > when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
>
> * Use .+? instead of [^']+
> .+? does a non-greedy match, which is what you're really trying to say
> with the [^']+

Actually, no, the way he had it is better. Check this out:

http://perlmonks.org/?node=Death%20to%20...!

Using /.+?/ (which is really equivalent to /..*?/ and thus in the same
camp as the article) can be incorrect and also slower. In this case --
I think -- the '.+?' would yield correct results since there's a one
character terminator, but the speed is still an issue. For a simple
string and pair of regexes, it's not much:

$ cat regex-bm.rb
require 'benchmark'

TIMES = 10_000_000
REGEX1 = /'([^']+)'/
REGEX2 = /'(.+?)'/
STRING = "'Woah,' John said, 'there're multiple quotes!'"

Benchmark.bmbm do |x|
  x.report("[^']+") { TIMES.times { STRING =~ REGEX1 } }
  x.report(".+?")   { TIMES.times { STRING =~ REGEX2 } }
end

$ ruby -v regex-bm.rb
ruby 1.8.4 (2005-12-24) [powerpc-darwin7.9.0]
Rehearsal -----------------------------------------
[^']+ 24.130000 0.050000 24.180000 ( 24.261712)
.+? 25.490000 0.040000 25.530000 ( 25.799146)
------------------------------- total: 49.710000sec

user system total real
[^']+ 24.160000 0.070000 24.230000 ( 24.729463)
.+? 25.410000 0.060000 25.470000 ( 25.572987)

But it is present. And it gets worse as both the regex being used and
the string being matched get more complex. For a simple case like
Giles's, I wouldn't worry too much about the performance difference
between .+? and [^']+, and the correctness is fine. So you can use .+?
-- it *is* more readable. But it also hides subtleties, which raises
flags for me.
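
One such subtlety can be sketched as follows. This assumes the pattern is anchored at both ends with `\A` and `\z`, which the original regex is not, and the four-field input line is made up. Under those assumptions the negated class rejects the line, while the lazy dot stretches across a quote to force a match.

```ruby
# Both patterns are the original with end anchors added (an assumption).
STRICT = /\A([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'\z/
LAZY   = /\A([A-Za-z0-9,]+), '(.+?)', '(.+?)', '(.+?)'\z/

# Hypothetical helper: captures array, or nil when there is no match.
def captures(pattern, line)
  m = pattern.match(line)
  m && m.captures
end

# Hypothetical input with four quoted fields instead of three.
FOUR = "x, 'a', 'b', 'c', 'd'"

p captures(STRICT, FOUR)  # => nil
p captures(LAZY, FOUR)    # => ["x", "a", "b", "c', 'd"]
```

Lazy only means "try the shortest first". When the rest of the pattern fails, `.+?` keeps growing, and nothing stops it from swallowing a quote.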

Jacob Fugal

Giles Bowkett

10/23/2006 5:49:00 PM

On 10/21/06, Ken Bloom <kbloom@gmail.com> wrote:
> On Sun, 22 Oct 2006 10:02:37 +0900, Giles Bowkett wrote:
>
> > Anybody know a way to make this DRYer?
> >
> > when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
> >
> > a literal regex with a subpattern repeated three times
> >
> > I could probably split on the ', but it seems that might have unwanted
> > side effects.
> >
>
> That's fine. I see no reason to make it more obfuscated.

Well, see, I used to be a Perl guy. I'm used to thinking of
obfuscation as its own reward.

> A couple tips though:
>
> * Use .+? instead of [^']+
> .+? does a non-greedy match, which is what you're really trying to say
> with the [^']+

Cheers. You're absolutely right there. I knew there was a way to do
non-greedy matching but couldn't recall it.

> * If you want to match the *same* text three times, for example
> "a, '1', '1', '1'" but not "a, '1', '2', '3'", then you should use a
> backsubstitution in the match, using \2 twice, instead of the second
> two groups, so the pattern becomes
> /^([A-Za-z0-9,]+), '([^']+)', '\2', '\2'/

Ah, OK. That is a crucial difference. The data was inside the same pair
of quotes each time, but it was different data each time, so \2
wouldn't actually have worked for me in this case.

--
Giles Bowkett
http://www.gilesg...

Giles Bowkett

10/23/2006 5:54:00 PM

> > > Anybody know a way to make this DRYer?
> > >
> > > when /^([A-Za-z0-9,]+), '([^']+)', '([^']+)', '([^']+)'/
> >
> > \1, \2 etc. may be used to refer to previous numbered groups.
> >
> > I wish the same could be applied to the egregious overuse of
> > the term DRY by the Rails club.
>
> Giles, if the above came across snide--it was meant to.

well DUH

> I do,
> however, apologise that you had to bear the mighty brunt of
> it. I have grown somewhat tired of the sheer number of DRY
> this and DRY that I have seen recently and you are most likely
> not behind all of it at least :)

OK, first off, you don't need to apologize to me for making fun of the
Rails club. I got kicked out of that club for being intolerant of
script kiddies. As much as I admire Rails for its brilliance and
elegance, you can go ahead and say anything you want about that
community for all I care.

Second, no offense, but although I did bear the brunt, I don't think
the brunt was as mighty as all that. It wasn't a runt of a brunt but
it wasn't mighty either. I'll live.

Third! There was a bit of tongue-in-cheek going on there. I mean you'd
have to be insane, or a Perl coder, to think that a straightforward
regex could genuinely **benefit** from being compacted into something
terser merely for the sake of compacting things. I really just wanted
to see if it was possible at all.

--
Giles Bowkett
http://www.gilesg...