comp.lang.ruby

string mangling

Martin Pirker

12/14/2005 12:16:00 PM

Imagine an input string
aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...

I have regexp for the parts a,b,c
and e can be considered as else.

So how can I efficiently search/step through the string from left to
right, while calling for each section the fitting handler, kind of

case section
/aaaa/ ...

/bbb/ ...

/cccc/ ..

else

end


Thanks for ideas!
Martin
6 Answers

Robert Klemme

12/14/2005 1:09:00 PM

Martin Pirker wrote:
> Imagine an input string
> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>
> I have regexp for the parts a,b,c
> and e can be considered as else.
>
> So how can I efficiently search/step through the string from left to
> right, while calling for each section the fitting handler, kind of
>
> case section
> /aaaa/ ...
>
> /bbb/ ...
>
> /cccc/ ..
>
> else
>
> end

You are pretty close:

>> s='aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee'
=> "aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee"
>> s.scan /a+|b+|c+/ do |m|
?> case m
>> when /a+/
>> puts "A"
>> when /b+/
>> puts "B"
>> when /c+/
>> puts "C"
>> end
>> end
A
B
C
B
A
B
A
C
=> "aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee"

Or, if you want to avoid a second match:

>> s.scan /(a+)|(b+)|(c+)/ do |m|
?> case ( m.inject(0) {|i,e| break i if e; i + 1} )
>> when 0
>> puts "A"
>> when 1
>> puts "B"
>> when 2
>> puts "C"
>> end
>> end
A
B
C
B
A
B
A
C
=> "aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee"
>>

Kind regards

robert

Alex Fenton

12/14/2005 1:09:00 PM

Martin Pirker wrote:
> Imagine an input string
> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>
> I have regexp for the parts a,b,c
> and e can be considered as else.
>
> So how can I efficiently search/step through the string from left to
> right, while calling for each section the fitting handler, kind of

You could use String#scan to find the sections that match any of your patterns, then check which pattern matched. (Your patterns could be more complicated, as long as they remain distinguishable from one another.)

str = 'aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee'

a_rx = /a+/
b_rx = /b+/
c_rx = /c+/

str.scan(/(?:#{a_rx}|#{b_rx}|#{c_rx})/) do |part|
  case part
  when a_rx
    # ...
  when b_rx
    # ...
  when c_rx
    # ...
  end
end
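
The same idea can be sketched with Regexp.union building the combined pattern, and a small table pairing each pattern with its handler. The names section_handlers and dispatch_sections and the label strings are made up for illustration; they are not from the post above. Note that this still matches each section a second time to pick the handler.

```ruby
# Hypothetical handler table: each pattern is paired with the code to run for it.
def section_handlers
  [
    [/a+/, lambda { |part| "A(#{part.length})" }],
    [/b+/, lambda { |part| "B(#{part.length})" }],
    [/c+/, lambda { |part| "C(#{part.length})" }]
  ]
end

def dispatch_sections(str)
  table = section_handlers
  # Regexp.union joins the patterns with "|" without adding capture groups,
  # so scan yields plain strings.
  combined = Regexp.union(*table.map { |rx, _handler| rx })
  str.scan(combined).map do |part|
    _rx, handler = table.find { |rx, _handler| rx =~ part }
    handler.call(part)
  end
end

p dispatch_sections('aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee')
```

Sections that match none of the patterns (the "e" runs) are skipped here, exactly as in the scan call above.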

Martin Pirker

12/14/2005 1:18:00 PM

Robert Klemme <bob.news@gmx.net> wrote:
>> Imagine an input string
>> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>>
>> I have regexp for the parts a,b,c
>> and e can be considered as else.
>>
>> So how can I efficiently search/step through the string from left to
>> right, while calling for each section the fitting handler, kind of
>>
>> case section
>> /aaaa/ ...
>>
>> /bbb/ ...
>>
>> /cccc/ ..
>>
>> else
>>
>> end
[...]
> Or, if you want to avoid a second match:

of course I want :-)

>>> s.scan /(a+)|(b+)|(c+)/ do |m|
> ?> case ( m.inject(0) {|i,e| break i if e; i + 1} )
>>> when 0
>>> puts "A"
>>> when 1
>>> puts "B"
>>> when 2
>>> puts "C"
>>> end
>>> end
> A
> B
> C
> B
> A
> B
> A
> C
> => "aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee"

....but this doesn't allow a processing step in the "else" case?

Martin

James Gray

12/14/2005 1:39:00 PM

On Dec 14, 2005, at 6:32 AM, Simon Strandgaard wrote:

> On 12/14/05, Martin Pirker <crf@sbox.tu-graz.ac.at> wrote:
>> Imagine an input string
>> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>>
>> I have regexp for the parts a,b,c
>> and e can be considered as else.
>>
>> So how can I efficiently search/step through the string from left to
>> right, while calling for each section the fitting handler, kind of
>
> irb(main):001:0> s = 'aaaaabbccccceeebbbbbbbbbbeaaabaacccceee'
> => "aaaaabbccccceeebbbbbbbbbbeaaabaacccceee"
> irb(main):002:0> s.scan(/(a+)|(b+)|(c+)|([^abc]+)/)
> => [["aaaaa", nil, nil, nil], [nil, "bb", nil, nil], [nil, nil,
> "ccccc", nil], [nil, nil, nil, "eee"], [nil, "bbbbbbbbbb", nil, nil],
> [nil, nil, nil, "e"], ["aaa", nil, nil, nil], [nil, "b", nil, nil],
> ["aa", nil, nil, nil], [nil, nil, "cccc", nil], [nil, nil, nil,
> "eee"]]
> irb(main):003:0>

My similar thought:

>> str = 'aaaaabbccccceeebbbbbbbbbbeaaabaacccceee'
=> "aaaaabbccccceeebbbbbbbbbbeaaabaacccceee"
>> str.scan(/((\w)\2*)/).map { |chunk| chunk.first }
=> ["aaaaa", "bb", "ccccc", "eee", "bbbbbbbbbb", "e", "aaa", "b",
"aa", "cccc", "eee"]

James Edward Gray II
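
Building on the four-group scan quoted above, here is a minimal sketch of how the captures could drive the handlers, including the "else" case, without matching each section a second time. The method name label_sections and the label strings are hypothetical, chosen only for this example.

```ruby
def label_sections(str)
  labels = []
  # With four alternatives, exactly one capture is non-nil for each match,
  # so the block parameters tell us which alternative matched.
  str.scan(/(a+)|(b+)|(c+)|([^abc]+)/) do |a, b, c, other|
    label =
      if a then "A:#{a}"
      elsif b then "B:#{b}"
      elsif c then "C:#{c}"
      else "else:#{other}"
      end
    labels << label
  end
  labels
end

p label_sections('aabeec')
```

Because the fourth alternative consumes everything that is not a, b or c, the scan covers the whole string from left to right and nothing is silently skipped.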



Eero Saynatkari

12/14/2005 7:40:00 PM

Martin Pirker wrote:
> Imagine an input string
> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>
> I have regexp for the parts a,b,c
> and e can be considered as else.
>
> So how can I efficiently search/step through the string from left to
> right, while calling for each section the fitting handler, kind of
>
> case section
> /aaaa/ ...
>
> /bbb/ ...
>
> /cccc/ ..
>
> else
>
> end

Aside from the oft-mentioned String#scan, you might look into
using StringScanner (require 'strscan') from the stdlib. It is
very good for more complex cases of scanning. Documentation is
available, for example, at http://www.ruby-doc.....
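
As a rough sketch of the StringScanner approach, assuming the same a/b/c sections as in the question: the scanner tries each pattern at the current position and falls back to an "else" branch that consumes everything up to the next a, b or c. The method name tokenize and the symbol tags are illustrative only.

```ruby
require 'strscan'

def tokenize(str)
  scanner = StringScanner.new(str)
  tokens = []
  until scanner.eos?
    # StringScanner#scan only matches at the current position and
    # returns nil (without advancing) when the pattern does not fit.
    if (text = scanner.scan(/a+/))
      tokens << [:a, text]
    elsif (text = scanner.scan(/b+/))
      tokens << [:b, text]
    elsif (text = scanner.scan(/c+/))
      tokens << [:c, text]
    else
      # "else" case: the current character is not a, b or c,
      # so this always consumes at least one character.
      tokens << [:else, scanner.scan(/[^abc]+/)]
    end
  end
  tokens
end

p tokenize('aabeec')
```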

> Thanks for ideas!
> Martin


E

--
Posted via http://www.ruby-....


Robert Klemme

12/15/2005 10:03:00 AM

Martin Pirker wrote:
> Robert Klemme <bob.news@gmx.net> wrote:
>>> Imagine an input string
>>> aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee...
>>>
>>> I have regexp for the parts a,b,c
>>> and e can be considered as else.
>>>
>>> So how can I efficiently search/step through the string from left to
>>> right, while calling for each section the fitting handler, kind of
>>>
>>> case section
>>> /aaaa/ ...
>>>
>>> /bbb/ ...
>>>
>>> /cccc/ ..
>>>
>>> else
>>>
>>> end
> [...]
>> Or, if you want to avoid a second match:
>
> of course I want :-)
>
>>>> s.scan /(a+)|(b+)|(c+)/ do |m|
>>> case ( m.inject(0) {|i,e| break i if e; i + 1} )
>>>> when 0
>>>> puts "A"
>>>> when 1
>>>> puts "B"
>>>> when 2
>>>> puts "C"
>>>> end
>>>> end
>> A
>> B
>> C
>> B
>> A
>> B
>> A
>> C
>> => "aaaaabbccccceeebbbbbbbbbbeaaabaacccceeee"
>
> ...but this doesn't allow a processing step in the "else" case?

You can have an else clause - but it will never be called. I guess you
will process entries between matches. In that case scan won't help - at
least not as used in my example.

A simple option would be to use #split with a group around the whole
regexp and then operate on the array of strings you get. Whether that's
feasible (volume?) in your case I cannot decide.

s.split(/((?:a+)|(?:b+)|(?:c+))/).each do |m|
  next if m.empty?  # split yields "" between adjacent matches
  case m
  when /a+/
    # ...
  when /b+/
    # ...
  when /c+/
    # ...
  else
    # text between matches
  end
end
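
One detail worth spelling out for the #split approach: when the separator regexp contains a capture group, the matched sections are kept in the result, and adjacent matches leave empty strings between them. The sketch below filters those out before dispatching; split_sections and the symbol tags are hypothetical names for this example.

```ruby
def split_sections(str)
  # The capture group makes split return the separators (the a/b/c runs)
  # along with the text between them.
  parts = str.split(/(a+|b+|c+)/).reject { |part| part.empty? }
  parts.map do |part|
    case part
    when /\Aa+\z/ then [:a, part]
    when /\Ab+\z/ then [:b, part]
    when /\Ac+\z/ then [:c, part]
    else [:else, part]
    end
  end
end

p 'aabeec'.split(/(a+|b+|c+)/)
p split_sections('aabeec')
```

The whole array is built before any handler runs, which is the volume concern mentioned above.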


Kind regards

robert