comp.lang.ruby

regular expressions question

konsu

12/14/2005 9:01:00 PM

hello,

i need to capture all matches for a group. for example if

'ab c' =~ /^(.)*$/

i would like to get array [ 'a', 'b', ' ', 'c' ]

could not figure out how to do it in ruby. String#scan did not seem to
be the right thing. please help.

thanks
konstantin

65 Answers

James Gray

12/14/2005 9:15:00 PM

On Dec 14, 2005, at 3:02 PM, ako... wrote:

> hello,
>
> i need to capture all matches for a group. for example if
>
> 'ab c' =~ /^(.)*$/
>
> i would like to get array [ 'a', 'b', ' ', 'c' ]
>
> could not figure out how to do it in ruby. String#scan did not seem to
> be the right thing. please help.

When using scan(), you need to remove the anchoring:

>> "ab c".scan(/./)
=> ["a", "b", " ", "c"]
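For comparison, a small sketch (an added illustration with made-up variable names) of why the anchored pattern falls short. As far as I know, Ruby's regexp engine keeps only the last iteration of a repeated group, while scan collects every match:

```ruby
# A repeated capturing group retains only its final iteration.
m = /^(.)*$/.match("ab c")
last_only = m[1]              # just the last character that (.) matched

# scan walks the string and returns every match of the pattern.
all_chars = "ab c".scan(/./)
```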

Hope that helps.

James Edward Gray II


Ross Bamford

12/14/2005 9:21:00 PM

On Wed, 14 Dec 2005 21:00:56 -0000, ako... <akonsu@gmail.com> wrote:

> i need to capture all matches for a group. for example if
>
> 'ab c' =~ /^(.)*$/
>
> i would like to get array [ 'a', 'b', ' ', 'c' ]
>

You could try:

irb(main):001:0> "ab c".split('') # split on nothing
=> ["a", "b", " ", "c"]

irb(main):002:0> "ab c".split(//) # same again
=> ["a", "b", " ", "c"]

irb(main):003:0> "ab c".scan(/./) # scan on any single char
=> ["a", "b", " ", "c"]


--
Ross Bamford - rosco@roscopeco.remove.co.uk
"\e[1;31mL"

konsu

12/14/2005 9:35:00 PM

thank you. this was just an example. in general, is it possible to get
a collection of captures for a group without having to write custom
code?

Ross Bamford

12/14/2005 9:58:00 PM

On Wed, 14 Dec 2005 21:34:52 -0000, ako... <akonsu@gmail.com> wrote:

> thank you. this was just an example. in general, is it possible to get
> a collection of captures for a group without having to write custom
> code?
>

Have to admit I'm not exactly a regex wiz, but I imagine it can be done
somehow. I assume you mean having a repeated capturing group append to an
array any number of times?

But I still think scan is a good tool for the job; it works with any regexp.
I don't think a single regexp is really intended for capturing a variable
number of groups anyway (?).

irb(main):054:0> "ab c".scan(/\w|\s/)
=> ["a", "b", " ", "c"]

or

irb(main):052:0> "this is a test".scan(/\w+/)
=> ["this", "is", "a", "test"]

or even

irb(main):053:0> "this is a test".scan(/\w+|\s/)
=> ["this", " ", "is", " ", "a", " ", "test"]
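When the pattern itself contains groups, scan returns one sub-array of captures per match, which gets close to a "collection of captures". A minimal sketch; the input string and variable names are invented for the example:

```ruby
# With groups in the pattern, scan yields [group1, group2] for each match.
pairs = "x=1, y=2, z=3".scan(/(\w)=(\d)/)

# Pull each group's captures out into its own array.
names  = pairs.map { |name, _digit| name }
digits = pairs.map { |_name, digit| digit }
```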

Cheers,
Ross

--
Ross Bamford - rosco@roscopeco.remove.co.uk
"\e[1;31mL"

konsu

12/14/2005 9:59:00 PM

thank you. the question is general.

if i wanted to parse a list of letters separated by spaces and commas:

'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/

i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
can split, then massage the result some more and get the final result.
is there a way to get to groups' captures after a regex match? like in
microsoft's .net?

Ross Bamford

12/14/2005 11:51:00 PM

On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:

> thank you. the question is general.
>
> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?
>

I don't really get what you mean. I don't understand the rules that got a
and b into one group and c into another. When you say it's a general
question, do you mean you just want access to the captures from some
regexp match?

irb(main):009:0> "a , b,c" =~ /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/
=> 0
irb(main):010:0> $1
=> "a , b"
irb(main):011:0> $2
=> "c"
irb(main):012:0> $~[1]
=> "a , b"
irb(main):013:0> $~[2]
=> "c"
irb(main):014:0> md = /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/.match("a, b,c")
=> #<MatchData:0xb7a47860>
irb(main):015:0> md[1]
=> "a, b"
irb(main):016:0> md.captures[1]
=> "c"
irb(main):017:0> $~.inspect
=> "#<MatchData:0xb7a47860>"

(and others...)

Hope that helps,
Ross

--
Ross Bamford - rosco@roscopeco.remove.co.uk
"\e[1;31mL"

Jeff Wood

12/15/2005 12:17:00 AM

You should be able to tell who this message is meant for:

PLEASE stop sending out code that uses any of the perl ${x} variables ...

They are ugly and have no place in Ruby ... they are only provided to
make the transition of Perl people easier ...

Please teach people to use MatchData objects ...

my_regex = /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/

matches = my_regex.match( "a , b,c" )

element 0 of the matches object will contain the complete matched string.

each element after that will map to one of the groups you defined ...

so:

matches[0] will be the whole string
"a , b,c"
matches[1] will be your first group
"a , b"
matches[2] will be your second group
"c"
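A short sketch of the same idea using a few more MatchData methods. The named groups (head, last) are my own labels, not from the thread, and they assume Ruby 1.9 or later:

```ruby
# Named groups are an assumption for illustration; plain (...) groups work too.
pattern = /(?<head>\w\s*,\s*\w)\s*,\s*(?<last>\w)/
m = pattern.match("a , b,c")

whole         = m[0]                  # the entire matched text
by_index      = [m[1], m[2]]          # groups by position
by_name       = [m[:head], m[:last]]  # the same groups by name
all_groups    = m.captures            # every group, without m[0]
start_of_last = m.begin(:last)        # offset where the last group starts
```

Note that each group still holds a single string here; MatchData does not expose a per-group list of repeated captures.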

... seriously, we're not helping people make cleaner code when we show
approval for the ugly/evil ${x} warts we've kept from Perl.

... show people the beauty and cleanliness of using an OOP solution ...

I hope you agree.

j.

On 12/14/05, Ross Bamford <rosco@roscopeco.remove.co.uk> wrote:
> On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:
>
> > thank you. the question is general.
> >
> > if i wanted to parse a list of letters separated by spaces and commas:
> >
> > 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> >
> > i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> > can split, then massage the result some more and get the final result.
> > is there a way to get to groups' captures after a regex match? like in
> > microsoft's .net?
> >
>
> I don't really get what you mean. I don't understand the rules that got a
> and b into one group and c into another. When you say it's a general
> question, do you mean you just want access to the captures from some
> regexp match?
>
> irb(main):009:0> "a , b,c" =~ /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/
> => 0
> irb(main):010:0> $1
> => "a , b"
> irb(main):011:0> $2
> => "c"
> irb(main):012:0> $~[1]
> => "a , b"
> irb(main):013:0> $~[2]
> => "c"
> irb(main):014:0> md = /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/.match("a, b,c")
> => #<MatchData:0xb7a47860>
> irb(main):015:0> md[1]
> => "a, b"
> irb(main):016:0> md.captures[1]
> => "c"
> irb(main):017:0> $~.inspect
> => "#<MatchData:0xb7a47860>"
>
> (and others...)
>
> Hope that helps,
> Ross
>
> --
> Ross Bamford - rosco@roscopeco.remove.co.uk
> "\e[1;31mL"
>
>


--
"Remember. Understand. Believe. Yield! -> http://ruby-lang...

Jeff Wood


Ross Bamford

12/15/2005 12:59:00 AM

On Thu, 15 Dec 2005 00:16:52 -0000, Jeff Wood <jeff.darklight@gmail.com>
wrote:

> You should be able to tell who this message is meant for:
>

Why not just address me directly?

> PLEASE stop sending out code that uses any of the perl ${x} variables ...
>

Well, okay. No need to shout though, is there?

Just trying to put a bit back, you know?

--
Ross Bamford - rosco@roscopeco.remove.co.uk
"\e[1;31mL"

William James

12/15/2005 1:02:00 AM

ako... wrote:
> thank you. the question is general.
>
> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?

t = 'a , b,c'.split( /\s*,\s*/ )
group1 = t[0..-2]
group2 = t[-1,1]
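
If the input also needs validating, one way to approximate .NET-style capture collections is to check the whole string with an anchored regexp first and then scan for the items. This is only a sketch under that assumption; LIST_FORMAT and letter_groups are hypothetical names:

```ruby
# Hypothetical helper: validate the list format, then collect the letters.
LIST_FORMAT = /\A(?:\w\s*,\s*)*\w\z/

def letter_groups(str)
  return nil unless LIST_FORMAT.match(str)  # reject malformed input
  items = str.scan(/\w/)                    # every letter, in order
  [items[0...-1], items[-1, 1]]             # all but the last, then the last
end

groups  = letter_groups('a , b,c')
invalid = letter_groups('a b')
```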

Nicholas Van Weerdenburg

12/15/2005 1:22:00 AM
