Patrick Doyle
10/3/2008 4:44:00 PM
The key idea here is that "*" means "match zero or more of", whereas "+"
means "match one or more of". So, when you match \w* against "one two",
scan first finds a run of word characters (3, in fact: 'o', 'n', and 'e'),
and that produces one result. The next attempt starts at the space, where
there are zero word characters, but since you asked for "zero or more of",
you get an empty string result. Rinse and repeat for the "two" part: the
word itself, then an empty match at the end of the string.
FWIW, instead of looking at the result with #inspect, I found it more
informative to look at the result returned from #scan by itself, e.g.
irb> "one two".scan(/\w*/)
=> ["one", "", "two", ""]
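To see where each of those four results comes from, it can help to print the offset at which each match starts. This is just a sketch; the variable names `matches` and `words` are mine, and it relies on $~ holding the MatchData for the current match inside the block passed to #scan.

```ruby
# Record where each match starts, alongside the matched text.
# Inside the block, $~ is the MatchData for the current match.
matches = []
"one two".scan(/\w*/) { matches << [$~.begin(0), $~[0]] }
matches.each { |offset, text| puts "offset #{offset}: #{text.inspect}" }
# offset 0: "one"
# offset 3: ""
# offset 4: "two"
# offset 7: ""

# With + instead of *, an empty match is not allowed, so only the words remain.
words = "one two".scan(/\w+/)
puts words.inspect
# ["one", "two"]
```

The empty strings sit at offset 3 (just before the space) and offset 7 (the end of the string); the space itself is never part of any match.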
--wpd
On Fri, Oct 3, 2008 at 12:29 PM, Patrick He <patrick.he@gmail.com> wrote:
> \w* does not match the space between the strings "one" and "two". It matches
> "one", <empty string after "one">, "two", <empty string after "two">.
>
> There are some other examples:
>
> irb(main):004:0> "one".scan(/^\w*/)
> => ["one"]
> irb(main):005:0> "one".scan(/\w*$/)
> => ["one", ""]
>
>
> --
> Patrick
>
>
> renton.dan@gmail.com wrote:
> > On Oct 3, 11:44 am, Ben Bleything <b...@bleything.net> wrote:
> >
> >> On Sat, Oct 04, 2008, renton....@gmail.com wrote:
> >>
> >>> "one two".scan(/\w*/).length
> >>>
> >>> returns 4. I can see it matching the 2 words and the space, what else
> >>> is it matching on? Is there a null terminator, I thought Ruby strings
> >>> were not null termed.
> >>>
> >> Try replacing #length with #inspect and seeing what the output of scan
> >> is. You'll find that it's returning two empty strings as well. I
> >> suspect what you really want is \w+...
> >>
> >> Ben
> >>
> >
> > Yeah, you're right \w+ will pull out the words, which is what I want
> > anyway. Though I'm trying to understand what \w* is doing.
> > irb(main):015:0> "one two".scan(/\w*/).inspect
> > => "[\"one\", \"\", \"two\", \"\"]"
> >
> > My question is, what is the last "\", where does it come from.
> >
>
>