Asp Forum - regex: get the first match

Trochalakis Christos

6/10/2007 10:47:00 AM

Hello!

I want to parse a tagged string like this: "<i>this is</i><i>my
string</i>"

I am doing:

>> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
=> [["this is</i><i>my string"]]

What I want is a regex that will return the *first* segment that
matches.
In the above case -> [["this is", "my string"]]

Is there any way to do this?

Thanks!

5 Answers

Robert Dober

6/10/2007 12:09:00 PM

On 6/10/07, Trochalakis Christos <yatiohi@ideopolis.gr> wrote:
> Hello!
>
> I want to parse a tagged string like this: "<i>this is</i><i>my
> string</i>"
>
> I am doing:
>
> >> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> => [["this is</i><i>my string"]]
>
> What I want is a regex that will return the *first* segment that
> matches.
> In the above case -> [["this is", "my string"]]
>
> Is there any way to do this?
>
> Thanks!
>
>
>
This is a FAQ, and yes I will give the solution ;)
Regexps are greedy by default: they consume as many characters as
possible. There are a few options (not tested):

(1) use non-greedy matches
"<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
(2) use less general expressions
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*)<\/i>/)
(3) combine both ;)
"<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)

HTH
Robert
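
For comparison, here is a small sketch of the greedy pattern next to options (1) and a negated character class in the spirit of (2). The sample string and the <i> tags are assumed from the question; the variable names are only for illustration.

```ruby
# Sample input; the <i>...</i> tags are an assumption based on the regexps in the thread.
str = "<i>this is</i><i>my string</i>"

# Greedy: .* runs to the end of the string and backtracks to the LAST </i>.
greedy = str.scan(/<i>(.*)<\/i>/)

# Non-greedy: .*? stops at the FIRST </i> it can reach.
lazy = str.scan(/<i>(.*?)<\/i>/)

# Negated class: [^<]* cannot cross a "<", so it cannot run past the closing tag.
negated = str.scan(/<i>([^<]*)<\/i>/)

p greedy   # => [["this is</i><i>my string"]]
p lazy     # => [["this is"], ["my string"]]
p negated  # => [["this is"], ["my string"]]
```

The non-greedy and negated-class versions agree on this input; they would differ if the text between the tags itself contained a "<".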

P.S.
This *really* is a FAQ though
--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

GrzechG

6/10/2007 12:23:00 PM

> I want to parse a tagged string like this: "<i>this is</i><i>my
> string</i>"
>
> I am doing:
>
>>> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> => [["this is</i><i>my string"]]
>
> What I want is a regex that will return the *first* segment that
> matches.
> In the above case -> [["this is", "my string"]]

The solution is:

"<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
=> [["this is"], ["my string"]]

By default a regexp quantifier matches as much as it possibly can.
Adding a '?' after the quantifier makes it match as little as possible.
With (.*?) instead of (.*), the two parts of the string are no longer
lumped into a single result.
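
If only the first segment is needed, or a flat list instead of nested arrays, something along these lines should work. This is a sketch on the sample string from the question (the <i> tags are assumed); String#[] with a regexp and a group index returns that group from the first match, or nil when nothing matches.

```ruby
# Sample input; the <i>...</i> tags are an assumption based on the thread.
str = "<i>this is</i><i>my string</i>"
pattern = /<i>(.*?)<\/i>/

# First match only: capture group 1 of the first match.
first_only = str[pattern, 1]

# All segments as a flat array instead of one sub-array per match.
all_flat = str.scan(pattern).flatten

# No match at all: String#[] returns nil rather than raising.
missing = "no tags here"[pattern, 1]

p first_only  # => "this is"
p all_flat    # => ["this is", "my string"]
p missing     # => nil
```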

Regards,
Grzegorz Golebiowski

Robert Dober

6/10/2007 12:45:00 PM

On 6/10/07, Logan Capaldo <logancapaldo@gmail.com> wrote:
> On 6/10/07, Robert Dober <robert.dober@gmail.com> wrote:
> >
> > On 6/10/07, Trochalakis Christos <yatiohi@ideopolis.gr> wrote:
> > > Hello!
> > >
> > > I want to parse a tagged string like this: "<i>this is</i><i>my
> > > string</i>"
> > >
> > > I am doing:
> > >
> > > >> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> > > => [["this is</i><i>my string"]]
> > >
> > > What I want is a regex that will return the *first* segment that
> > > matches.
> > > In the above case -> [["this is", "my string"]]
> > >
> > > Is there any way to do this?
> > >
> > > Thanks!
> > >
> > >
> > >
> > This is a FAQ, and yes I will give the solution ;)
> > Regexps are greedy by default: they consume as many characters as
> > possible. There are a few options (not tested):
> >
> > (1) use non-greedy matches
> > "<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
> > (2) use less general expressions
> > "<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*)<\/i>/)
> > (3) combine both ;)
> > "<i>this is</i><i>my string</i>".scan(/<i>(.[^<]*?)<\/i>/)
>
>
> Unless you want to match strings like <foo, it would be simpler to
> just use [^<]*, and not .[^<]*. .[^<]* will also not match <i></i>. If the
> intent was to make the regexp not match that, a better regexp would be [^<]+
Thanks for correcting my typos.
Robert
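
To make Logan's point concrete, here is a sketch comparing the three character-class variants on an empty tag pair and on content that starts with "<". Both test strings are made up for illustration, and the <i> tags are assumed as in the rest of the thread.

```ruby
# Hypothetical test inputs, chosen to show where the variants differ.
empty_tag   = "<i></i>"
stray_angle = "<i><foo</i>"

# .[^<]* needs at least one character of any kind, so it rejects the empty
# pair but accepts content that begins with "<".
dot_class_empty = empty_tag.scan(/<i>(.[^<]*)<\/i>/)
dot_class_angle = stray_angle.scan(/<i>(.[^<]*)<\/i>/)

# [^<]* accepts the empty pair and rejects any "<" inside the content.
star_empty = empty_tag.scan(/<i>([^<]*)<\/i>/)
star_angle = stray_angle.scan(/<i>([^<]*)<\/i>/)

# [^<]+ rejects the empty pair and rejects any "<" inside the content.
plus_empty = empty_tag.scan(/<i>([^<]+)<\/i>/)

p dot_class_empty  # => []
p dot_class_angle  # => [["<foo"]]
p star_empty       # => [[""]]
p star_angle       # => []
p plus_empty       # => []
```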

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Trochalakis Christos

6/10/2007 5:16:00 PM

On Jun 10, 3:22 pm, GrzechG <grze...@DELITgazeta.pl> wrote:
> > I want to parse a tagged string like this: "<i>this is</i><i>my
> > string</i>"
>
> > I am doing:
>
> >>> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> > => [["this is</i><i>my string"]]
>
> > What I want is a regex that will return the *first* segment that
> > matches.
> > In the above case -> [["this is", "my string"]]
>
> The solution is:
>
> "<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
> => [["this is"], ["my string"]]
>
> By default a regexp quantifier matches as much as it possibly can.
> Adding a '?' after the quantifier makes it match as little as possible.
> With (.*?) instead of (.*), the two parts of the string are no longer
> lumped into a single result.
>
> Regards,
> Grzegorz Golebiowski

Thanks Grzegorz, nice trick!

Robert Dober

6/10/2007 5:31:00 PM

On 6/10/07, Trochalakis Christos <yatiohi@ideopolis.gr> wrote:
> On Jun 10, 3:22 pm, GrzechG <grze...@DELITgazeta.pl> wrote:
> > > I want to parse a tagged string like this: "<i>this is</i><i>my
> > > string</i>"
> >
> > > I am doing:
> >
> > >>> "<i>this is</i><i>my string</i>".scan(/<i>(.*)<\/i>/)
> > > => [["this is</i><i>my string"]]
> >
> > > What I want is a regex that will return the *first* segment that
> > > matches.
> > > In the above case -> [["this is", "my string"]]
> >
> > The solution is:
> >
> > "<i>this is</i><i>my string</i>".scan(/<i>(.*?)<\/i>/)
> > => [["this is"], ["my string"]]
> >
> > By default a regexp quantifier matches as much as it possibly can.
> > Adding a '?' after the quantifier makes it match as little as possible.
> > With (.*?) instead of (.*), the two parts of the string are no longer
> > lumped into a single result.
> >
> > Regards,
> > Grzegorz Golebiowski
>
> Thanks Grzegorz, nice trick!
>
You are welcome ;)
Robert

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

comp.lang.ruby
