comp.lang.ruby

regexp -how to match this?

Nanyang Zhan

4/9/2007 1:29:00 PM

what kind of pattern will match the part of sentence before a <span>
tag?

for instance:
for this sentence:
This forum is connected to a mailing list that is read by <span
class="wow">thousands</span> of people.

it'll match:
This forum is connected to a mailing list that is read by

--
Posted via http://www.ruby-....

9 Answers

Rick DeNatale

4/9/2007 2:13:00 PM

On 4/9/07, Nanyang Zhan <sxain@hotmail.com> wrote:
> what kind of pattern will match the part of sentence before a <span>
> tag?
>
> for instance:
> for this sentence:
> This forum is connected to a mailing list that is read by <span
> class="wow">thousands</span> of people.
>
> it'll match:
> This forum is connected to a mailing list that is read by

/^.*?(?=<span)/

This is a little loose since it treats anything starting with "<span"
as a span tag.

Breaking it down:

^ - start of a line (in Ruby, ^ also matches just after a newline; use
\A to anchor at the start of the string only)

.*? - 0 or more characters, non-greedy; otherwise this would match
everything up to the LAST "<span" in the string, instead of the first,
which is what I suspect you really want.

(?=<span) - This is a zero-length lookahead: "<span" must occur just
after what has been matched, but it will not be part of the match
itself.
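
A minimal sketch of the pattern in use, assuming the sentence from the question is held in a string with the line break inside the tag. The variable names are made up for illustration, and the second string is an invented example to show the greedy versus non-greedy difference.

```ruby
# Sample sentence from the question, with the line break inside the tag.
sentence = "This forum is connected to a mailing list that is read by <span\n" \
           "class=\"wow\">thousands</span> of people."

# Non-greedy prefix followed by a zero-width lookahead for "<span".
before_span = sentence[/^.*?(?=<span)/]

# Invented example with two tags, to compare lazy and greedy matching.
two_spans = "a <span>b</span> c <span>d</span>"
lazy   = two_spans[/^.*?(?=<span)/]  # stops before the FIRST "<span"
greedy = two_spans[/^.*(?=<span)/]   # runs up to the LAST "<span"

p before_span
p lazy
p greedy
```

Note that the match keeps the trailing space before the tag; call strip on the result if that is not wanted.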

HTH

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

Robert Klemme

4/9/2007 2:22:00 PM

On 09.04.2007 15:28, Nanyang Zhan wrote:
> what kind of pattern will match the part of sentence before a <span>
> tag?
>
> for instance:
> for this sentence:
> This forum is connected to a mailing list that is read by <span
> class="wow">thousands</span> of people.
>
> it'll match:
> This forum is connected to a mailing list that is read by

One way to do it:

irb(main):022:0* s='This forum is connected to a mailing list that is read by <span
irb(main):023:0' class="wow">thousands</span> of people.'
=> "This forum is connected to a mailing list that is read by <span\nclass=\"wow\">thousands</span> of people."
irb(main):024:0> s[/\A(.*?)<span/, 1]
=> "This forum is connected to a mailing list that is read by "
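
One thing to watch with the \A version, sketched here on an invented two-line string: "." does not match a newline by default, so if the tag sits on a later line the capture fails unless the /m flag is added. The ^ anchor behaves differently again, because it can match at the start of any line.

```ruby
# Invented input where the tag is on the second line.
text = "first line\nsecond line <span>x</span>"

without_m = text[/\A(.*?)<span/, 1]   # "." cannot cross the newline
with_m    = text[/\A(.*?)<span/m, 1]  # /m lets "." match "\n" as well
line_only = text[/^.*?(?=<span)/]     # ^ matches at the start of line two

p without_m
p with_m
p line_only
```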

robert

Nanyang Zhan

4/9/2007 2:37:00 PM

Rick Denatale wrote:

> /^.*?(?=<span)/

thanks.

BTW, what is “Duck Typing”?

--
Posted via http://www.ruby-....

Rick DeNatale

4/9/2007 3:01:00 PM

On 4/9/07, Nanyang Zhan <sxain@hotmail.com> wrote:
> Rick Denatale wrote:
>
> > /^.*?(?=<span)/
>
> thanks.
>
> BTW, what is "Duck Typing"?

Well, here's some of what *I've* written on the subject:
http://talklikeaduck.denh...articles...

I'd suggest looking at them starting with the oldest one (they are in
reverse chronological order).

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denh...

John Joyce

4/9/2007 3:08:00 PM

As the old saying goes,
"If it walks like a duck and talks like a duck, then it is a duck."
It means treating an object as a duck if it behaves like a duck, whatever its class.
It is part of the principle of least surprise [to Matz].
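
A small sketch of the idea in Ruby. The class and method names (Duck, Robot, talk, make_it_talk) are invented for illustration and do not come from any library.

```ruby
# Two unrelated classes; neither inherits from a shared base class.
class Duck
  def talk
    "Quack"
  end
end

class Robot
  def talk
    "Beep"
  end
end

# The method never checks the class of its argument; it only relies on
# the object responding to #talk.
def make_it_talk(thing)
  thing.talk
end

sounds = [Duck.new, Robot.new].map { |thing| make_it_talk(thing) }
p sounds
```

Anything that responds to talk can be passed in, which is the "walks like a duck" part.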
On Apr 9, 2007, at 11:37 PM, Nanyang Zhan wrote:

> Rick Denatale wrote:
>
>> /^.*?(?=<span)/
>
> thanks.
>
> BTW, what is “Duck Typing”?
>
> --
> Posted via http://www.ruby-....
>


Phillip Gawlowski

4/9/2007 4:03:00 PM

Nanyang Zhan wrote:

> BTW, what is “Duck Typing”?

PickAxe 2nd Edition (and probably the freely available 1st Edition) has
a nice, interesting and very readable chapter covering that.

In a nutshell: what the others have already said.

--
Phillip "CynicalRyan" Gawlowski
http://cynicalryan....

Rule of Open-Source Programming #6:

The user is always right unless proven otherwise by the developer.

Nanyang Zhan

4/10/2007 1:01:00 PM

Rick Denatale wrote:

> (?=<span) - This is a zero-length lookahead, this means that "<span"
> must occur just after what has been matched, but it will not be part
> of the match itself.

so ?= makes pattern lookAHEAD. How to make pattern lookBEHIND?

for instance:

example sentence:
This forum is connected to a mailing list that is read by <span
class="wow">thousands</span> of people.

question:
how to make a Regexp to match the words followed by the </span> tag?

a /<\/span>.*/ will include the tag, which isn't what I want.

--
Posted via http://www.ruby-....

Gavin Kistner

4/10/2007 1:48:00 PM

On Apr 10, 7:00 am, Nanyang Zhan <s...@hotmail.com> wrote:
> so ?= makes pattern lookAHEAD. How to make pattern lookBEHIND?

http://phrogz.net/ProgrammingRuby/language.html#...

Zero-width positive and negative lookaheads are supported in Ruby's
regexp engine in 1.8. Zero-width lookbehind assertions are not
supported by the current regexp engine. (However, they are supported
by Oniguruma, the regexp engine used in 1.9 and future builds of
Ruby.)
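
For readers on Ruby 1.9 or later, where the newer engine is available, a lookbehind version might look like the sketch below. It will not work on 1.8, and the variable names are only for illustration.

```ruby
str = 'is read by <span class="wow">thousands</span> of people.'

# (?<=...) is a zero-width lookbehind: "</span>" must come just before
# the match, but it is not part of the match. Needs Ruby 1.9 or later.
after_span = str[/(?<=<\/span>).*/]

p after_span
```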

> example sentence:
> This forum is connected to a mailing list that is read by <span
> class="wow">thousands</span> of people.
>
> question:
> how to make a Regexp to match the words followed by the </span> tag?

Just because you consume them doesn't mean you have to use them. Use
parentheses to save the parts of the text you want from what your
regular expression matches.

irb(main):001:0> str = 'is read by <span class="wow">thousands</span> of people.'
=> "is read by <span class=\"wow\">thousands</span> of people."

irb(main):002:0> str[ /<\/span>(.+)/, 1 ]
=> " of people."

irb(main):003:0> %r{</span>(.+)}.match( str ).to_a
=> ["</span> of people.", " of people."]



Nanyang Zhan

4/10/2007 2:40:00 PM

Gavin Kistner wrote:
> Just because you consume them doesn't mean you have to use them. Use
> parentheses to save the parts of the text you want from what your
> regular expression matches.

I'm trying to write one method (taking one regexp as input) to extract
any part of a given string.

But now it seems very hard to do this job with a single fixed method.
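
One possible shape for such a method, as a rough sketch only: it assumes the convention that the caller marks the wanted part with a capture group, and falls back to the whole match when the pattern has no groups. The name extract is hypothetical.

```ruby
# Hypothetical helper: returns the first capture group when the pattern
# has one, otherwise the whole match; returns nil when nothing matches.
def extract(text, pattern)
  match = pattern.match(text)
  return nil unless match
  match.captures.empty? ? match[0] : match[1]
end

str = 'is read by <span class="wow">thousands</span> of people.'

before = extract(str, /\A(.*?)<span/)              # text before the tag
inside = extract(str, /<span[^>]*>(.*?)<\/span>/)  # text inside the tag
after  = extract(str, /<\/span>(.+)/)              # text after the tag
whole  = extract(str, /thou\w+/)                   # no group: whole match

p before, inside, after, whole
```

Because the surrounding text is consumed but not captured, this works without lookbehind support.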

--
Posted via http://www.ruby-....