Robert Klemme
4/7/2008 8:07:00 AM
2008/4/7, Marc Heiler <shevegen@linuxmail.org>:
> I have a slight problem. I have strings with some tags such as
>
> '<b><lightblue>name:</></b>'
>
> I need to match "name:" and "lightblue"
> In other words:
> - What is between <> </>
> and
> - What is inside the first <> right next to "name:"
>
> The following regex does not work:
>
> '<b><lightblue>name:</></b>' =~ /<([a-zA-Z]+)>(.+?)<\/>/
>
> $1 # => "b"
> $2 # => "<lightblue>name:"
>
> $2 should only be name:
> and $1 should only be lightblue
Constructing a more specific regexp often helps:
irb(main):001:0> s='<b><lightblue>name:</></b>'
=> "<b><lightblue>name:</></b>"
irb(main):002:0> md = %r{<b>\s*<([^>]*)>([^<]*)</>}.match s
=> #<MatchData:0x7ff973f4>
irb(main):003:0> md.to_a
=> ["<b><lightblue>name:</>", "lightblue", "name:"]
irb(main):004:0> md = %r{<b>\s*<([^>]*)>\s*([^<]*)</>}.match s
=> #<MatchData:0x7ff85b54>
irb(main):005:0> md.to_a
=> ["<b><lightblue>name:</>", "lightblue", "name:"]
irb(main):006:0>
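See how this works without a reluctant quantifier? The negated
character classes [^>]* and [^<]* cannot run past a tag delimiter,
so each group stops exactly where it should.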
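If you do not want to anchor on <b>, the same idea can probably be pushed a bit further. The following is only a sketch: the constant TAGGED and the method tagged_pairs are names I made up for illustration, and it assumes the text between the tag and </> never contains angle brackets.

```ruby
# Hypothetical helper, not part of the original question.
# Matches "<tag>text</>" where neither tag nor text contains < or >,
# so only the innermost tag directly in front of the text can match.
TAGGED = %r{<([^<>]+)>([^<>]+)</>}

def tagged_pairs(str)
  # String#scan returns one [tag, text] array per match
  str.scan(TAGGED)
end

s = '<b><lightblue>name:</></b>'
p tagged_pairs(s)  # => [["lightblue", "name:"]]
```

Because the second group needs at least one non-bracket character, <b> cannot match here: it is followed directly by another <.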
Cheers
robert
--
use.inject do |as, often| as.you_can - without end