comp.lang.ruby

Re: regexp questions

Daniel Lucraft

6/24/2007 8:46:00 AM

Mike Steiner wrote:
> I'm getting some odd errors. One problem seems to be that \1 in the
> replacement string doesn't always work. Are there any known "gotchas"
> between Python's regexps and Ruby's?
>
> And what's the most robust way to convert "underlined" words to HTML
> italics using a regexp? Something like:
> "This _word_ is in italics." -> "This <i>word</i> is in italics."

Could you give an example of where it isn't working in the first case?
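
One common cause, offered only as a guess since the failing code isn't shown: inside a double-quoted Ruby string, "\1" is an octal escape (the character 0x01), not a backreference. A minimal sketch, using the sample sentence from the question; the variable names are just for illustration:

```ruby
str = "This _word_ is in italics."

# Double-quoted "\1" is an octal escape (the character 0x01), so the
# captured text is silently replaced by a control character.
broken = str.gsub(/_([^\s]+)_/, "<i>\1</i>")

# Escaping the backslash, or using single quotes, passes a real
# backreference through to gsub.
escaped = str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
single  = str.gsub(/_([^\s]+)_/, '<i>\1</i>')

# The block form sidesteps replacement-string escaping entirely.
block = str.gsub(/_([^\s]+)_/) { "<i>#{$1}</i>" }

puts escaped
```

If the replacement string is built with double quotes anywhere in the code, that would be the first thing to check.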

As for the second, I don't know about 'most robust' but I might try
something like this (with the assumption that words aren't broken over
lines):

irb> str
=> "asdf asdf _asdfasd_ asdf _ash_ h"
irb> str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdfasd</i> asdf <i>ash</i> h"

Or if I wanted to be able to have italicised sentences (_word word_) I
might try this:

irb> str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
irb> str.gsub(/_(.+?)_/m, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

But then I would worry about performance because of the lazy quantifier
(.+?) and would want to test it on some real data.
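
A rough way to run that test, sketched with Ruby's standard Benchmark module. The sample text here is a made-up stand-in (the irb string repeated many times); real documents should be substituted before drawing any conclusion:

```ruby
require "benchmark"

# Hypothetical sample data: the example string repeated 10,000 times.
sample = "asdf asdf _asdf asd_ asdf _ash \nash_ h\n" * 10_000

result = nil

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
elapsed = Benchmark.realtime do
  result = sample.gsub(/_(.+?)_/m, "<i>\\1</i>")
end

puts "lazy pattern: #{elapsed.round(4)}s for #{sample.length} characters"
```

The same harness can time any alternative pattern on the same input for comparison.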

best,
Dan

--
Posted via http://www.ruby-....

2 Answers

Wyatt Draggoo

6/24/2007 4:35:00 PM

On Sun, Jun 24, 2007 at 05:45:30PM +0900, Daniel Lucraft wrote:
>
> irb> str
> => "asdf asdf _asdfasd_ asdf _ash_ h"
> irb> str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
> => "asdf asdf <i>asdfasd</i> asdf <i>ash</i> h"
>
> Or if I wanted to be able to have italicised sentences (_word word_) I
> might try this:
>
> irb> str
> => "asdf asdf _asdf asd_ asdf _ash \nash_ h"
> irb> str.gsub(/_(.+?)_/m, "<i>\\1</i>")
> => "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

I like to be very strict with things like quotes (and underscores in this case), so I would probably use:

irb> str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
irb> str.gsub(/_([^_]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

That seems to work as I would expect; I'm just coming over from Perl...

Wyatt

Michael Glaesemann

6/24/2007 6:21:00 PM

On Jun 24, 2007, at 11:35 , Wyatt Draggoo wrote:

> On Sun, Jun 24, 2007 at 05:45:30PM +0900, Daniel Lucraft wrote:
>
>> irb> str.gsub(/_(.+?)_/m, "<i>\\1</i>")
>
> I like to be very strict with things like quotes (and underscores
> in this case), so I would probably use:
>
> irb> str.gsub(/_([^_]+)_/, "<i>\\1</i>")

From a strictness point of view, what's the difference between
/(.+?)_/ and /([^_]+)_/ in the above? AIUI, they're equivalent. I
personally like the former because if you need to change the _ to
some other character, you only have to make a single character change.

Michael Glaesemann
grzm seespotcode net
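
For readers comparing the two patterns: on the simple inputs in this thread they produce the same output, but as far as I can tell they are not interchangeable everywhere. A small sketch of two cases where they diverge; the test strings are invented for illustration:

```ruby
str = "asdf _ash \nash_ h"

# Without /m, "." does not match a newline, so the lazy form finds nothing.
lazy_no_m = str.gsub(/_(.+?)_/, "<i>\\1</i>")

# With /m the lazy form spans the newline.
lazy_m = str.gsub(/_(.+?)_/m, "<i>\\1</i>")

# [^_] matches a newline regardless of flags.
negated = str.gsub(/_([^_]+)_/, "<i>\\1</i>")

# When more pattern follows the closing underscore, the lazy form may
# extend across an underscore to make the match succeed; the negated
# class never can.
tail = "_a_ b_!"
lazy_capture    = tail[/_(.+?)_!/m, 1]
negated_capture = tail[/_([^_]+)_!/, 1]

p lazy_no_m, lazy_m, negated, lazy_capture, negated_capture
```

So the negated class is the stricter of the two in the sense that its capture can never contain an underscore.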