
comp.lang.ruby

newbie: Strings and RegExp

Diego Virasoro

10/19/2006 3:57:00 PM

Hello,
given an instance of String, str, how do you select/copy the section of
the string between two given strings START_STRING and END_STRING?

I used something like:
re = /#{STRING_BEFORE}(regular_expression)#{STRING_AFTER}/
str =~ re
middle = $1
where regular_expression is supposed to be a regular expression
selecting any character (digits, space, newline, character,
anything...).

Is this the right method? And what should I use for regular_expression?
I tried (.|\n)* but I get a "BUS error". Any other way?

Or maybe is there a method which does what I need?

(Note: I am using Ruby 1.6.4 on Solaris)

Thank you

Diego

7 Answers

Robert Klemme

10/19/2006 4:12:00 PM

On 19.10.2006 17:56, Diego Virasoro wrote:
> Hello,
> given an instance of String, str, how do you select/copy the section of
> the string between two given strings START_STRING and END_STRING?
>
> I used something like:
> re = /#{STRING_BEFORE}(regular_expression)#{STRING_AFTER}/
> str =~ re
> middle = $1
> where regular_expression is supposed to be a regular expression
> selecting any character (digits, space, newline, character,
> anything...).
>
> Is this the right method? And what should I use for regular_expression?
> I tried (.|\n)* but I get a "BUS error". Any other way?

First, you do not need this alternation if you use /m (multiline mode),
in which . also matches newlines. The alternation probably causes an
enormous amount of backtracking. What's in STRING_BEFORE and
STRING_AFTER? If they contain regexp metacharacters, you should probably
escape them:

/#{Regexp.escape STRING_BEFORE}...
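
Spelled out in full, that suggestion might look like the sketch below. The delimiter values and the sample string are made up for illustration (they are not from the original question), and the code assumes a reasonably current Ruby rather than 1.6.4.

```ruby
# Hypothetical delimiters that contain regexp metacharacters.
STRING_BEFORE = "[start]"
STRING_AFTER  = "(end)"

str = "junk [start]line one\nline two(end) more junk"

# Regexp.escape makes the delimiters match literally. With /m, "." also
# matches newlines, so no (.|\n) alternation is needed. The non-greedy
# .*? stops at the first STRING_AFTER rather than the last one.
re = /#{Regexp.escape(STRING_BEFORE)}(.*?)#{Regexp.escape(STRING_AFTER)}/m

m = re.match(str)
middle = m && m[1]   # nil when the delimiters are not found
p middle
```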

Depending on the length of your string, you might be better off with
something like this:

array = str.split(/(#{Regexp.escape(STRING_BEFORE)}|#{Regexp.escape(STRING_AFTER)})/)

And then extract the middle portion from the resulting array.
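
One possible way to do that extraction is sketched here. The delimiter values and the sample string are illustrative assumptions, and only the first delimited section is returned.

```ruby
# Illustrative values, not from the original question.
STRING_BEFORE = "hello"
STRING_AFTER  = "goodbye"
str = "I say hello to you, then goodbye."

# The capturing group makes split keep the delimiters in the result, e.g.
# ["I say ", "hello", " to you, then ", "goodbye", "."]
parts = str.split(/(#{Regexp.escape(STRING_BEFORE)}|#{Regexp.escape(STRING_AFTER)})/)

# Find the first STRING_BEFORE, then the first STRING_AFTER that follows it,
# and join whatever lies in between.
start = parts.index(STRING_BEFORE)
stop  = start && ((start + 1)...parts.length).find { |i| parts[i] == STRING_AFTER }
middle = stop ? parts[(start + 1)...stop].join : nil
p middle
```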

> Or maybe is there a method which does what I need?
>
> (Note: I am using Ruby 1.6.4 on Solaris)

That version is *ancient*. It might even have some bugs which are fixed
in newer versions. I strongly recommend upgrading if possible.

Kind regards

robert

Gavin Kistner

10/19/2006 5:00:00 PM

Diego Virasoro wrote:
> given an instance of String, str, how do you select/copy the section of
> the string between two given strings START_STRING and END_STRING?

Here's one way:
STRING_BEFORE = "hello"
STRING_AFTER = "goodbye"
re = /#{STRING_BEFORE}(.+?)#{STRING_AFTER}/mo
str = "I just wanted to say hello to you before I say goodbye for the final time."
str << " Oh, hello and goodbye again."
p str.scan( re ).flatten
#=> [" to you before I say ", " and "]
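
A note on the `o` in `/mo`: it asks Ruby to perform the `#{...}` interpolation only once, the first time the literal is evaluated, and to reuse the resulting regexp afterwards. That saves rebuilding the regexp in a loop, but it can surprise you if the interpolated value changes later. A small sketch, with a made-up helper name (`build_re`) used purely for illustration:

```ruby
# Hypothetical helper, for illustration only.
def build_re(word)
  /#{word}/o   # interpolated on the first call only, then reused
end

first  = build_re("hello")
second = build_re("goodbye")   # the argument is ignored this time

p first.source    # "hello"
p second.source   # "hello"
```

Without the `o`, the second call would produce a regexp matching "goodbye".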

Louis J Scoras

10/19/2006 5:32:00 PM

You should also be able to do this with split. Then you can use
strings as well as regexes:

class String
  def match_between(head, tail)
    strings = []
    s = split(head, 2)[1]
    while s
      match, rest = s.split(tail, 2)
      return strings unless rest
      strings << match
      s = rest.split(head, 2)[1]
    end
    strings
  end
end

So with the same string as before,

str = "I just wanted to say hello to you before I say goodbye for the final time."
str << " Oh, hello and goodbye again."

you get:

p str.match_between(/hello/,/goodbye/)
# => [" to you before I say ", " and "]

and also

p str.match_between('a','e')
# => ["nt", "y h", "y goodby", "l tim", "nd goodby"]

I'm sure you could also tweak it to operate in a manner which doesn't
consume the rest of the string, i.e. the 'g' flag.
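
If only the first occurrence is needed and both delimiters are plain strings, String#index can do the job without any regular expression. A minimal sketch; the method name `between` is made up here and is not part of Ruby:

```ruby
# Hypothetical helper: returns the text between the first `head` and the
# first `tail` that follows it, or nil if either delimiter is missing.
def between(str, head, tail)
  start = str.index(head)
  return nil unless start
  start += head.length               # skip past the opening delimiter
  stop = str.index(tail, start)      # search only after the opening delimiter
  return nil unless stop
  str[start...stop]
end

p between("I say hello to you before I say goodbye.", "hello", "goodbye")
```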


--
Lou.

Diego Virasoro

10/19/2006 5:48:00 PM

> Here's one way:
> STRING_BEFORE = "hello"
> STRING_AFTER = "goodbye"
> re = /#{STRING_BEFORE}(.+?)#{STRING_AFTER}/mo
> str = "I just wanted to say hello to you before I say goodbye for the final time."
> str << " Oh, hello and goodbye again."
> p str.scan( re ).flatten
> #=> [" to you before I say ", " and "]

Perfect, it works. Thanks a lot, and to Robert and Louis too, thanks.

So basically I gather the key was using /mo, right? That alone, with
the simpler regular expression, seems to do the job. (Maybe it was
running out of memory before? Is that what a bus error means?)

However here you are using /mo instead of /m. What does the extra o
mean? And where can I read more about this topic? I've checked on the
Pragmatic Programmer book (the one online) but I can't find any info on
that.

Thanks again.

Diego

Diego Virasoro

10/19/2006 5:50:00 PM

> That version is *ancient*. It might even have some bugs which are fixed
> in newer versions. I strongly recommend upgrading if possible.

Yes, I'd like to, but it's the University computer so I have to live
with what there is. Sigh...

Diego

Robert Klemme

10/19/2006 9:50:00 PM

Diego Virasoro wrote:
>> That version is *ancient*. It might even have some bugs which are fixed
>> in newer versions. I strongly recommend upgrading if possible.
>
> Yes, I'd like to, but it's the University computer so I have to live
> with what there is. Sigh...

:-) I believe you can compile Ruby and install it entirely in your home
directory. Of course, that only works if your quota is high enough. It
may be worth a try.

Good luck!

robert

John Pywtorak

10/20/2006 5:31:00 AM

You absolutely can compile and install to your home directory; just set
--prefix to something like
--prefix=/usr/local/home/yours/ruby-1_8/

Then make sure that install's bin directory comes before the
system-installed ruby in your PATH, so that env ruby picks up your new
install.
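
Once the PATH is set, a quick check from Ruby itself can confirm which interpreter is actually running. This is only a sketch and assumes a reasonably recent Ruby where the `rbconfig` standard library provides `RbConfig::CONFIG`; a very old release may differ.

```ruby
require 'rbconfig'

# Version of the interpreter that is running this script.
puts RUBY_VERSION

# Install prefix and bin directory it was built with; these should point
# into your home directory if the new build is the one being picked up.
puts RbConfig::CONFIG['prefix']
puts RbConfig::CONFIG['bindir']
```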
