comp.lang.ruby

regexp splitting problem

Brett S Hallett

11/29/2003 4:00:00 AM

Hi,
I am trying to split the following line of text:

<button> "btn Exit" "Exit Button"

(Note: the quotes may be " or '; the line is read from a file.)

in such a way that I can say

txt = line.split(/regexp/)

and get back

txt[0] = <button>
txt[1] = btn Exit
txt[2] = Exit Button

My current regexp

ans = tst.split(/[\"|\']/)

does this, except that the last set is missing:

txt[0] = <button>
txt[1] = btn Exit
txt[2] =

So how do I get the expression to continue processing the line?
Thanks




3 Answers

Maik Schmidt

11/29/2003 6:16:00 AM

Brett S Hallett wrote:
>
> ans = tst.split(/[\"|\']/)
Your regex can be simplified, because within a character class the pipe
character means "match a pipe character" and not "or". Additionally, you
do not have to escape the quotes, so the resulting regex would be /["']/.
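A small sketch of the point about the pipe, using the sample line from the question plus a made-up line that contains a pipe (PIPE_LINE and the constant names are illustrative, not from the thread):

```ruby
# Sample line from the question, and a hypothetical line containing a pipe.
SAMPLE_LINE = %q{<button> "btn Exit" "Exit Button"}
PIPE_LINE   = %q{a|b "c d"}

ORIGINAL_RE   = /[\"|\']/  # inside [...] the pipe is a literal character
SIMPLIFIED_RE = /["']/     # same class without the pipe and the escapes

# With no pipe in the input, both regexes split identically.
p SAMPLE_LINE.split(ORIGINAL_RE) == SAMPLE_LINE.split(SIMPLIFIED_RE)  # true

# With a pipe in the input, the original regex also splits on it.
p PIPE_LINE.split(ORIGINAL_RE)    # ["a", "b ", "c d"]
p PIPE_LINE.split(SIMPLIFIED_RE)  # ["a|b ", "c d"]
```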
>
> does this , except that the last set is missing ! ,
>
That's not quite right. The last set isn't missing; the third element
holds only the space between the two quoted strings, so "Exit Button"
ends up in the fourth. For easier debugging try:

puts text.split(/["']/).join("\n")
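As a variation on the debugging line above, printing the array with p instead of puts makes whitespace-only elements visible. A minimal sketch, assuming the sample line from the question (the helper name debug_split is made up for illustration):

```ruby
# Hypothetical helper: split on either quote character, as suggested above.
def debug_split(line)
  line.split(/["']/)
end

# p prints the inspected array, so the lone space shows up as " ".
p debug_split(%q{<button> "btn Exit" "Exit Button"})
# ["<button> ", "btn Exit", " ", "Exit Button"]
```

Note that the first element also keeps its trailing space, so some trimming would still be needed with this approach.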

> so how do I get the expression to continue processing the line ??
As mentioned before, that isn't the problem. You are looking for a
regex that splits a line into tokens. Some of the tokens are enclosed in
quotes and some are not, and both kinds of token can contain whitespace.
I am not sure whether your problem can easily be solved with a single
regex. If you can, you should change your input format.

Is the first token always enclosed in [<>] characters? Are the following
tokens always enclosed in quotes? Then it would be easier to split the
line, but you would still need more than one split call. Maybe then it
would fit in a single call to scan?
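One way the single scan call might look, assuming both questions above are answered with yes (first token always in <...>, every later token in matching quotes). The helper name tokenize is made up for this sketch:

```ruby
# Hypothetical helper. Assumes the first token is wrapped in <...> and every
# later token is wrapped in matching double or single quotes.
def tokenize(line)
  # Each alternative has its own capture group, so scan returns one
  # three-element array per token with nil in the groups that did not match.
  line.scan(/(<[^>]*>)|"([^"]*)"|'([^']*)'/).map { |groups| groups.compact.first }
end

p tokenize(%q{<button> "btn Exit" "Exit Button"})
# ["<button>", "btn Exit", "Exit Button"]
p tokenize(%q{<button> 'btn Exit' 'Exit Button'})
# ["<button>", "btn Exit", "Exit Button"]
```

Anything outside <...> or quotes is silently skipped here, which may or may not be what the real input format needs.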

Cheers,

<maik/>

Robert Klemme

12/1/2003 8:13:00 AM


"Brett S Hallett" <dragoncity@impulse.net.au> wrote in message
news:3FC81A17.7050909@impulse.net.au...
> Hi,
> I am trying to split the following line of text:
>
> <button> "btn Exit" "Exit Button" ( note the quotes may be
> " or ' , read from a file)
>
> in such a way that I can say
>
> txt = line.split(/regrex/)
>
> and get back
>
> txt[0] = <button>
> txt[1] = btn Exit
> txt[2] = Exit Button
>
> my current regexp
>
> ans = tst.split(/[\"|\']/)
>
> does this , except that the last set is missing ! ,
>
>
> txt[0] = <button>
> txt[1] = btn Exit
> txt[2] =
>
> so how do I get the expression to continue processing the line ??

txt = line.scan(/"[^"]*" | '[^']*' | \S+/x)

robert
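The scan above returns the quoted tokens with their quotes still attached, while the question asks for them without. A possible follow-up step, sketched with made-up names (TOKEN_RE, split_line):

```ruby
# Same pattern as in the post; /x lets the spaces around | be ignored.
TOKEN_RE = /"[^"]*" | '[^']*' | \S+/x

# Hypothetical helper: scan for tokens, then drop one pair of matching
# surrounding quotes from each token that has them.
def split_line(line)
  line.scan(TOKEN_RE).map { |tok| tok.sub(/\A(["'])(.*)\1\z/m, '\2') }
end

p %q{<button> "btn Exit" "Exit Button"}.scan(TOKEN_RE)
# ["<button>", "\"btn Exit\"", "\"Exit Button\""]
p split_line(%q{<button> "btn Exit" "Exit Button"})
# ["<button>", "btn Exit", "Exit Button"]
```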

aero6dof

12/3/2003 1:17:00 AM

Brett S Hallett <dragoncity@impulse.net.au> wrote in message news:<3FC81A17.7050909@impulse.net.au>...
> Hi,
> I am trying to split the following line of text:
>
> <button> "btn Exit" "Exit Button" ( note the quotes may be
> " or ' , read from a file)
>
> in such a way that I can say
>
> txt = line.split(/regrex/)
>
> and get back
>
> txt[0] = <button>
> txt[1] = btn Exit
> txt[2] = Exit Button

This works for your example, but may be somewhat fragile when you go
to expand its use over a wider range of inputs...

require 'test/unit'

class TC_one < Test::Unit::TestCase
  def test_01
    str = %Q/<button> "btn Exit" "Exit Button"/
    ans = str.split( / *[\"\'] *\"?/)

    assert_equal( ["<button>", "btn Exit", "Exit Button"], ans)
  end
end
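To illustrate the fragility mentioned above: the trailing \"? in that regex only absorbs a double quote, so the same line written with single quotes (which the question says can occur) splits differently. A small sketch with made-up names (FRAGILE_RE, fragile_split):

```ruby
# The regex from the test case above, unchanged.
FRAGILE_RE = / *[\"\'] *\"?/

# Hypothetical wrapper so both inputs go through the same call.
def fragile_split(line)
  line.split(FRAGILE_RE)
end

p fragile_split(%q{<button> "btn Exit" "Exit Button"})
# ["<button>", "btn Exit", "Exit Button"]

# With single quotes the closing and opening quote are matched separately,
# which leaves an empty string between them.
p fragile_split(%q{<button> 'btn Exit' 'Exit Button'})
# ["<button>", "btn Exit", "", "Exit Button"]
```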

Cheers,
- alan