comp.lang.ruby

regex: splitting conditionally on |

computorist

6/17/2008 10:04:00 PM

I'm parsing MediaWiki markup and I'd like to split a multi-line string
on | (vertical bar), but only if it isn't contained within another
pattern.

| name1 = value1
| name2 = value2 | name3 = value3
| name4 = [[foo|bar]]

I'd like String.split(/pattern/) to return ['name1 = value1','name2 =
value2','name3 = value3','name4 = [[foo|bar]]']

I've been working on a negative look-ahead pattern without success and my
brain is tired.

Thanks for any suggestions
3 Answers

Robert Klemme

6/18/2008 6:36:00 AM

Without testing, something like this might work:

str.scan %r{
(?: \[ [^\]]* \] | [^|] )+
}xm
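
For reference, here is a minimal runnable sketch of the scan idea above, applied to the sample from the question. The helper name fields_by_scan and the constant FIELD are made up for illustration. The strip and reject calls are an addition, on the assumption that the whitespace and newlines scan leaves around each piece are unwanted.

```ruby
# Hypothetical names (FIELD, fields_by_scan); the pattern itself is the one
# suggested in the post: a bracketed run, or any single non-pipe character.
FIELD = %r{
  (?: \[ [^\]]* \] | [^|] )+
}xm

def fields_by_scan(str)
  # scan returns the raw runs between top-level pipes, whitespace and
  # newlines included, so trim each piece and drop any empty ones.
  str.scan(FIELD).map(&:strip).reject(&:empty?)
end

sample = <<~TEXT
  | name1 = value1
  | name2 = value2 | name3 = value3
  | name4 = [[foo|bar]]
TEXT

p fields_by_scan(sample)
# => ["name1 = value1", "name2 = value2", "name3 = value3", "name4 = [[foo|bar]]"]
```

Note that this protects any pipe sitting between a [ and the next ], which covers [[foo|bar]] but is not a full MediaWiki parser.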

Cheers

robert

Sebastian Hungerecker

6/18/2008 7:35:00 AM

str.split /\s*\|\s*(?=\s*\w+\s*=)/
I don't know whether this exactly meets your requirement, but this will split
only on |s that are followed by a word and a =. For your sample input that
gives the desired result.
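
A small runnable sketch of the split above, for anyone trying it on the sample. The names NAME_PIPE and fields_by_split are hypothetical. Because the sample begins with a pipe, split returns an empty string as its first element, and a trailing newline stays attached to the last one, so this sketch assumes you want those cleaned up.

```ruby
# Hypothetical names (NAME_PIPE, fields_by_split); the regex is the one from
# the post: split only on pipes that are followed by a word and a "=".
NAME_PIPE = /\s*\|\s*(?=\s*\w+\s*=)/

def fields_by_split(str)
  # The leading pipe yields an empty first element; strip and reject tidy up.
  str.split(NAME_PIPE).map(&:strip).reject(&:empty?)
end

sample = <<~TEXT
  | name1 = value1
  | name2 = value2 | name3 = value3
  | name4 = [[foo|bar]]
TEXT

p fields_by_split(sample)
# => ["name1 = value1", "name2 = value2", "name3 = value3", "name4 = [[foo|bar]]"]
```

One caveat: a link whose text itself looks like "word =" after the pipe, such as [[foo|bar = baz]], would still be split by this pattern.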

HTH,
Sebastian

computorist

6/18/2008 5:23:00 PM

On Jun 18, 12:35 am, Sebastian Hungerecker <sep...@googlemail.com>
wrote:
> str.split /\s*\|\s*(?=\s*\w+\s*=)/
> I don't know whether this exactly meets your requirement, but this will split
> only on |s that are followed by a word and a =. For your sample input that
> gives the desired result.

This seems to work well. Searching for what I want is better than
excluding the bits I don't want.

Thanks