comp.lang.ruby

confused by back refs in gsub

Peter Bailey

8/13/2007 7:00:00 PM

Can someone tell me why, in my code below, I'm getting part of the
original search in my substitution in my result, when, I'm not asking
for it, or at least, I don't think I'm asking for it.

Thanks,
Peter


Original line:
<registrantName>Normandy Group LLC</registrantName>

My Code:
xmlfile.gsub!(/<registrantName>(.*)<\/registrantName>/,
'<SUB.HEAD4>\&</SUB.HEAD4>')
I've tried "\1" instead of "\&," too. Same result. I've also tried
putting in "?" marks to make it non-greedy. Same result.

Yields:
<SUB.HEAD4><registrantName>Normandy Group
LLC</registrantName></SUB.HEAD4>

What I want:
<SUB.HEAD4>Normandy Group LLC(/SUB.HEAD4>
--
Posted via http://www.ruby-....

5 Answers

Jano Svitok

8/13/2007 7:09:00 PM

On 8/13/07, Peter Bailey <pbailey@bna.com> wrote:
> Can someone tell me why, in my code below, I'm getting part of the
> original search in my substitution in my result, when, I'm not asking
> for it, or at least, I don't think I'm asking for it.
>
> Thanks,
> Peter
>
>
> Original line:
> <registrantName>Normandy Group LLC</registrantName>
>
> My Code:
> xmlfile.gsub!(/<registrantName>(.*)<\/registrantName>/,
> '<SUB.HEAD4>\&</SUB.HEAD4>')
> I've tried "\1" instead of "\&," too. Same result. I've also tried
> putting in "?" marks to make it non-greedy. Same result.
>
> Yields:
> <SUB.HEAD4><registrantName>Normandy Group
> LLC</registrantName></SUB.HEAD4>
>
> What I want:
> <SUB.HEAD4>Normandy Group LLC(/SUB.HEAD4>

This works for me (I've used \1):

require 'test/unit'

class TestGsub < Test::Unit::TestCase
  def test_replace
    line = "<registrantName>Normandy Group LLC</registrantName>"

    # \1 inserts only the text captured by (.*), not the whole match.
    line.gsub!(/<registrantName>(.*)<\/registrantName>/, '<SUB.HEAD4>\1</SUB.HEAD4>')
    assert_equal('<SUB.HEAD4>Normandy Group LLC</SUB.HEAD4>', line)
  end
end

Note that under "What I want" you wrote (/SUB.HEAD4> instead of </SUB.HEAD4> (a parenthesis where the opening angle bracket should be).
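
For readers comparing the two back references, here is a minimal sketch of the difference, using the sample line from the question. The variable names are only illustrative. In a replacement string, \& stands for the entire match (tags included), while \1 stands for the first capture group. Note also that inside double quotes "\1" is an octal character escape, so the backslash has to be doubled there.

```ruby
line = "<registrantName>Normandy Group LLC</registrantName>"
pattern = /<registrantName>(.*)<\/registrantName>/

# \& inserts the whole match, so the original tags come along.
whole_match = line.gsub(pattern, '<SUB.HEAD4>\&</SUB.HEAD4>')

# \1 inserts only the first capture group: the text between the tags.
first_group = line.gsub(pattern, '<SUB.HEAD4>\1</SUB.HEAD4>')

# In a double-quoted string the backslash must be escaped ("\\1"),
# otherwise "\1" is read as an octal character escape.
double_quoted = line.gsub(pattern, "<SUB.HEAD4>\\1</SUB.HEAD4>")

# The block form avoids backslash escapes; $1 holds the first group.
block_form = line.gsub(pattern) { "<SUB.HEAD4>#{$1}</SUB.HEAD4>" }

puts whole_match
puts first_group
puts double_quoted
puts block_form
```

The first line printed still contains the registrantName tags, which is the result reported in the question; the other three print the wanted output.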

Peter Bailey

8/13/2007 7:18:00 PM

Jano Svitok wrote:
> On 8/13/07, Peter Bailey <pbailey@bna.com> wrote:
>>
>> What I want:
>> <SUB.HEAD4>Normandy Group LLC(/SUB.HEAD4>
>
> This works for me (I've used \1):
>
> require 'test/unit'
> class TestGsub < Test::Unit::TestCase
> def test_replace
> line = "<registrantName>Normandy Group
> LLC</registrantName>"
>
> line.gsub!(/<registrantName>(.*)<\/registrantName>/,'<SUB.HEAD4>\1</SUB.HEAD4>')
> assert_equal(line, '<SUB.HEAD4>Normandy Group
> LLC</SUB.HEAD4>')
> end
> end
>
> Note that you have (/SUB.HEAD4> instead of </SUB.HEAD4> (the
> parenthesis)


Thank you, Jano. Yes, this worked for me now.

Cheers.


Peter Bailey

8/13/2007 8:06:00 PM

Felix Windt wrote:
>>
>> I've tried "\1" instead of "\&," too. Same result. I've also
>>
> If you're hardcoding replacements like that and are certain that your
> source
> is well formed xml, you could also just skip the back references:
>
> irb(main):001:0> "<registrantName>Normandy Group
> LLC</registrantName>".gsub!(/registrantName>/, 'SUB.HEAD4>')
> => "<SUB.HEAD4>Normandy Group LLC</SUB.HEAD4>"
> irb(main):002:0>

I don't quite understand your suggestion, Felix. Yes, I believe my
source data is well-formed XML. Are you suggesting that, somehow,
because it is well-formed XML, I can ignore the element closings? I
tried what I thought you meant by:

xmlfile.gsub!(/<registrantName>/, '<SUB.HEAD4>')

and I got the subhead callout at the beginning of the data, but the
closing element is still there: </registrantName>

-Peter

Stefano Crocco

8/13/2007 8:15:00 PM

On Monday 13 August 2007, Peter Bailey wrote:
> Felix Windt wrote:
> >> I've tried "\1" instead of "\&," too. Same result. I've also
> >
> > If you're hardcoding replacements like that and are certain that your
> > source
> > is well formed xml, you could also just skip the back references:
> >
> > irb(main):001:0> "<registrantName>Normandy Group
> > LLC</registrantName>".gsub!(/registrantName>/, 'SUB.HEAD4>')
> > => "<SUB.HEAD4>Normandy Group LLC</SUB.HEAD4>"
> > irb(main):002:0>
>
> I don't quite understand your suggestion, Felix. Yes, I believe my
> source data is well-formed XML. Are you suggesting that, somehow,
> because it is well-formed XML, I can ignore the element closings? I
> tried what I thought you meant by:
>
> xmlfile.gsub!(/<registrantName>/, '<SUB.HEAD4>')
>
> and, I got the subhead callout at the beginning of the data, but, the
> closing element still is there--</registrantName>/
>
> -Peter

What Felix is suggesting is that, if the source is valid XML, then it will
have the form

<elementName>text</elementName>

so, if you call gsub! passing a regexp matching elementName>, it should
replace both the opening and closing tags. When you tried, it didn't work
because you left the opening < in the regexp, which didn't match the closing
tag (it starts with </r, not <r). The correct call to gsub should be:

xmlfile.gsub!(/registrantName>/, 'SUB.HEAD4>')

(By the way, notice that the regexp doesn't include the leading '<', so the
'<' is left out of the replacement string as well.)
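
A small sketch of the two calls side by side, assuming the one-line sample from the original question (the variable names are made up for illustration):

```ruby
xml = "<registrantName>Normandy Group LLC</registrantName>"

# With the leading "<", only the opening tag matches; the closing tag
# starts with "</" and is left untouched.
opening_only = xml.gsub(/<registrantName>/, '<SUB.HEAD4>')

# Without the leading "<", the pattern matches inside both the opening
# and the closing tag, so both are renamed.
both_tags = xml.gsub(/registrantName>/, 'SUB.HEAD4>')

puts opening_only
puts both_tags
```

This only renames the tags correctly as long as "registrantName>" never appears at the end of some longer element name or inside the text itself.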

I hope this helps

Stefano

Simon Krahnke

8/13/2007 9:42:00 PM

* Peter Bailey <pbailey@bna.com> (21:18) wrote:

>> line.gsub!(/<registrantName>(.*)<\/registrantName>/,'<SUB.HEAD4>\1</SUB.HEAD4>')

> Thank you, Jano. Yes, this worked for me now.

Please note that regular expressions aren't a very good way to parse
XML. The capture group in the expression above will match everything
between the first "<registrantName>" and the last "</registrantName>"
on a line, which is probably not what you want.

You can use the non-greedy *? as a workaround in this case.
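
To illustrate the point, here is a minimal sketch with two elements on one line. The input string is invented for the example and is not from the original data:

```ruby
# Hypothetical input: two registrantName elements on the same line.
two = "<registrantName>A</registrantName><registrantName>B</registrantName>"

# Greedy (.*) runs from the first opening tag to the last closing tag,
# so the inner tags end up inside the captured text.
greedy = two.gsub(/<registrantName>(.*)<\/registrantName>/,
                  '<SUB.HEAD4>\1</SUB.HEAD4>')

# Non-greedy (.*?) stops at the nearest closing tag, so each element
# is replaced on its own.
lazy = two.gsub(/<registrantName>(.*?)<\/registrantName>/,
                '<SUB.HEAD4>\1</SUB.HEAD4>')

puts greedy
puts lazy
```

The non-greedy form is still only a workaround: by default "." does not match a newline in Ruby, so an element whose text spans several lines would not match at all without the /m flag.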

Kind regards, simon .... l