comp.lang.ruby

Problem modifying captured regexp results

Paul Van Delst

10/20/2006 4:00:00 PM

Hello,

I'm using ruby to automatically generate Fortran95 code and I'm using a regular expression
to parse the following type of definition line:

REAL(fp), DIMENSION(Dim1,Dim2) :: Arr2 ! Description of Arr2

The regexp I'm using works fine and I build an array of hashes, one for each definition, i.e.

if line =~ componentRegexp
  # We have matched an array component definition
  arrayList << {"type"        => $1,
                "param"       => $2,
                "dimlist"     => $3,
                "name"        => $4,
                "description" => $5}
  puts(arrayList.last.inspect)
else
  # No match, so $~ is nil here; report the offending line instead
  raise StandardError, "Invalid array definition, #{line}"
end

The output of inspect is:

{"name"=>"Arr2", "type"=>"REAL", "description"=>"Description of Arr2", "param"=>"fp", "dimlist"=>"Dim1,Dim2"}

However, what I want to do is modify the dimlist in the hash so it is a string array
"dimlist"=>["Dim1","Dim2"]
rather than a single string,
"dimlist"=>"Dim1,Dim2"

Because the number of dimensions in the dimlist can vary from 1 to 7, rather than do the
splitting in the regexp, I tried doing it in the arrayList concatenation using the split
method like so,

arrayList << {"type"        => $1,
              "param"       => $2,
              "dimlist"     => $3.split(/\s*,\s*/), # <--- split dimlist on ","
              "name"        => $4,
              "description" => $5}

but I've found that the above operation on the $3 captured result appears to "wipe" the
subsequent entries $4 (name) and $5 (description). For example, the output of
puts(arrayList.last.inspect)
on the above gives me,

{"name"=>nil, "type"=>"REAL", "description"=>nil, "param"=>"fp", "dimlist"=>["Dim1", "Dim2"]}

Note that the "dimlist" is how I want it, but "name" and "description" entries are now nil.

So can someone elaborate on why the above split operation on captured regexp results seems
to bugger up the other captured results? Does this issue extend to *any* operation on
captured regexp results?

I've looked through the pickaxe and cookbook, but no information on this was immediately
apparent.

Thanks for any info.

cheers,

paulv

--
Paul van Delst Ride lots.
CIMSS @ NOAA/NCEP/EMC Eddy Merckx
Ph: (301)763-8000 x7748
Fax:(301)763-8545
3 Answers

Pit Capitain

10/20/2006 4:23:00 PM

Paul van Delst wrote:
> (...)
>
> if line =~ componentRegexp
> # We have matched an array component definition
> arrayList<<{"type"=>$1,
> "param"=>$2,
> "dimlist"=>$3.split(/\s*,\s*/), # <--- split dimlist
> "name"=>$4,
> "description"=>$5}
>
> but I've found that the above operation on the $3 captured result
> appears to "wipe" the subsequent entries $4 (name) and $5 (description).

Paul, the problem is that #split with a Regexp internally executes some
Regexp matches which change $1, $2 etc. You have to capture the results
of the first match before executing the split.
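One way to do that, as a minimal sketch: call Regexp#match and keep the returned MatchData object in a local variable, so later regexp operations such as split cannot affect it. The original componentRegexp was not posted, so the COMPONENT_REGEXP pattern and the parse_component helper below are assumptions made up for illustration.

```ruby
# Hypothetical pattern: the real componentRegexp was not shown in the thread.
COMPONENT_REGEXP = /\A\s*(\w+)\s*\(\s*(\w+)\s*\)\s*,\s*DIMENSION\s*\(\s*(.+?)\s*\)\s*::\s*(\w+)\s*!\s*(.*?)\s*\z/i

# Hypothetical helper: parses one definition line into a hash.
def parse_component(line)
  # Keep the MatchData in a local; unlike $1..$5 it is not replaced
  # when split runs its own regexp matches.
  md = COMPONENT_REGEXP.match(line)
  raise ArgumentError, "Invalid array definition: #{line.inspect}" if md.nil?

  { "type"        => md[1],
    "param"       => md[2],
    "dimlist"     => md[3].split(/\s*,\s*/),
    "name"        => md[4],
    "description" => md[5] }
end

component = parse_component("REAL(fp), DIMENSION(Dim1,Dim2) :: Arr2 ! Description of Arr2")
puts component.inspect
```

Because every field is read from md rather than from the global match variables, the order of the hash entries no longer matters.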

Regards,
Pit

Paul Van Delst

10/20/2006 5:03:00 PM

Pit Capitain wrote:
> Paul van Delst wrote:
>> (...)
>>
>> if line =~ componentRegexp
>> # We have matched an array component definition
>> arrayList<<{"type"=>$1,
>> "param"=>$2,
>> "dimlist"=>$3.split(/\s*,\s*/), # <--- split dimlist
>> "name"=>$4,
>> "description"=>$5}
>>
>> but I've found that the above operation on the $3 captured result
>> appears to "wipe" the subsequent entries $4 (name) and $5 (description).
>
> Paul, the problem is that #split with a Regexp internally executes some
> Regexp matches which change $1, $2 etc. You have to capture the results
> of the first match before executing the split.

Aha! That is the answer to the question (see my other post).

Bewdy. Thanks Pit and Gavin.

cheers,

paulv


Jano Svitok

10/21/2006 7:21:00 PM

On 10/20/06, Pit Capitain <pit@capitain.de> wrote:
> Paul van Delst wrote:
> > (...)
> >
> > if line =~ componentRegexp
> > # We have matched an array component definition
> > arrayList<<{"type"=>$1,
> > "param"=>$2,
> > "dimlist"=>$3.split(/\s*,\s*/), # <--- split dimlist
> > "name"=>$4,
> > "description"=>$5}
> >
> > but I've found that the above operation on the $3 captured result
> > appears to "wipe" the subsequent entries $4 (name) and $5 (description).
>
> Paul, the problem is that #split with a Regexp internally executes some
> Regexp matches which change $1, $2 etc. You have to capture the results
> of the first match before executing the split.

In this case, it's really easy: just reorder the lines so that the one
containing split comes last (the hash doesn't keep the order anyway):

arrayList << {"type"        => $1,
              "param"       => $2,
              "name"        => $4,
              "description" => $5,
              "dimlist"     => $3.split(/\s*,\s*/)}

This works because by the time split messes up those $x variables, you
don't need them any more. Obviously it would not work if there were
more than one split.
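For the case with more than one split, a possible sketch is to copy all the captures into local variables right after the match, before any other regexp operation runs. The pattern below is an assumption (the original componentRegexp was not posted), and the "words" entry is a made-up second split added only to show that the order no longer matters.

```ruby
# Hypothetical pattern standing in for the unposted componentRegexp.
component_pattern = /\A\s*(\w+)\s*\(\s*(\w+)\s*\)\s*,\s*DIMENSION\s*\(\s*(.+?)\s*\)\s*::\s*(\w+)\s*!\s*(.*?)\s*\z/i

line = "REAL(fp), DIMENSION(Dim1 , Dim2) :: Arr2 ! Description of Arr2"

if line =~ component_pattern
  # Copy the captures into locals immediately; later matches
  # (including those inside split) cannot change these variables.
  type, param, dimlist, name, description = $~.captures

  entry = { "type"        => type,
            "param"       => param,
            "dimlist"     => dimlist.split(/\s*,\s*/),
            "name"        => name,
            "description" => description,
            # Hypothetical second split, for illustration only.
            "words"       => description.split(/\s+/) }
  puts entry.inspect
else
  raise ArgumentError, "Invalid array definition: #{line.inspect}"
end
```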