comp.lang.ruby

Suggestion for string parsing

Ema Fuma

9/18/2008 8:48:00 AM

Hi all,
I would like to know if there's a better way to parse a string and
assign values to variables.

Ex:

Client=MPEG-4,390000,700000,24000

I can do

line =~ /(\w*)=([0-9A-Za-z -.:]*),([0-9]*),([0-9]*),([0-9]*)/

and

var1 = $1
var2 = $2
var3 = $3
var4 = $4
var5 = $5

But I'm sure there's a better way, especially since the number of
parameters can increase and I don't want to write a long regular
expression that is hard to read.

Thanks a lot for any tips
--
Posted via http://www.ruby-....

17 Answers

Peña, Botp

9/18/2008 9:07:00 AM

From: Me Me
# var1 = $1
# var2 = $2
# var3 = $3
# var4 = $4
# var5 = $5

hint: array

eg,

> line
=> "Client=MPEG-4,390000,700000,24000"

> re
=> /(\w*?)=([0-9A-Za-z -.:]*?),(\d*?),(\d*?),(\d*)/

> line.match(re).captures
=> ["Client", "MPEG-4", "390000", "700000", "24000"]

also,

> x,y,z=[1,2,3]
=> [1, 2, 3]

> x
=> 1

> z
=> 3

Chris Lowis

9/18/2008 9:20:00 AM

> But I'm sure there's a better way, even considering that the number of
> parameters can increase and I don't want to write a long regular
> expression rule, that is hard to read.

Are the parameters always delimited by commas? If so, you could
simplify the regular expression:

line =~ /(\w*)=(.*)/

Then

$2 #=> "MPEG-4,390000,700000,24000"
$2.split(",") #=> ["MPEG-4", "390000", "700000", "24000"]

This gives you the values after the '=' sign as an array. For more
power you could pass this sub-string to a CSV parsing library such as
FasterCSV.
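
As a rough sketch of the same idea without any regular expression, assuming the key never contains '=' and the values never contain commas, the line can be cut once at the '=' and the remainder split on commas. The variable names are only illustrative; the thread does not say what the numbers mean.

```ruby
line = "Client=MPEG-4,390000,700000,24000"

# Cut the line once at the first "=": key on the left, values on the right.
key, _separator, rest = line.partition("=")

# The first value is a name; the splat collects however many numeric
# fields follow, so extra parameters need no change to this code.
codec, *numbers = rest.split(",")

# Integer(text, 10) raises ArgumentError on non-numeric input
# instead of silently returning 0 the way to_i would.
numbers = numbers.map { |text| Integer(text, 10) }

p key      # "Client"
p codec    # "MPEG-4"
p numbers  # [390000, 700000, 24000]
```

Because the numeric fields end up in one array, a line with more parameters simply yields a longer array.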

Chris

Ema Fuma

9/18/2008 9:28:00 AM

Thanks for answering.
I was wondering whether there is something like C's sscanf,
so that I could parse and assign to variables at the same time.

so if I have

line="Client=MPEG-4,390000,700000,24000"

something like:
sscanf(line, "%s=%s,%d,%d,%d", val1, val2, val3, val4, val5)

I don't know if there's a similar string function for this in Ruby

thanks


Peña, Botp

9/18/2008 9:44:00 AM

From: Me Me
# I was thinking if there some kind of c sscanf,
# so that I could parse and assing to variable at the same time
# so if I have
# line="Client=MPEG-4,390000,700000,24000"
# something like:
# sscanf(line, %s=%s %s %d %d %d, val1, val2, val3, val4, val5, val6)
# I don't know if there's a similar string function for this in Ruby

you are right on scanf.
there is one in ruby, and it's a lot simpler than you think

you'll have to require it though before using,

eg,

> require 'scanf'
=> false

> line.scanf("%6s=%6s,%d,%d,%d,%d")
=> ["Client", "MPEG-4", 390000, 700000, 24000]

Ema Fuma

9/18/2008 10:07:00 AM

>> line.scanf("%6s=%6s,%d,%d,%d,%d")
> => ["Client", "MPEG-4", 390000, 700000, 24000]

Thanks.
The problem I have now is that the strings are not fixed at 6
characters.
If I try to parse with:
line.scanf("%s=%s,%d,%d,%d,%d")
it doesn't parse the string.

Is there a way to parse any string?
thanks again


Ema Fuma

9/18/2008 12:46:00 PM

Is there a way to use scanf to parse a string without knowing how many
characters it has?
thanks

Brian Candler

9/18/2008 12:55:00 PM

> is there a way to use the scanf to parse a string not knowing how many
> chars?

I'd still use Regexp.

line="Client=MPEG-4,390000,700000,24000"
val1,val2,val3,val4,val5 =
  /^(\w*)=([^,]*),(\d*),(\d*),(\d*)/.match(line).captures

Another way:

def handle_line(v1,v2,v3,v4,v5)
  puts "I got it! #{v1} etc"
end
...
if /^(\w*)=([^,]*),(\d*),(\d*),(\d*)/ =~ line
  handle_line(*$~.captures)
end
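
If the worry is that a longer pattern becomes hard to read, one possible sketch is to write it in extended mode (/x) with named groups. This assumes Ruby 1.9 or later, which is newer than the Ruby current when this thread was written, and the group names are hypothetical because the thread never says what the numbers mean.

```ruby
line = "Client=MPEG-4,390000,700000,24000"

# With /x, whitespace and "#" comments inside the pattern are ignored,
# so each field can sit on its own line. Group names are hypothetical.
line_re = /
  \A (?<key>\w+)       # text before the equals sign
  =  (?<codec>[^,]*)   # first value, up to the first comma
  ,  (?<first>\d+)     # numeric fields, one group per parameter
  ,  (?<second>\d+)
  ,  (?<third>\d+)
/x

match = line_re.match(line)
if match
  key     = match[:key]
  codec   = match[:codec]
  numbers = [match[:first], match[:second], match[:third]].map { |s| Integer(s, 10) }
end

p key      # "Client"
p codec    # "MPEG-4"
p numbers  # [390000, 700000, 24000]
```

Adding a parameter then means adding one line to the pattern rather than lengthening a single dense expression.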

Ema Fuma

9/18/2008 1:03:00 PM

Brian Candler wrote:
>> is there a way to use the scanf to parse a string not knowing how many
>> chars?
>
> I'd still use Regexp.
>
> line="Client=MPEG-4,390000,700000,24000"
> val1,val2,val3,val4,val5 =
> /^(\w*)=([^,]*),(\d*),(\d*),(\d*)/.match(line).captures
>
> Another way:
>
> def handle_line(v1,v2,v3,v4,v5)
> puts "I got it! #{v1} etc"
> end
> ...
> if /^(\w*)=([^,]*),(\d*),(\d*),(\d*)/ =~ line
> handle_line(*$~.captures)
> end

Thanks,
but I would like to avoid regexp; it seems strange to me that
there's no way to parse a string by describing its structure.
scanf would be great, but if I use %s it doesn't get the string unless I
also give the number of chars.

Brian Candler

9/18/2008 1:09:00 PM

> thanks,
> but I would like to avoid regexp; it seems strange to me that
> there's no way to parse a string providing the structure.
> scanf would be great but if I put %s it doesn't get the string, unless I
> put the number of chars.

%s is terminated by whitespace. You have no way of telling scanf that
you want to treat "=" (after the first field) and "," (after the second
field) as separators, rather than characters to be consumed by %s.

Well, as long as your data doesn't contain spaces, you could do

line="Client=MPEG-4,390000,700000,24000"
line.gsub(/[=,]/,' ').scanf("%s %s %d %d %d")
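
A scanf-like call that tolerates spaces inside values could be sketched by splitting on the separators directly and converting each field from a list of types. The parse_fields helper below is hypothetical (it is not part of Ruby or of the scanf library), and it assumes that '=' and ',' never appear inside a value.

```ruby
# Hypothetical helper, not part of Ruby or the scanf library:
# split on the separator characters, then convert each field
# according to a list of types (:s keeps the text, :d makes an Integer).
def parse_fields(line, types, separators = /[=,]/)
  fields = line.split(separators)
  unless fields.size == types.size
    raise ArgumentError, "expected #{types.size} fields, got #{fields.size}"
  end

  fields.zip(types).map do |text, type|
    case type
    when :s then text
    when :d then Integer(text, 10)
    else raise ArgumentError, "unknown field type #{type.inspect}"
    end
  end
end

line = "Client=MPEG-4,390000,700000,24000"
key, codec, v1, v2, v3 = parse_fields(line, [:s, :s, :d, :d, :d])

p [key, codec, v1, v2, v3]
```

The type list plays the role of the format string: supporting another parameter means appending one more symbol to the list.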

Lloyd Linklater

9/18/2008 1:13:00 PM

Me Me wrote:
> Brian Candler wrote:
>>> is there a way to use the scanf to parse a string not knowing how many
>>> chars?
>>
>> I'd still use Regexp.
>>
>
> thanks,
> but I would like to avoid regexp; it seems strange to me that
> there's no way to parse a string providing the structure.

Well, you can always write a BreakApart() algorithm but I must agree
with Brian that RegEx is the way to go. After all, that is what RegEx
does. I was tempted to add BreakApart() code here but I am neither sure
that it is what you really want nor that it is the best solution for the
problem at hand.

What is the *actual* problem? If it is what you said ("I would like to
know if there's a better way to parse a string and assign values to
variables;") then RegEx is a fine solution. If you reject a good
solution and seek something else, then it can only be that you are
actually seeking a solution to a different problem. So, what are you
*really* looking for?