Asp Forum - Help with a regexp

Daniel Schierbeck

7/12/2006 2:09:00 PM

I'm trying to write a regular expression that matches bencoded strings,
i.e. strings of the form x:y, where x is the numeric length of y.

This is valid:

6:foobar

while this is not:

4:foo

I've tried using #{$1} inside the regexp, but it seems $1 is still nil
at that point. Here's what I've got so far:

/^([1-9]+\d*):(\w){#{$1}}$/

(it ain't working)

I'm matching with `string =~ regexp'; would it be better if I did it
another way?

Cheers,
Daniel

51 Answers

Pit Capitain

7/12/2006 4:39:00 PM

Daniel Schierbeck wrote:
> I'm trying to write a regular expression that matches bencoded strings,
> i.e. strings on the form x:y, where x is the numeric length of y.
>
> This is valid:
>
> 6:foobar
>
> while this is not:
>
> 4:foo
>
> I've tried using #{$1} inside the regexp, but it seems $1 is still nil
> at that point. Here's what I've got so far:
>
> /^([1-9]+\d*):(\w){#{$1}}$/
>
> (it ain't working)
>
> I'm matching with `string =~ regexp'; would it be better if I did it
> another way?

Daniel, I'm pretty sure you can't do it with a single regexp test. Why
not split the test into two parts, as in

str =~ /^([1-9]\d*):(\w+)$/ && $2.size == $1.to_i
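The two-part test can also be wrapped in a small method. This is only a sketch: the name valid_bencoded? is made up for illustration, and it assumes the value consists of \w characters, as in the pattern above. It uses \A and \z so that the whole string must match, not just one line of it.

```ruby
# Hypothetical helper wrapping the two-part test.
def valid_bencoded?(str)
  # Part 1: does the string have the shape "<length>:<word chars>"?
  match = /\A([1-9]\d*):(\w+)\z/.match(str)
  # Part 2: does the declared length equal the actual length?
  !match.nil? && match[2].size == match[1].to_i
end

p valid_bencoded?("6:foobar")  # prints true
p valid_bencoded?("4:foo")     # prints false
```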

Regards,
Pit

studlee2@gmail.com

7/12/2006 5:01:00 PM

Daniel,
When you grab the data it will be in a string format, so you need
to convert it to a number (most likely integer). Then you can compare
it with the size of the second value you grabbed. I would write it
like this (modify as needed):

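print "enter data: "
a = gets.chomp                 # drop the trailing newline
valid = /\A(\d+):(\w*)\z/      # anchor both ends; require at least one digit

check = valid.match(a)

# match returns nil when the input has no length prefix, so test it first
if check && check[1].to_i == check[2].size

  # <Your code here>

end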

I hope this helps.

_Steve


Logan Capaldo

7/12/2006 5:22:00 PM

On Jul 12, 2006, at 10:10 AM, Daniel Schierbeck wrote:

> I'm trying to write a regular expression that matches bencoded
> strings, i.e. strings on the form x:y, where x is the numeric
> length of y.
>
> This is valid:
>
> 6:foobar
>
> while this is not:
>
> 4:foo
>
> I've tried using #{$1} inside the regexp, but it seems $1 is still
> nil at that point. Here's what I've got so far:
>
> /^([1-9]+\d*):(\w){#{$1}}$/
>
> (it ain't working)
>
> I'm matching with `string =~ regexp'; would it be better if I did
> it another way?
>
>
> Cheers,
> Daniel
>

I believe that this is not in the set of languages a Regexp can
match: the pattern would have to count out a length that it reads from
the input itself. As others have already suggested, you'll have to do it
in 2 steps.
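A minimal sketch of such a two-step check, assuming values consist of \w characters as in the original attempt (the method name bencoded_string? is made up for illustration). The first regexp reads the length; only then is a second regexp built with that length interpolated, which avoids the problem of $1 not being set yet while the first pattern is still being constructed.

```ruby
# Hypothetical two-step check: read the length first, then build
# a second pattern that uses it.
def bencoded_string?(str)
  # Step 1: match the numeric prefix and the colon.
  head = /\A([1-9]\d*):/.match(str)
  return false unless head

  # Step 2: the length is known now, so it can be interpolated.
  length = head[1].to_i
  body = /\A\w{#{length}}\z/
  body.match?(head.post_match)
end

p bencoded_string?("6:foobar")  # prints true
p bencoded_string?("4:foo")     # prints false
```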

Daniel Schierbeck

7/12/2006 6:47:00 PM

studlee2@gmail.com wrote:
> Daniel,
> When you grab the data it will be in a string format, so you need
> to convert it to a number (most likely integer). Then you can compare
> it with the size of the second value you grabbed. I would write it
> like this (modify as needed):
>
> print "enter data"
> a = gets
> valid = /^(\d*):(\w*)/
>
> check = valid.match(a)
>
> if(check[1].to_i == check[2].size)
>
> <Your code here>
>
> end

Thank you for your reply.

My problem is that I have a string like this: "3:foo6:monkey5:sheep",
which I need to separate into ["foo", "monkey", "sheep"]. The values can
contain digits, so splitting at \d won't work. This is what
makes it difficult:

"3:ab23:cat5:sheep" => ["ab2", "cat", "sheep"]

I need to grab the number, then read that many characters, then read the
next number, etc.
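The procedure described above can be sketched as a plain loop over string positions. This is only an illustration under stated assumptions: the method name split_length_prefixed is hypothetical, and raising ArgumentError on malformed input is one possible choice, not something the format prescribes.

```ruby
# Hypothetical helper: read a number, read that many characters, repeat.
def split_length_prefixed(input)
  values = []
  pos = 0
  while pos < input.length
    # Find the colon that ends the current length prefix.
    colon = input.index(":", pos)
    raise ArgumentError, "missing ':' after position #{pos}" unless colon

    digits = input[pos...colon]
    raise ArgumentError, "bad length #{digits.inspect}" unless digits =~ /\A\d+\z/

    # Take exactly `length` characters after the colon.
    length = digits.to_i
    value = input[colon + 1, length]
    raise ArgumentError, "value too short" if value.nil? || value.length < length

    values << value
    pos = colon + 1 + length
  end
  values
end

p split_length_prefixed("3:ab23:cat5:sheep")  # prints ["ab2", "cat", "sheep"]
```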

Cheers,
Daniel

dblack

7/12/2006 6:55:00 PM

Marcin Mielzynski

7/12/2006 7:06:00 PM

Daniel Schierbeck wrote:

> My problem is that I have a string like this: "3:foo6:monkey5:sheep",
> which I need to separate into ["foo", "monkey", "sheep"]. The values can
> contain numeric values, so splitting at \d won't work. This is what
> makes it difficult:
>
> "3:ab23:cat5:sheep" => ["ab2", "cat", "sheep"]
>
> I need to grab the number, then read that many characters, then read the
> next number, etc.

"3:foo6:monkey5:sheep".scan(/(\d+)\:([^\d]+)/){|(num,str)|
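  # note: [^\d]+ stops at the first digit, so a value that itself
  # contains digits will not be captured whole
  if num.to_i == str.length
    # correct
  else
    # not correct
  end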
}

lopex

Logan Capaldo

7/12/2006 7:13:00 PM

On Jul 12, 2006, at 2:54 PM, dblack@wobblini.net wrote:

> Hi --
>
> On Thu, 13 Jul 2006, Daniel Schierbeck wrote:
>
>> My problem is that I have a string like this:
>> "3:foo6:monkey5:sheep", which I need to separate into ["foo",
>> "monkey", "sheep"]. The values can contain numeric values, so
>> splitting at \d won't work. This is what makes it difficult:
>>
>> "3:ab23:cat5:sheep" => ["ab2", "cat", "sheep"]
>>
>> I need to grab the number, then read that many characters, then
>> read the next number, etc.
>
> How do you know, in the case of:
>
> 2:ab23:cat
>
> which of the two is invalid?
>
>
> David
>

I don't believe you do. OTOH I'm not sure if you should care. If
one's invalid, the whole thing is invalid.

Anyway, here's my solution:

% cat n_colon_s.rb
require 'strscan'

def parse_n_colon_s(s)
  scanner = StringScanner.new(s)
  results = []
  until scanner.eos?   # eos? replaces the obsolete empty?
    if scanner.scan(/(\d+):/)
      n = scanner[1].to_i
      # read exactly n word characters, or give up
      raise 'Malformed String' unless (word = scanner.scan(/\w{#{n}}/))
      results << word
    else
      raise 'Malformed string'
    end
  end
  results
end

p parse_n_colon_s("2:ab3:cat")
p parse_n_colon_s("2:ab23:cat")

% ruby n_colon_s.rb
["ab", "cat"]
-:18:in `parse_n_colon_s': Malformed String (RuntimeError)
from -:28

> --
> http://www.rubypoweran... => Ruby/Rails training & consultancy
> http://www.manning... => RUBY FOR RAILS, the Ruby book for
> Rails developers
> http://dablog.r... => D[avid ]A[. ]B[lack's][ Web]log
> dblack@wobblini.net => me
>

Tom Werner

7/12/2006 7:20:00 PM

Daniel Schierbeck wrote:
> studlee2@gmail.com wrote:
>> Daniel,
>> When you grab the data it will be in a string format, so you need
>> to convert it to a number (most likely integer). Then you can compare
>> it with the size of the second value you grabbed. I would write it
>> like this (modify as needed):
>>
>> print "enter data"
>> a = gets
>> valid = /^(\d*):(\w*)/
>>
>> check = valid.match(a)
>>
>> if(check[1].to_i == check[2].size)
>>
>> <Your code here>
>>
>> end
>
> Thank you for your reply.
>
> My problem is that I have a string like this: "3:foo6:monkey5:sheep",
> which I need to separate into ["foo", "monkey", "sheep"]. The values
> can contain numeric values, so splitting at \d won't work. This is
> what makes it difficult:
>
> "3:ab23:cat5:sheep" => ["ab2", "cat", "sheep"]
>
> I need to grab the number, then read that many characters, then read
> the next number, etc.
>
>
> Cheers,
> Daniel
>
>

require 'strscan'

array = []
ss = StringScanner.new('3:ab23:cat5:sheep')

while !ss.eos?
  len = ss.scan(/\d+:/).chop.to_i
  array << ss.peek(len)
  ss.pos += len
end

p array

=> ["ab2", "cat", "sheep"]

Tom

--
Tom Werner
Helmets to Hardhats
Software Developer
tom@helmetstohardhats.org
www.helmetstohardhats.org

Gavin Kistner

7/12/2006 7:26:00 PM

require 'strscan'
s = StringScanner.new( "3:ab23:cat5:sheep" )
words = []
until s.eos?
  if digits = s.scan( /\d+/ )
    digits = digits.to_i
    s.pos += 1
    words << s.peek( digits )
    s.pos += digits
  else
    p words
    abort "Couldn't find digits for: #{s.rest}"
  end
end
p words
#=> ["ab2", "cat", "sheep"]

dblack

7/12/2006 7:28:00 PM

comp.lang.ruby
