Thomas Leitner
10/19/2004 10:41:00 PM
On Mon, 18 Oct 2004 17:02:30 +0900
Mark Hubbart <discordantus@gmail.com> wrote:
| I defined the basic algorithm I was working on in my other email like
| this:
|
| > 1. break the range up into regexp friendly sections, like this:
| > (23..1024) =>
| > 23..29,
| > 30..99,
| > 100..999,
| > 1000..1019,
| > 1020..1024
| >
| > 2. convert each range into a string regexp:
| > 23..29 => "2[3-9]"
| > 30..99 => "[3-9]\\d"
| > 100..999 => "[1-9]\\d\\d"
| > 1000..1019 => "10[01]\\d"
| > 1020..1024 => "102[0-4]"
| >
| > 3. join them all together
| > /^0*(?:2[3-9]|[3-9]\d|[1-9]\d\d|10[01]\d|102[0-4])$/
|
I have had the same idea once my simple solution (using each integer in a range) failed with large ranges.
I assume the following:
- leading zeros are not accepted
- nothing is captured
- must match the whole line (anchored to start and end of line)
Output for this file on my machine:
[penguin 73] ~/work/rubyquiz > ruby 4_20041018_regexp.rb
Loaded suite 4_20041018_regexp
Started
.....
Finished in 0.021428 seconds.
5 tests, 108 assertions, 0 failures, 0 errors
[penguin 74] ~/work/rubyquiz >
So, better late than never, here is my code!
Thomas
ps. I tried to shorten the new Range methods as good as I could, can anything else be done??
------------------------------------------------------------------------
require 'test/unit/ui/console/testrunner'
class Integer
def to_rstr
"#{self}"
end
end
class Range
def get_regexps( a, b, negative = false )
arr = [a]
af = (a == 0 ? 1.0 : a.to_f)
bf = (b == 0 ? 1.0 : b.to_f)
1.upto( b.to_s.length-1 ) do |i|
pot = 10**i
num = (af/pot).ceil*(pot) # next higher number with i zeros
arr.insert( i, num ) if num < b
num = (bf/pot).floor*(pot) # next lower number with i zeros
arr.insert( -i, num )
end
arr.uniq!
arr.push( b+1 ) # +1 -> to handle it in the same way as the other elements
result = []
0.upto( arr.length - 2 ) do |i|
first = arr[i].to_s
second = (arr[i+1] - 1).to_s
str = ''
0.upto( first.length-1 ) do |j|
if first[j] == second[j]
str << first[j]
else
str << "[#{first[j].chr}-#{second[j].chr}]"
end
end
result << str
end
result = result.join('|')
result = "-(?:#{result})" if negative
result
end
def to_rstr
if first < 0 && last < 0
get_regexps( -last, -first, true )
elsif first < 0
get_regexps( 1, -first, true ) + "|" + get_regexps( 0, last )
else
get_regexps( first, last )
end
end
end
class Regexp
def self.build( *args )
Regexp.new("^(?:" + args.collect {|a| a.to_rstr}.flatten.uniq.join('|') + ")$" )
end
end
class RegexpTest < Test::Unit::TestCase
def rangeTest( first, last )
r = Regexp.build( first..last )
assert_match( r, "#{first}" )
assert_match( r, "#{(first + last)/2}" )
assert_match( r, "#{last}" )
assert_no_match( r, "#{first-1}" )
assert_no_match( r, "#{last+1}" )
end
def testBuild
lucky = Regexp.build( 3, 7 )
assert_match(lucky, "7")
assert_no_match(lucky, "13")
assert_match(lucky, "3")
rangeTest( 1, 12 )
month = Regexp.build( 1..12 )
assert_no_match(month, "0")
assert_match(month, "1")
assert_match(month, "12")
rangeTest( 1, 31 )
day = Regexp.build( 1..31 )
assert_match(day, "6")
assert_match(day, "16")
assert_no_match(day, "Tues")
rangeTest( 2000, 2005 )
year = Regexp.build( 98, 99, 2000..2005 )
assert_no_match(year, "04")
assert_match(year, "2004")
assert_match(year, "99")
rangeTest( 0, 1000000 )
num = Regexp.build( 0..1_000_000 )
assert_no_match(num, "-1")
end
def testPositive
rangeTest( 2, 10 )
rangeTest( 23432, 12312123 )
end
def testNegative
rangeTest( -10, -2 )
rangeTest( -15, 4 )
rangeTest( -100342, -343 )
end
def testOther
rangeTest( 5, 16 )
rangeTest( 10, 100 )
rangeTest( 11, 99 )
rangeTest( 1, 123456789 )
rangeTest( 10, 10 )
rangeTest( 5, 5 )
rangeTest( 0, 5 )
rangeTest( 0, 10 )
rangeTest( 1, 5 )
end
def testIllegal
num = Regexp.build( 1..12 )
assert_no_match( num, "012" )
assert_no_match( num, "A12" )
assert_no_match( num, "120" )
assert_no_match( num, "12A" )
assert_no_match( num, "3125" )
end
end