comp.lang.ruby

Detecting number ranges

Jay Levitt

9/26/2007 5:16:00 AM

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
....

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

My brain was in procedural mode, and I wrote the following ugliness:

---

def output_range(start_db, end_db, server_n)
  puts "server #{server_n} handles database #{start_db} through #{end_db}"
end

# Index = database number, value = server number
db_servers = []

open("db_list.txt").readlines.each do |line|
  m = line.match(/^database (.*) is on server (.*)/)
  db_n = m[1].to_i
  server_n = m[2].to_i

  db_servers[db_n] = server_n
end

range_start = 1

db_servers.each_index do |i|
  next if i <= 1  # slot 0 is unused; database 1 has no predecessor
  if db_servers[i] != db_servers[i - 1]
    output_range(range_start, i - 1, db_servers[i - 1])
    range_start = i
  end
end

last_db = db_servers.size - 1
output_range(range_start, last_db, db_servers[last_db])
---

which in fact is not only horrid to look at, but has a bug (it doesn't like
it if the input skips over a database number, i.e. db 233 and 235 are
present but 234 is missing).
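
A minimal sketch of why a skipped number trips up an array indexed by database number (the server number 4 is made up for illustration): the unassigned slot is nil, so the comparison with the neighbouring slots sees a change of server where there is only a gap.

```ruby
# Hypothetical data: databases 233 and 235 are on server 4; 234 is absent.
db_servers = []
db_servers[233] = 4
db_servers[235] = 4

gap  = db_servers[234]   # never assigned, so the slot holds nil
size = db_servers.size   # the array is padded with nil up to index 235

# Indexes that actually hold a server number
assigned = db_servers.each_index.select { |i| db_servers[i] }

p gap, size, assigned
```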

I feel like there's a much nicer way to express this in Ruby, but can't
think of what it might be...


--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://... | - Kristoffer
15 Answers

7stud 7stud

9/26/2007 7:49:00 AM

I'm not sure why data like this:

database 1 is on server 3
database 8 is on server 3

should produce:

server 3 handles databases 1 to 8

since server 3 only handles dbs 1 and 8.

Or, what to do if there is only one db for the server, say:

database 1 is on server 3

Should the output be:

server 3 handles database 1 to 1

Can there be duplicate lines like this:

database 1 is on server 3
database 8 is on server 7
database 1 is on server 3

Also, is the output supposed to be in ascending order by server?

In any case, here is the input I used:

database 8 is on server 7
database 10 is on server 7
database 10 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

and here is the output:

server 3 handles databases 1 to 133
server 7 handles databases 8 to 10
server 9 handles database 5
server 144 handles database 4


require "set"

def output_data(arr)
  arr.each do |elmt|
    min = elmt[1].min
    max = elmt[1].max

    if min == max
      puts "server #{elmt[0]} handles database #{min}"
    else
      puts "server #{elmt[0]} handles databases #{min} to #{max}"
    end
  end
end

servers = Hash.new{|hash, key| hash[key] = Set.new}

File.open("data.txt") do |file|
  file.each_line do |line|
    nums = line.scan(/\d+/)
    servers[nums[1].to_i].add(nums[0].to_i)
  end
end

data = servers.sort
output_data(data)



Jesús Gabriel y Galán

9/26/2007 8:19:00 AM

On 9/26/07, Jay Levitt <jay+news@jay.fm> wrote:
> I had to write a script this evening to take an unsorted input file of the
> form:
>
> database 1 is on server 3
> database 8 is on server 7
> ...
>
> and output it in the form:
>
> server 3 handles database 1 through 7
> server 7 handles database 8 through 11
>
> I feel like there's a much nicer way to express this in Ruby, but can't
> think of what it might be...

Does this feel nicer?

$ cat db_ranges.rb && ruby db_ranges.rb
# db_ranges.rb
# 26 September 2007
#

require 'enumerator'

file =<<END
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144
END


dbs = Hash.new {|h,k| h[k] = []}

file.each_line do |line|
  m = line.match(/^database (.*) is on server (.*)/)
  db_n = m[1].to_i
  server_n = m[2].to_i
  dbs[server_n] << db_n
end

dbs.each do |server, db|
  sorted = db.sort
  result = [[sorted.first]]  # seed with the smallest database number
  sorted.each_cons(2) {|x, y| if y == x + 1 then result.last << y; else result << [y]; end}
  result.each {|x| puts "Server #{server} handles databases #{x.first} to #{x.last}"}
end

Server 144 handles databases 4 to 4
Server 7 handles databases 8 to 10
Server 3 handles databases 1 to 2
Server 3 handles databases 133 to 133
Server 9 handles databases 5 to 5

Kind regards,

Jesus.

Harry Kakueki

9/26/2007 8:57:00 AM

On 9/26/07, Jay Levitt <jay+news@jay.fm> wrote:
> I had to write a script this evening to take an unsorted input file of the
> form:
>
> database 1 is on server 3
> database 8 is on server 7
> ...
>
> and output it in the form:
>
> server 3 handles database 1 through 7
> server 7 handles database 8 through 11
>
>

Does this do it?


fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3",
"db 12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]

nums,info = [],{}
fp_arr.each {|x| nums << x.scan(/\d+/)}
sv = nums.map{|x| x[1]}.uniq
# sort_by with to_i orders the database numbers numerically ("9" before "10")
sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort_by{|a| a[0].to_i}.map {|a| a[0]}}
info.keys.sort.each {|j| print "server #{j} handles database #{info[j][0]} through #{info[j][-1]}\n"}

#server 2 handles database 18 through 18
#server 3 handles database 11 through 13
#server 7 handles database 8 through 9

Harry

--
A Look into Japanese Ruby List in English
http://www.ka...

William James

9/26/2007 9:57:00 PM

On Sep 26, 12:16 am, Jay Levitt <jay+n...@jay.fm> wrote:
> I had to write a script this evening to take an unsorted input file of the
> form:
>
> database 1 is on server 3
> database 8 is on server 7
> ...
>
> and output it in the form:
>
> server 3 handles database 1 through 7
> server 7 handles database 8 through 11
>
> My brain was in procedural mode, and I wrote the following ugliness:
>
> ---
>
> def output_range(start_db, end_db, server_n)
> puts "server #{server_n} handles database #{start_db} through #{end_db}"
> end
>
> open("db_list.txt").readlines.each do |line|
> m = line.match(/^database (.*) is on server (.*)/)
> db_n = m[1].to_i
> server_n = m[2].to_i
>
> db_servers[db_n] = server_n
> end
>
> range_start = 1
>
> db_servers.each_index do |i|
> next if i == 1
> if db_servers[i] != db_servers[i - 1]
> output_range(range_start, i - 1, db_servers[i - 1])
> range_start = i
> end
> end
>
> last_db = db_servers.size - 1
> output_range(range_start, last_db, db_servers[last_db])
> ---
>
> which in fact is not only horrid to look at, but has a bug (it doesn't like
> it if the input skips over a database number, i.e. db 233 and 235 are
> present but 234 is missing).

I'm not in the mood for hash.

ary = DATA.readlines
ary.map{|s| s[/\d+$/] }.uniq.each{|server|
db = ary.grep(/ #{server}$/).map{|s|
s[/\d+/].to_i}.sort
puts "Server #{server} databases #{db[0]} to #{db[-1]}"
}

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144


Jay Levitt

9/27/2007 4:36:00 PM

On Wed, 26 Sep 2007 16:49:09 +0900, 7stud -- wrote:

> I'm not sure why data like this:
>
> database 1 is on server 3
> database 8 is on server 3
>
> should produce:
>
> server 3 handles databases 1 to 8
>
> since server 3 only hands db's 1 and 8.

Sorry, I was being a lazy typist. (In my defense, I only said the input
was "in the form!") Assume that, for the example I gave, the output is
correct and the input has more data than I showed.

>
> Or, what to do if there is only one db for the server, say:
>
> database 1 is on server 3
>
> Should the output be:
>
> server 3 handles database 1 to 1

Yep, exactly.

>
> Can there be duplicate lines like this:
>
> database 1 is on server 3
> database 8 is on server 7
> database 1 is on server 3

Nope, never. The actual input is akin to a list of symbolic NFS links,
something like:

/prod_dir/db1.file -> /mounts/server3/db1.file

So db1.file can only ever point to one place, and can only ever show up in
the input once.

For that same reason, the input's initially unsorted - it's *mostly*
sorted, but alphanumerically (db1, db11, db12, ... db199, db2, db21, etc.)
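
To illustrate that ordering, here is a small sketch; the file names are hypothetical, modelled on the db1.file example above. A plain string sort keeps the alphanumeric order, while sorting on the embedded number gives the numeric order that range detection needs.

```ruby
# Hypothetical file names in the style described above.
names = %w[db1.file db11.file db12.file db199.file db2.file db21.file]

lexical = names.sort                                 # compares character by character
numeric = names.sort_by { |name| name[/\d+/].to_i }  # compares the embedded number

puts lexical.join(" ")
puts numeric.join(" ")
```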

> Also, is the output supposed to be in ascending order by server?

Nope, not necessary. Ascending by DB number probably makes it easier to
manually edit or check, though.

> In any case, here is the input I used:
>
> database 8 is on server 7
> database 10 is on server 7
> database 10 is on server 7
> database 5 is on server 9
> database 1 is on server 3
> database 2 is on server 3
> database 133 is on server 3
> database 4 is on server 144
>
> and here is the output:
>
> server 3 handles databases 1 to 133
> server 7 handles databases 8 to 10
> server 9 handles database 5
> server 144 handles database 4

Hmm, looks like it assumes the whole range if there are missing elements.
(I guess I didn't specify that behavior, did I?) If a database isn't
listed in the input, it shouldn't be included in ranges in the output.

So the actual output from the above input should be (in any order):

server 3 handles databases 1 to 2
server 3 handles databases 133 to 133
server 7 handles databases 8 to 8
server 7 handles databases 10 to 10
server 9 handles databases 5 to 5
server 144 handles databases 4 to 4

I -think- the "missing database" problem is endemic to the Set approach,
right?
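
For reference, a sketch of one way to get exactly that output. It is not code from any poster in this thread: it assumes a Ruby new enough to have Enumerable#slice_when (2.2+) and squiggly heredocs (2.3+), and the names dbs_by_server and ranges are made up for illustration. Each server's database numbers are de-duplicated, sorted, and split wherever the next number is not one more than the previous.

```ruby
# Sample input from earlier in the thread (note the duplicate line for db 10).
input = <<~END
  database 8 is on server 7
  database 10 is on server 7
  database 10 is on server 7
  database 5 is on server 9
  database 1 is on server 3
  database 2 is on server 3
  database 133 is on server 3
  database 4 is on server 144
END

# server number => list of database numbers
dbs_by_server = Hash.new { |hash, server| hash[server] = [] }
input.each_line do |line|
  db, server = line.scan(/\d+/).map(&:to_i)
  next unless db && server  # skip lines that do not contain two numbers
  dbs_by_server[server] << db
end

# Split each server's sorted, de-duplicated databases wherever the
# next number is not exactly one more than the previous one.
ranges = dbs_by_server.sort.flat_map do |server, dbs|
  dbs.uniq.sort.slice_when { |a, b| b != a + 1 }.map do |run|
    [server, run.first, run.last]
  end
end

ranges.each do |server, first, last|
  puts "server #{server} handles databases #{first} to #{last}"
end
```

The min/max version cannot produce this because it only keeps the two endpoints per server; any approach has to look at neighbouring numbers to notice a gap.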

Jay


>
> require "set"
>
> def output_data(arr)
> arr.each do |elmt|
> min = elmt[1].min
> max = elmt[1].max
>
> if min == max
> puts "server #{elmt[0]} handles database #{min}"
> else
> puts "server #{elmt[0]} handles databases #{min} to #{max}"
> end
> end
> end
>
> servers = Hash.new{|hash, key| hash[key] = Set.new}
>
> File.open("data.txt") do |file|
> file.each_line do |line|
> nums = line.scan(/\d+/)
> servers[nums[1].to_i].add(nums[0].to_i)
> end
> end
>
> data = servers.sort
> output_data(data)



Jay Levitt

9/27/2007 4:45:00 PM

On Wed, 26 Sep 2007 17:18:58 +0900, Jesús Gabriel y Galán wrote:

> db.sort.each_cons(2) {|x, y| if y == x + 1 then result.last << y;
> else result << [y]; end}

Nice! I think this is the magic I was looking for.


Jay Levitt

9/27/2007 4:51:00 PM

On Wed, 26 Sep 2007 17:57:16 +0900, Harry Kakueki wrote:

> fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
> 12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]
>
> nums,info = [],{}
> fp_arr.each {|x| nums << x.scan(/\d+/)}
> sv = nums.map{|x| x[1]}.uniq
> sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
> info.keys.sort.each {|j| print "server #{j} handles database
> #{info[j][0]} through #{info[j][-1]}\n"}
>
> #server 2 handles database 18 through 18
> #server 3 handles database 11 through 13
> #server 7 handles database 8 through 9

Hmm, that fails if I add "db 20 is on sv 2", which should retain "server 2
handles db 18 through 18" but also add "server 2 handles db 20 through 20".
I can't quite follow what it's doing but I'm going to have to study that
more; map is a powerful tool in general.


Jay Levitt

9/27/2007 5:00:00 PM

On Wed, 26 Sep 2007 14:57:04 -0700, William James wrote:

> I'm not in the mood for hash.
>
> ary = DATA.readlines
> ary.map{|s| s[/\d+$/] }.uniq.each{|server|
> db = ary.grep(/ #{server}$/).map{|s|
> s[/\d+/].to_i}.sort
> puts "Server #{server} databases #{db[0]} to #{db[-1]}"
> }

Definitely wins the "shortest" prize! But, um, no output:

#=> Server databases to

Looks like .each is called only once, with |server| == nil.

Weirder, when I just run:

ary = open("fake_data").readlines
ary.map{|s| s[/\d+$/] }.inspect
return

I get:

#=> wjames.rb:4: unexpected return (LocalJumpError)

I would have thought that ary.map{|s| s[/\d+$/] } should return an array
itself. Although I'm not clear what that line does - doesn't it look for a
1+ digit number n, and return the nth character of the line?


William James

9/27/2007 5:27:00 PM

On Sep 27, 12:00 pm, Jay Levitt <jay+n...@jay.fm> wrote:
> On Wed, 26 Sep 2007 14:57:04 -0700, William James wrote:
> > I'm not in the mood for hash.
>
> > ary = DATA.readlines
> > ary.map{|s| s[/\d+$/] }.uniq.each{|server|
> > db = ary.grep(/ #{server}$/).map{|s|
> > s[/\d+/].to_i}.sort
> > puts "Server #{server} databases #{db[0]} to #{db[-1]}"
> > }
>
> Definitely wins the "shortest" prize! But, um, no output:
>
> #=> Server databases to

I just copied, pasted, and ran the code. Output:
Server 7 databases 8 to 10
Server 9 databases 5 to 5
Server 3 databases 1 to 133
Server 144 databases 4 to 4

Change
ary = DATA.readlines
to
ary = DATA.readlines ; p ary
and see what you get.
Your lack of output could be caused by an added space at the end
of each data line.
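
A quick sketch of that effect, using a made-up line with a stray space before the newline: the `$` anchor only matches at the end of a line, so digits followed by a space are not found and the lookup returns nil. Stripping the line first is one way around it.

```ruby
clean    = "database 8 is on server 7\n"
trailing = "database 8 is on server 7 \n"  # hypothetical line with a stray space

clean_match    = clean[/\d+$/]           # digits sit right before the newline
trailing_match = trailing[/\d+$/]        # a space follows the digits, so no match
stripped_match = trailing.strip[/\d+$/]  # strip removes the trailing whitespace first

p clean_match, trailing_match, stripped_match
```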

>
> Looks like .each is called only once, with |server| == nil.
>
> Weirder, when I just run:
>
> ary = open("fake_data").readlines
> ary.map{|s| s[/\d+$/] }.inspect
> return
>
> I get:
>
> #=> wjames.rb:4: unexpected return (LocalJumpError)
>
> I would have thought that ary.map{|s| s[/\d+$/] } should return an array
> itself. Although I'm not clear what that line does - doesn't it look for a
> 1+ digit number n, and return the nth character of the line?

irb(main):001:0> "is it foo bar again?"[ /.oo \w+/ ]
=> "foo bar"

s[ /\d+$/ ] returns a sequence of digits at the end of the string (or
just before "\n" in the string).

Extra spaces won't bother this version:

ary = DATA.readlines.map{|s| s.scan(/\d+/).map{|s| s.to_i}}
p ary
ary.map{|a| a[1] }.uniq.each{|server|
db = ary.select{|a| a[1]==server}.map{|a| a[0]}.sort
puts "Server #{server} databases #{db[0]} to #{db[-1]}"
}

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

Harry Kakueki

9/27/2007 10:14:00 PM

On 9/28/07, Jay Levitt <jay+news@jay.fm> wrote:
> On Wed, 26 Sep 2007 17:57:16 +0900, Harry Kakueki wrote:
>
> > fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
> > 12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]
> >
> > nums,info = [],{}
> > fp_arr.each {|x| nums << x.scan(/\d+/)}
> > sv = nums.map{|x| x[1]}.uniq
> > sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
> > info.keys.sort.each {|j| print "server #{j} handles database
> > #{info[j][0]} through #{info[j][-1]}\n"}
> >
> > #server 2 handles database 18 through 18
> > #server 3 handles database 11 through 13
> > #server 7 handles database 8 through 9
>
> Hmm, that fails if I add "db 20 is on sv 2", which should retain "server 2
> handles db 18 through 18" but also add "server 2 handles db 20 through 20".
> I can't quite follow what it's doing but I'm going to have to study that
> more; map is a powerful tool in general.
>

In that case, it outputs "server 2 handles database 18 to 20".
I thought that was what you wanted, but after seeing your further
explanation, I see that it is not.
Here is the same code with the comments I probably should have included
the first time.


fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3",
"db 12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2",
"db 20 is on sv 2"]

nums,info = [],{}
fp_arr.each {|x| nums << x.scan(/\d+/)}
#p nums #[["11", "3"], ["9", "7"], ["13", "3"], ["12", "3"]....]

sv = nums.map{|x| x[1]}.uniq #["3", "7", "2"] server list
# sort_by with to_i orders the database numbers numerically ("9" before "10")
sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort_by{|a| a[0].to_i}.map {|a| a[0]}}
#p info #{"7"=>["8", "9"], "2"=>["18", "20"], "3"=>["11", "12", "13"]}

info.keys.sort.each do |j|
print "server #{j} handles database #{info[j][0]} to #{info[j][-1]}\n"
end


#server 2 handles database 18 to 20
#server 3 handles database 11 to 13
#server 7 handles database 8 to 9


Harry
