[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

[ANN] rand.rb 0.9: Random access methods for Enumerables

Ilmari Heikkinen

12/15/2004 12:25:00 AM

Hello all, here's a little convenience library we whipped up a couple
weeks back with Christian Neukirchen. It's very simple, but we hope
someone might find it useful :)
---

rand.rb is a library for picking random elements and shuffling.
This work is licensed under the same terms as Ruby itself.

Overview
========
rand.rb adds new methods to Enumerable, Array, and Hash to:

* return a random element (pick, pick_index, pick_key, pick_value
and their destructive versions suffixed with !).
* arrange elements in new, random order (shuffle,
shuffle_hash_pairs, shuffle_hash).
* use above methods in convenient ways (each_random, map_random).

It also provides these new facilities to String:

* shuffle_chars, to arrange the characters of the string in new
order.
* pick_byte, pick_char and pick_index, to return random bytes,
characters or elements.

Examples:

arr = (1..10).to_a
arr.pick # => 3
arr.pick # => 7
arr.shuffle # => [3,5,2,10,8,7,6,1,9,4]

str = "string strung strong rang"
str.pick_char # => 'n'
str.pick_byte # => 97
str.shuffle_chars # => "srsntnignrgrgttsuan r og"

hash = {'a'=>2, 'b'=>3, 'c'=>4}
hash.pick! #=> ['c', 4]
hash #=> {'a'=>2, 'b'=>3}
hash.pick_key #=> 'b'
hash.pick_value #=> 2

File.open("foo"){|f|
f.each_random{|line| puts line } # about equal to
readlines.shuffle.each{...}
}

History
=======
* November 26, 2004: Initial version done as IRC collaboration
project between kig and chris2.
* November 29, 2004: First public release 0.9.

Getting it
==========
rand.rb is packaged in RPA.
rpa install rand

You can download rand.rb at:
http://kronavita.de/chris...-0....

Upstream source:
You can get latest development releases using darcs by executing:

darcs get http://kronavita.de/chris...

darcs send is the preferred way to send patches, but mailing diffs is
fine too.

Authors
=======
* Ilmari Heikkinen: Code and unit tests
* Christian Neukirchen: Documentation and housekeeping.

Please mail bugs, feature requests or patches to the mail addresses
above or use IRC to contact the developers.

Copyright
=========
Copyright 2004 Ilmari Heikkinen < kig misfiring net >
Parts Copyright 2004 Christian Neukirchen < chneukirchen gmail com >

P.S. I guess it's bad mojo to start versions from 0.9, but it _is_ very
simple



17 Answers

Ralf Müller

12/15/2004 7:39:00 AM

0

Am Mittwoch, 15. Dezember 2004 01:24 schrieb Ilmari Heikkinen:
> Hello all, here's a little convenience library we whipped up a couple
> weeks back with Christian Neukirchen. It's very simple, but we hope
> someone might find it useful :)
> ---
>
> rand.rb is a library for picking random elements and shuffling.
> This work is licensed under the same terms as Ruby itself.
>
> Overview
> ========
> rand.rb adds new methods to Enumerable, Array, and Hash to:
>
> * return a random element (pick, pick_index, pick_key, pick_value
> and their destructive versions suffixed with !).
> * arrange elements in new, random order (shuffle,
> shuffle_hash_pairs, shuffle_hash).
> * use above methods in convenient ways (each_random, map_random).
>
> It also provides these new facilities to String:
>
> * shuffle_chars, to arrange the characters of the string in new
> order.
> * pick_byte, pick_char and pick_index, to return random bytes,
> characters or elements.
>
> Examples:
>
> arr = (1..10).to_a
> arr.pick # => 3
> arr.pick # => 7
> arr.shuffle # => [3,5,2,10,8,7,6,1,9,4]
>
> str = "string strung strong rang"
> str.pick_char # => 'n'
> str.pick_byte # => 97
> str.shuffle_chars # => "srsntnignrgrgttsuan r og"
>
> hash = {'a'=>2, 'b'=>3, 'c'=>4}
> hash.pick! #=> ['c', 4]
> hash #=> {'a'=>2, 'b'=>3}
> hash.pick_key #=> 'b'
> hash.pick_value #=> 2
>
> File.open("foo"){|f|
> f.each_random{|line| puts line } # about equal to
> readlines.shuffle.each{...}
> }
>
> History
> =======
> * November 26, 2004: Initial version done as IRC collaboration
> project between kig and chris2.
> * November 29, 2004: First public release 0.9.
>
> Getting it
> ==========
> rand.rb is packaged in RPA.
> rpa install rand
>
> You can download rand.rb at:
> http://kronavita.de/chris...-0....
>
> Upstream source:
> You can get latest development releases using darcs by executing:
>
> darcs get http://kronavita.de/chris...
>
> darcs send is the preferred way to send patches, but mailing diffs is
> fine too.
>
> Authors
> =======
> * Ilmari Heikkinen: Code and unit tests
> * Christian Neukirchen: Documentation and housekeeping.
>
> Please mail bugs, feature requests or patches to the mail addresses
> above or use IRC to contact the developers.
>
> Copyright
> =========
> Copyright 2004 Ilmari Heikkinen < kig misfiring net >
> Parts Copyright 2004 Christian Neukirchen < chneukirchen gmail com >
>
> P.S. I guess it's bad mojo to start versions from 0.9, but it _is_ very
> simple

Hi Ilmari,
its a bit OT, but I'd like to know, how you test the methods above.
How could you know, there is as much randomness as you want?

regards
ralf



Neil Spring

12/15/2004 5:06:00 PM

0

On Dec 14, 2004, at 7:24 PM, Ilmari Heikkinen wrote:
> rand.rb is a library for picking random elements and shuffling.
> This work is licensed under the same terms as Ruby itself.

umm, could you maybe replace:

def shuffle!
sort!{rand <=> 0.5}
end

with:

def shuffle!
each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
self[j]}
self
end

I would not rely on sort to do the right thing, and it is less
efficient than necessary. I use the second shuffle code extensively in
my code which deals with shuffling lots and lots of elements.
Presumably the second is O(n) where the sort based scheme is at best
O(n log n). A little googling suggests this is the "Fisher-Yates
shuffle."

-neil





Ilmari Heikkinen

12/15/2004 5:22:00 PM

0

Hi,

On 15.12.2004, at 09:38, Ralf Müller wrote:
>
> Hi Ilmari,
> its a bit OT, but I'd like to know, how you test the methods above.
> How could you know, there is as much randomness as you want?

Short answer: I don't. Bad, yes.

Long answer:
The only probabilistic test is for #shuffle, which tries 10 times to
get a shuffled array that != original array. The rest trust that
Kernel#rand provides random numbers and compare the picked values to
the original array to check that they all are from the original array.

I guess one way to test the randomness would be to collect an array
from the picked values that's much larger than the original. Then check
the distribution by checking that the entry counts normalized to array
length (amt of item / array.length) are close enough to the original
array's normalized counts. And check the randomness of the order by
gzipping a long randomized string and checking that it compresses
badly.

-Ilmari



Brian Schröder

12/15/2004 5:27:00 PM

0

On Thu, 16 Dec 2004 02:05:42 +0900
Neil Spring <nspring@cs.washington.edu> wrote:

> On Dec 14, 2004, at 7:24 PM, Ilmari Heikkinen wrote:
> > rand.rb is a library for picking random elements and shuffling.
> > This work is licensed under the same terms as Ruby itself.
>
> umm, could you maybe replace:
>
> def shuffle!
> sort!{rand <=> 0.5}
> end
>
> with:
>
> def shuffle!
> each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
> self[j]}
> self
> end
>
> I would not rely on sort to do the right thing, and it is less
> efficient than necessary. I use the second shuffle code extensively in
> my code which deals with shuffling lots and lots of elements.
> Presumably the second is O(n) where the sort based scheme is at best
> O(n log n). A little googling suggests this is the "Fisher-Yates
> shuffle."
>

I don't know if the first proposed solution is valid, because it depends on how
sort is implemented and if the sort algorithm makes each comparision only once.
Otherwise I imagine weird things happening, if elements change their order
while sorting.

So the correct way to do this is

sort_by{rand}

Fisher Yates is a nice, unbiased O(n) shuffler. I personally more often use the
sort shuffler, because it is so amazingly wordy and I can remember it.

Regards,

Brian

--
Brian Schröder
http://www.brian-sch...



Ilmari Heikkinen

12/15/2004 6:31:00 PM

0


On 15.12.2004, at 19:05, Neil Spring wrote:
> def shuffle!
> each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
> self[j]}
> self
> end
>
> I would not rely on sort to do the right thing, and it is less
> efficient than necessary. I use the second shuffle code extensively
> in my code which deals with shuffling lots and lots of elements.
> Presumably the second is O(n) where the sort based scheme is at best
> O(n log n). A little googling suggests this is the "Fisher-Yates
> shuffle."

Thank you, changing #shuffle! to that and #shuffle to dup.shuffle!

-Ilmari



T. Onoma

12/15/2004 6:38:00 PM

0

I wonder. Do others see #pick in relation to #pop and #push as I do? Might
#pick be destructive instead of #pick! and using something like Galvin's
#rand for non-destructive?

Thoughts?
T.


On Tuesday 14 December 2004 07:24 pm, Ilmari Heikkinen wrote:
| Hello all, here's a little convenience library we whipped up a couple
| weeks back with Christian Neukirchen. It's very simple, but we hope
| someone might find it useful :)
| ---
|
| rand.rb is a library for picking random elements and shuffling.
| This work is licensed under the same terms as Ruby itself.
|
| Overview
| ========
| rand.rb adds new methods to Enumerable, Array, and Hash to:
|
| * return a random element (pick, pick_index, pick_key, pick_value
| and their destructive versions suffixed with !).
| * arrange elements in new, random order (shuffle,
| shuffle_hash_pairs, shuffle_hash).
| * use above methods in convenient ways (each_random, map_random).
|
| It also provides these new facilities to String:
|
| * shuffle_chars, to arrange the characters of the string in new
| order.
| * pick_byte, pick_char and pick_index, to return random bytes,
| characters or elements.
|
| Examples:
|
| arr = (1..10).to_a
| arr.pick # => 3
| arr.pick # => 7
| arr.shuffle # => [3,5,2,10,8,7,6,1,9,4]
|
| str = "string strung strong rang"
| str.pick_char # => 'n'
| str.pick_byte # => 97
| str.shuffle_chars # => "srsntnignrgrgttsuan r og"
|
| hash = {'a'=>2, 'b'=>3, 'c'=>4}
| hash.pick! #=> ['c', 4]
| hash #=> {'a'=>2, 'b'=>3}
| hash.pick_key #=> 'b'
| hash.pick_value #=> 2
|
| File.open("foo"){|f|
| f.each_random{|line| puts line } # about equal to
| readlines.shuffle.each{...}
| }
|
| History
| =======
| * November 26, 2004: Initial version done as IRC collaboration
| project between kig and chris2.
| * November 29, 2004: First public release 0.9.
|
| Getting it
| ==========
| rand.rb is packaged in RPA.
| rpa install rand
|
| You can download rand.rb at:
| http://kronavita.de/chris...-0....
|
| Upstream source:
| You can get latest development releases using darcs by executing:
|
| darcs get http://kronavita.de/chris...
|
| darcs send is the preferred way to send patches, but mailing diffs is
| fine too.
|
| Authors
| =======
| * Ilmari Heikkinen: Code and unit tests
| * Christian Neukirchen: Documentation and housekeeping.
|
| Please mail bugs, feature requests or patches to the mail addresses
| above or use IRC to contact the developers.
|
| Copyright
| =========
| Copyright 2004 Ilmari Heikkinen < kig misfiring net >
| Parts Copyright 2004 Christian Neukirchen < chneukirchen gmail com >
|
| P.S. I guess it's bad mojo to start versions from 0.9, but it _is_ very
| simple

--
( o _ ���
// trans.
/ \ transami@runbox.com

ruby -rdrb -e
'DRb.start_service;duck=DRbObject.new(nil,"druby://whytheluckystiff.net:6503");puts
duck.toms'

I don't give a damn for a man that can only spell a word one way.
-Mark Twain

[8,16,20,29,78,65,2,14,26,12,12,28,71,114,12,13,12,82,72,21,17,4,10,2,95].
each_with_index{|x,i| $><<(x^'Begin landing your troops'[i]).chr}
-Tadayoshi Funaba



Maarten Boonen

12/15/2004 10:12:00 PM

0

Neil Spring schreef:
> On Dec 14, 2004, at 7:24 PM, Ilmari Heikkinen wrote:
>
>> rand.rb is a library for picking random elements and shuffling.
>> This work is licensed under the same terms as Ruby itself.
>
>
> umm, could you maybe replace:
>
> def shuffle!
> sort!{rand <=> 0.5}
> end
>
> with:
>
> def shuffle!
> each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
> self[j]}
> self
> end

I made this a function a few days ago and it seems to be a lot faster.
Not sure if it's completely correct though.

def shuffle!
a = 0
while(a<length)
b = (length*Kernel.rand).to_i
tmp = self[a]
self[a] = self[b]
self[b] = tmp
a+=1
end
self
end

T. Onoma

12/16/2004 4:50:00 AM

0

On Wednesday 15 December 2004 05:12 pm, Maarten Boonen wrote:
| Neil Spring schreef:
| > On Dec 14, 2004, at 7:24 PM, Ilmari Heikkinen wrote:
| >> rand.rb is a library for picking random elements and shuffling.
| >> This work is licensed under the same terms as Ruby itself.
| >
| > umm, could you maybe replace:
| >
| > def shuffle!
| > sort!{rand <=> 0.5}
| > end
| >
| > with:
| >
| > def shuffle!
| > each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
| > self[j]}
| > self
| > end
|
| I made this a function a few days ago and it seems to be a lot faster.
| Not sure if it's completely correct though.
|
| def shuffle!
| a = 0
| while(a<length)
| b = (length*Kernel.rand).to_i
| tmp = self[a]
| self[a] = self[b]
| self[b] = tmp
| a+=1
| end
| self
| end

Well, lets see... a couple of equivalent transformations and:

def shuffle!
ln = length
ln.times do |a|
b = Kernel.rand(ln)
self[a], self[b] = self[b], self[a]
end
self
end

If you look closely, it is almost the very same thing. The real difference
lies in the the subtraction on the index being fed into #rand. I think this
is to improve the randomness a bit.

How much faster does it run?

T.







nobu.nokada

12/16/2004 5:09:00 AM

0

Hi,

At Thu, 16 Dec 2004 07:12:15 +0900,
Maarten Boonen wrote in [ruby-talk:123730]:
> I made this a function a few days ago and it seems to be a lot faster.
> Not sure if it's completely correct though.
>
> def shuffle!
> a = 0
> while(a<length)
> b = (length*Kernel.rand).to_i
> tmp = self[a]
> self[a] = self[b]
> self[b] = tmp
> a+=1
> end
> self
> end

It wouldn't be uniform. An element which was close to the head
will tend to be placed close to the tail.

--
Nobu Nakada


Maarten Boonen

12/16/2004 8:26:00 AM

0

trans. (T. Onoma) schreef:
> On Wednesday 15 December 2004 05:12 pm, Maarten Boonen wrote:
> | Neil Spring schreef:
> | > On Dec 14, 2004, at 7:24 PM, Ilmari Heikkinen wrote:
> | >> rand.rb is a library for picking random elements and shuffling.
> | >> This work is licensed under the same terms as Ruby itself.
> | >
> | > umm, could you maybe replace:
> | >
> | > def shuffle!
> | > sort!{rand <=> 0.5}
> | > end
> | >
> | > with:
> | >
> | > def shuffle!
> | > each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
> | > self[j]}
> | > self
> | > end
> |
> | I made this a function a few days ago and it seems to be a lot faster.
> | Not sure if it's completely correct though.
> |
> | def shuffle!
> | a = 0
> | while(a<length)
> | b = (length*Kernel.rand).to_i
> | tmp = self[a]
> | self[a] = self[b]
> | self[b] = tmp
> | a+=1
> | end
> | self
> | end
>
> Well, lets see... a couple of equivalent transformations and:
>
> def shuffle!
> ln = length
> ln.times do |a|
> b = Kernel.rand(ln)
> self[a], self[b] = self[b], self[a]
> end
> self
> end
>
> If you look closely, it is almost the very same thing. The real difference
> lies in the the subtraction on the index being fed into #rand. I think this
> is to improve the randomness a bit.
>
> How much faster does it run?
>
> T.

Just a very small test: (code below)

-> third is your equivalent code, self[a], self[b] = self[b], self[a]
seems to slow it down quite a lot.

>ruby test3.rb
1.272
0.571
1.042
>Exit code: 0
>ruby test3.rb
1.262
0.601
1.042
>Exit code: 0
>ruby test3.rb
1.302
0.551
1.041
>Exit code: 0






class Array

def shuffle!
each_index {|j| i = rand(size-j); self[j], self[j+i] = self[j+i],
self[j]}
self
end

def shuffle2!
a = 0
while(a<length)
b = (length*Kernel.rand).to_i
tmp = self[a]
self[a] = self[b]
self[b] = tmp
a+=1
end
self
end

def shuffle3!
ln = length
ln.times do |a|
b = Kernel.rand(ln)
self[a], self[b] = self[b], self[a]
end
self
end

end


arr = (1..100000).to_a

a=Time.now
arr.shuffle!
puts Time.now-a

a=Time.now
arr.shuffle2!
puts Time.now-a

a=Time.now
arr.shuffle3!
puts Time.now-a