seebs
4/30/2007 7:33:00 AM
In message <4635A730.20506@gmail.com>, Dan Zwell writes:
>That's true, but keep in mind that sort_by{rand} is code that's built
>into the interpreter, so it's still gonna be pretty fast. I tested this
>on my system a few weeks ago, and rerunning the benchmark program, the
>traditional shuffle only becomes faster when the arrays in question
>have over 5,000 elements. After all, it's a better algorithm, but it's
>written in Ruby and not C.
Hmm. On my system, the shuffle comes out ahead at only 250 elements.
Worth noting that the margin is very thin: 0.11 vs. 0.12 seconds of
user time, although, oddly, the system time is 0.05 vs. 0.15 seconds.
Might be some underlying quirk there. Still, 250 is a "large" set for
many programs. On the other hand, by, say, 20,000 elements, the factor
is 3:1 or more in the shuffle's favor.
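Roughly the kind of thing I'm timing, for reference. This is a sketch,
not the exact benchmark: the array sizes and the fisher_yates helper
are illustrative, and a real run would repeat each case rather than
time a single pass.

require 'benchmark'

# Classic O(n) Fisher-Yates shuffle, written in Ruby.
def fisher_yates(a)
  a = a.dup
  (a.size - 1).downto(1) do |i|
    j = rand(i + 1)          # random index in 0..i
    a[i], a[j] = a[j], a[i]
  end
  a
end

[250, 5_000, 20_000].each do |n|
  arr = (1..n).to_a
  Benchmark.bm(14) do |bm|
    bm.report("sort_by #{n}") { arr.sort_by { rand } }
    bm.report("shuffle #{n}") { fisher_yates(arr) }
  end
end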
>If the numbers that rand() generates are good enough, the chance of
>collisions preserving order should be the same as the chance of the next
>iteration of randomizing putting them back in the same order, shouldn't it?
I think not quite. Imagine a hypothetical pair of elements. In a single
shuffle they have a 1/bignum chance of a collision (drawing the same
random key), which preserves their relative order. If you shuffle twice,
the probability that the same pair collided both times is 1/bignum^2.
They might otherwise end up in the same order, or not (roughly a coin
flip either way), but there's only a 1/bignum^2 chance that they ended
up in the same order *due to a collision*.
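To put rough numbers on it, assuming rand's keys can take 2^53 distinct
Float values (an assumption about the PRNG's granularity, not something
Ruby guarantees):

prob = 2.0 ** -53
printf("one pass:   %g\n", prob)       # chance a given pair collides once
printf("two passes: %g\n", prob ** 2)  # chance the same pair collides twice
# Compare: two independent shuffles put any given pair in the same
# relative order with probability 1/2, collisions or no.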
-s