Clifford Heath
2/5/2008 11:03:00 PM
Tim Pease wrote:
> Should this output the same integer value on all platforms where Ruby
> can run?
Perhaps, but if you read on, you'll see why you should never rely
on it.
> It appears not! So, any suggestions on generating an ID number for an
> object that is unique yet consistent across different platforms? I'd
> like to have some method that I could call on an object that would
> return a reproducible value that would uniquely identify that object.
That's not possible. There is more entropy in an arbitrary object than
can be represented in a Fixnum. Basic coding theory stuff. If it were
possible, then you could code all the data in all the databases in the
world into a single Fixnum :-).
If you want a fixed-length code that's sufficiently likely to be unique
that you can be almost certain that you'll never see a false duplicate,
you need to use a cryptographic hash function. I recommend SHA-256, but
you might survive with a weaker one like MD5 or SHA-1. They take a lot
more work to calculate than is justified for Ruby's hash keys though!
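A rough sketch of what that could look like in Ruby. The helper name
stable_id and the length-prefixed serialization are my own assumptions
for illustration, not a standard API; only Digest::SHA256 comes from
Ruby's standard library. You still have to decide for yourself which
fields define an object's identity.

```ruby
require 'digest'

# Hypothetical helper: derive a fixed-length ID from the fields that
# define an object's identity. The same fields give the same ID on any
# platform, as long as each field's to_s output is itself stable.
def stable_id(*fields)
  # Length-prefix each field so that ["ab", "c"] and ["a", "bc"]
  # do not serialize to the same string.
  canonical = fields.map { |f| s = f.to_s; "#{s.bytesize}:#{s}" }.join
  Digest::SHA256.hexdigest(canonical)
end

id = stable_id("Person", "Tim", 42)
puts id.length   # 64 hex characters, i.e. 256 bits
```

Avoid feeding it anything whose string form embeds an address or an
object_id (the default Object#inspect does), or you are back to where
you started.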
With these functions, the probability of a population containing a false
duplicate is roughly 50% when the population contains about sqrt(2^N)
(that is, 2^(N/2)) distinct items, where N is the number of bits in the
checksum. For SHA-256, that means you need about 2^128 items before you have
a reasonable chance of a collision. All of the programs you'll ever write,
running for your entire life, will only create a tiny fraction of this
many objects, so the chance of you ever seeing a collision is tiny.
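If you want to put numbers on that, here is a small sketch using the
usual birthday approximation p ~ 1 - exp(-n^2 / 2^(N+1)). The method
name collision_probability is made up for this example, and the formula
assumes the hash output is uniformly distributed.

```ruby
# Approximate probability that n distinct items contain at least one
# collision under an ideal N-bit hash (birthday approximation).
def collision_probability(items, bits)
  x = items.to_f ** 2 / 2.0 ** (bits + 1)
  # For tiny x, 1 - exp(-x) rounds to 0.0 in floating point,
  # so fall back to the first-order term, which is x itself.
  x < 1e-8 ? x : 1.0 - Math.exp(-x)
end

puts collision_probability(2**128, 256)   # at exactly 2^(N/2) items
puts collision_probability(10**12, 256)   # a trillion objects
```

Under this approximation the figure at exactly 2^(N/2) items is nearer
39%, and 50% is reached at about 1.18 * 2^(N/2), so "roughly 50%" is the
right order of magnitude. A trillion objects with SHA-256 comes out
below 1e-50.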
That might sound risky still, but all of e-commerce is built on the
principle. If it's good enough for that, it's good enough for you :-)
Clifford Heath.