Jörg W Mittag
12/23/2008 9:42:00 PM
Just Another Victim of the Ambient Morality wrote:
> Are all built-in objects thread safe? For example, if I have an array
> and one thread is constant appending to it while another thread is shifting
> elements off of it and there's no synchronization going on, can the array
> object ever get corrupted? What about a similar scenario for hashes? These
> are surely complicated objects with internal state that must be maintained.
> Are they implemented to be thread safe?
This is a *very* interesting question! And it is a question that can
ultimately *only* be answered by a formal Ruby Specification or more
specifically a formal Ruby Memory Model.
Until we have such a specification, the C source code of MRI or YARV
is considered to be the "specification". However, there is a problem:
that source code can actually be interpreted several different ways.
If you look at the implementations of Hash, Array and friends, you
will see that they are not thread-safe. Ergo: the specification says
that the user is responsible for locking Arrays and Hashes.
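
Under that first reading, the append/shift scenario from the question would look roughly like the sketch below. This is only an illustration, not anything taken from MRI or JRuby: the names items, lock, consumed and total are made up for the example, and it assumes the stock Mutex class is available.

```ruby
require 'thread'  # harmless on current Rubies; loads Mutex on 1.8

items    = []         # the shared Array both threads touch
lock     = Mutex.new  # every access to items goes through this lock
consumed = []         # only ever touched by the consumer thread
total    = 1000

# One thread keeps appending...
producer = Thread.new do
  total.times { |i| lock.synchronize { items << i } }
end

# ...while another keeps shifting elements off the front.
consumer = Thread.new do
  while consumed.size < total
    value = lock.synchronize { items.shift }
    if value.nil?
      Thread.pass        # Array was empty, let the producer run
    else
      consumed << value
    end
  end
end

producer.join
consumer.join
```

Because both the append and the shift happen inside lock.synchronize, this does not rely on how a particular implementation schedules its threads.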
If, however, you look at the implementation of threads, you will see
that both MRI and YARV are actually incapable of running more than one
thread at a time -- even on a 1000-core machine MRI and YARV will only
ever use one core. So, since two threads can never access an Array at
the same time, there is no need for locking. Ergo: the specification
says that the user is *not* responsible for locking Arrays and Hashes.
There is a conflict here -- on the one hand, Arrays aren't
thread-safe, on the other hand, MRI's broken threading implementation
accidentally *makes* them thread-safe. Which do you depend on? As it
turns out, different people interpret this differently.
A couple of months ago, this actually became an issue. Originally, the
JRuby developers had implemented Arrays to be not thread-safe. One of the big
selling points of JRuby was and still is the promise of true
concurrency and better scalability. So, naturally, people wanted to
take advantage of this feature and started running their concurrent
programs on JRuby. And those programs crashed left and right, because
they didn't lock their Arrays properly. So, the JRuby team decided to
implement thread-safe data structures on their end, so that code that
didn't crash on MRI could be run unmodified on JRuby.
However, they didn't *have* to do that. They could just as well have
concluded that those programs were broken and *they* needed to become
thread-safe. That would have been perfectly acceptable. And there is
no guarantee that *all* Ruby Implementations will do it that way (and
there are lots of them, something around 14 or so at the moment). Well,
*unless* of course, there is a specification which tells them to.
So, in short: when in doubt, lock.
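
For the specific producer/consumer case in the question, one way to get the locking without writing it yourself is the standard library's Queue, which is documented as safe to share between threads. A minimal sketch, with queue, received, total and the :done marker being names invented for this example:

```ruby
require 'thread'  # harmless on current Rubies; loads Queue on 1.8

queue    = Queue.new  # does its own locking internally
received = []         # only ever touched by the consumer thread
total    = 1000

producer = Thread.new do
  total.times { |i| queue << i }
  queue << :done      # made-up marker telling the consumer to stop
end

consumer = Thread.new do
  loop do
    value = queue.pop # blocks until an element is available
    break if value == :done
    received << value
  end
end

[producer, consumer].each(&:join)
```

This only covers the append-at-one-end, remove-at-the-other pattern. For arbitrary Array or Hash access shared between threads, an explicit Mutex is still the safer bet.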
jwm