Asp Forum - Portable concurrent structure access using BORDEAUX-THREADS?

Max Rottenkolber

3/26/2015 12:22:00 AM

Hi c.l.l,

Imagine we have a structure like this:

(defstruct my-struct
(read-only (error "Supply me!"))
(read-write nil)
(lock (bordeaux-threads:make-lock)))

No we have functions that shall read/write the READ-WRITE slot and read
the READ-ONLY slot in a thread-safe manner.

(defun get-read-only (s)
(my-struct-read-only s))

(defun get-read-write (s)
(bordeaux-threads:with-lock-held ((my-struct-lock s))
(my-struct-read-write s)))

(defun set-read-write (s v)
(bordeaux-threads:with-lock-held ((my-struct-lock s))
(setf (my-struct-read-write s) v)))

If these functions are called concurrently, will GET-READ-ONLY be safe or
will its result be undefined? What if SET-READ-WRITE would PUSH instead
of SETF?

Generally, BORDEAUX-THREADS doesn't seem to define what kind of writes
and/or reads need to be mutually exclusive in order to have defined
behavior. Maybe its obvious, but if this leads to any enlightenment I
would love to contribute to the BORDEAUX-THREADS documentation.

Regards,
max

11 Answers

George Neuner

3/26/2015 8:40:00 AM

On 26 Mar 2015 00:22:05 GMT, Max Rottenkolber <max@mr.gy> wrote:

>Generally, BORDEAUX-THREADS doesn't seem to define what kind of writes
>and/or reads need to be mutually exclusive in order to have defined
>behavior. Maybe its obvious, but if this leads to any enlightenment I
>would love to contribute to the BORDEAUX-THREADS documentation.
>
>Regards,
>max

Bordeaux does not specify a particular thread model - implementations
are free to use OS threads, user-space (green) threads, or even green
threads on OS threads - a hybrid model often referred to as M:N
threading.

The CPU cannot atomically update anything larger than a register:
implementation dependent that equates to a fixnum and/or a double
float value. The memory controller usually[*] will atomically update
memory in cache line sized blocks, but that isn't portable and you
can't make use of it in any case unless you work at a very low level.
It is best to forget about it.

Cooperatively scheduled threads don't need synchronization if you are
careful in accessing shared data, but most thread implementations
today are pre-emptive [even user-space threads: a timer sets a task
switch flag that is checked by the running thread at implementation
determined "safe" points within the code].

Under pre-emptive scheduling, it is safe to share read-only data
without synchronization, but, in general, access to mutable data
always must be synchronized. The exceptions are too rare to worry
about.
[And don't forget about mutable _bindings_ to read-only data.]

Hope this helps.
George

[*] there is at least one system that can do partial cache line writes
to memory.

Max Rottenkolber

3/26/2015 2:42:00 PM

On Thu, 26 Mar 2015 04:39:36 -0400, George Neuner wrote:

> Under pre-emptive scheduling, it is safe to share read-only data without
> synchronization, but, in general, access to mutable data always must be
> synchronized. The exceptions are too rare to worry about.
> [And don't forget about mutable _bindings_ to read-only data.]

Thanks for the detailed explanation!

So in essence, I can't store the lock for a slot in the same structure as
the slot, because mutating the slot might mutate more than just the slot
itself?

Or can't I even concurrently read a structure slot in general? Where do I
store the lock then? There really isn't such a thing as an immutable
binding in CL, as far as I know.

I find it hard to translate from atomic CPU operations to actual language
semantics here. I understand its not defined by the Common Lisp standard
and thus implementation dependent but I would like to think that BORDEAUX-
THREADS should define some baseline behavior for all implementations it
supports.

Kaz Kylheku

3/26/2015 2:50:00 PM

On 2015-03-26, George Neuner <gneuner2@comcast.net> wrote:
> On 26 Mar 2015 00:22:05 GMT, Max Rottenkolber <max@mr.gy> wrote:
>
>
>>Generally, BORDEAUX-THREADS doesn't seem to define what kind of writes
>>and/or reads need to be mutually exclusive in order to have defined
>>behavior. Maybe its obvious, but if this leads to any enlightenment I
>>would love to contribute to the BORDEAUX-THREADS documentation.
>>
>>Regards,
>>max
>
> Bordeaux does not specify a particular thread model - implementations
> are free to use OS threads, user-space (green) threads, or even green
> threads on OS threads - a hybrid model often referred to as M:N
> threading.
>
> The CPU cannot atomically update anything larger than a register:
> implementation dependent that equates to a fixnum and/or a double
> float value. The memory controller usually[*] will atomically update
> memory in cache line sized blocks, but that isn't portable and you
> can't make use of it in any case unless you work at a very low level.
> It is best to forget about it.

That memory is written out in cache line sized blocks is somewhat useless,
because the local view is not consistent. A thread writes into the cache
line in word-sized chunks. It can be preempted at any time, and another
thread (or interrupt) can read that region of memory on the same processor. And
in any case, the cache line is already dirty after the first word is written;
you have to be somehow sure that the write-back won't fire until all the data
is assembled.

> Cooperatively scheduled threads don't need synchronization if you are
> careful in accessing shared data, but most thread implementations
> today are pre-emptive [even user-space threads: a timer sets a task
> switch flag that is checked by the running thread at implementation
> determined "safe" points within the code].

If cooperatively threaded code makes any call which can yield the processor,
then shared data can change. This means that any important invariants over
shared data have to be settled before making such a call (e.g. don't
leave the pointers of a linked list structure in some half-baked state), and
after resuming, the thread may have to re-examine the world to see what has
changed. E.g. the element you pushed onto a queue before the sleep is no
longer at the front, or possibly not enqueued at all. In some situations, the
right solution will be a lock after all.

Pascal Costanza

3/26/2015 4:28:00 PM

On 26/03/2015 01:22, Max Rottenkolber wrote:
> Hi c.l.l,
>
> Imagine we have a structure like this:
>
> (defstruct my-struct
> (read-only (error "Supply me!"))
> (read-write nil)
> (lock (bordeaux-threads:make-lock)))
>
> No we have functions that shall read/write the READ-WRITE slot and read
> the READ-ONLY slot in a thread-safe manner.
>
> (defun get-read-only (s)
> (my-struct-read-only s))
>
> (defun get-read-write (s)
> (bordeaux-threads:with-lock-held ((my-struct-lock s))
> (my-struct-read-write s)))
>
> (defun set-read-write (s v)
> (bordeaux-threads:with-lock-held ((my-struct-lock s))
> (setf (my-struct-read-write s) v)))
>
> If these functions are called concurrently, will GET-READ-ONLY be safe or
> will its result be undefined? What if SET-READ-WRITE would PUSH instead
> of SETF?

In the general case, you don't want locks for single slots. There are
two reasons for this:

1) For most data types, single slots already come with their own locks
on modern processor architectures. You can use low-level
compare-and-swap operations (also atomic-incf, and similar atomic
operations) to modify them in a thread-safe way. (Only modification is
relevant here, read-only fields are already safe anyway.) For example,
LispWorks and SBCL come with such operations.

2) Having said that, in many cases you want thread safety over several
memory locations, not just one. For that, it's a good idea to use locks
(nowadays pretty much a non-brainer, since locks have become relatively
cheap). However, you need to have application knowledge to understand
what to cover. Sometimes it's a good idea to have a single lock for a
struct instance, but sometimes you need to ensure thread safety over
even larger numbers of memory locations. It's not possible to tell
without knowing more about your application.

Pascal

--
My website: http:/...
Common Lisp Document Repository: http://cdr.eu...
Closer to MOP & ContextL: http://common-lisp.net/proje...
The views expressed are my own, and not those of my employer.

Max Rottenkolber

3/26/2015 5:37:00 PM

On Thu, 26 Mar 2015 17:27:39 +0100, Pascal Costanza wrote:

> In the general case, you don't want locks for single slots. There are
> two reasons for this:
>
> 1) For most data types, single slots already come with their own locks
> on modern processor architectures. You can use low-level
> compare-and-swap operations (also atomic-incf, and similar atomic
> operations) to modify them in a thread-safe way. (Only modification is
> relevant here, read-only fields are already safe anyway.) For example,
> LispWorks and SBCL come with such operations.

I am not looking for the most efficient solution here--even though I am
interested in knowing what the options are and how they compare to each
other--but for the most portable (for instance using BORDEAUX-THREADS)
and correct one. Using an implementation specific feature is out of the
question. I either want to know how to use BORDEAUX-THREADS correctly or
an alternative portable specification. When I say portable, in this case
I mean "portable over N CL implementations".

> 2) Having said that, in many cases you want thread safety over several
> memory locations, not just one. For that, it's a good idea to use locks
> (nowadays pretty much a non-brainer, since locks have become relatively
> cheap). However, you need to have application knowledge to understand
> what to cover. Sometimes it's a good idea to have a single lock for a
> struct instance, but sometimes you need to ensure thread safety over
> even larger numbers of memory locations. It's not possible to tell
> without knowing more about your application.

I thought I had described the application in my original post fully:
given the defined GET-READ-ONLY, GET-READ-WRITE, and SET-READ-WRITE,
would three bordeaux threads calling these functions concurrently (on the
same structure instance) yield undefined behavior or not?

Kaz Kylheku

3/26/2015 8:01:00 PM

On 2015-03-26, Max Rottenkolber <max@mr.gy> wrote:
> On Thu, 26 Mar 2015 17:27:39 +0100, Pascal Costanza wrote:
>
>> In the general case, you don't want locks for single slots. There are
>> two reasons for this:
>>
>> 1) For most data types, single slots already come with their own locks
>> on modern processor architectures. You can use low-level
>> compare-and-swap operations (also atomic-incf, and similar atomic
>> operations) to modify them in a thread-safe way. (Only modification is
>> relevant here, read-only fields are already safe anyway.) For example,
>> LispWorks and SBCL come with such operations.
>
> I am not looking for the most efficient solution here--even though I am
> interested in knowing what the options are and how they compare to each
> other--but for the most portable (for instance using BORDEAUX-THREADS)
> and correct one. Using an implementation specific feature is out of the
> question. I either want to know how to use BORDEAUX-THREADS correctly or
> an alternative portable specification. When I say portable, in this case
> I mean "portable over N CL implementations".

The following kind thing is also portable over multiple CL implementations:

#+THIS_CL (do this)
#+THAT_CL (do that)
#-(or :THIS_CL :THAT_CL) (fallback)

You can use an implementation feature *and* be reasonably portable.

Siebe de Vos

3/26/2015 10:45:00 PM

On 2015-03-26 01:22, Max Rottenkolber wrote:
> Hi c.l.l,
>
> Imagine we have a structure like this:
>
> (defstruct my-struct
> (read-only (error "Supply me!"))
> (read-write nil)
> (lock (bordeaux-threads:make-lock)))
>
> No we have functions that shall read/write the READ-WRITE slot and read
> the READ-ONLY slot in a thread-safe manner.
>
> (defun get-read-only (s)
> (my-struct-read-only s))
>
> (defun get-read-write (s)
> (bordeaux-threads:with-lock-held ((my-struct-lock s))
> (my-struct-read-write s)))
>
> (defun set-read-write (s v)
> (bordeaux-threads:with-lock-held ((my-struct-lock s))
> (setf (my-struct-read-write s) v)))
>
> If these functions are called concurrently, will GET-READ-ONLY be safe or
> will its result be undefined? What if SET-READ-WRITE would PUSH instead
> of SETF?
>
> Generally, BORDEAUX-THREADS doesn't seem to define what kind of writes
> and/or reads need to be mutually exclusive in order to have defined
> behavior. Maybe its obvious, but if this leads to any enlightenment I
> would love to contribute to the BORDEAUX-THREADS documentation.
>
> Regards,
> max

The questions about safety can be looked at in two ways. First you have
the high level thread control interface, including locks like you use
above. Those will work fine for control flow. From this point of view
your examples are completely safe.

The second level deals with dirty synchronization issues, like execution
order, cache synchronization, etc. So you could have:
- T1 creates a MY-STRUCT, passes it to T2, and T2 calls GET-READ-ONLY
before the memory is synchronized;
- T1 locks a MY-STRUCT, writes to READ-WRITE, unlocks, T2 locks and
still might see the old value.

I'm not sure you will find an SMP Common Lisp that can guarantee
structure access conforms to some kind of thread safety model (e.g.,
SBCL doesn't).

You should see BORDEAUX-THREADS as a unification of the different
implementations for the high level only. There is no attempt to provide
the 'Common Lisp memory model' and certainly it will not try to change
the way a struct accessor is implemented.

It's a nice idea to amend the CL spec with thread safety, but you would
probably end up with a lot of 'The consequences are undefined ...' (not
to mention ending up like C++).

Bye,
Siebe

Pascal Costanza

3/27/2015 7:18:00 AM

On 26/03/2015 18:36, Max Rottenkolber wrote:
> On Thu, 26 Mar 2015 17:27:39 +0100, Pascal Costanza wrote:
>
>> In the general case, you don't want locks for single slots. There are
>> two reasons for this:
>>
>> 1) For most data types, single slots already come with their own locks
>> on modern processor architectures. You can use low-level
>> compare-and-swap operations (also atomic-incf, and similar atomic
>> operations) to modify them in a thread-safe way. (Only modification is
>> relevant here, read-only fields are already safe anyway.) For example,
>> LispWorks and SBCL come with such operations.
>
> I am not looking for the most efficient solution here--even though I am
> interested in knowing what the options are and how they compare to each
> other--but for the most portable (for instance using BORDEAUX-THREADS)
> and correct one. Using an implementation specific feature is out of the
> question. I either want to know how to use BORDEAUX-THREADS correctly or
> an alternative portable specification. When I say portable, in this case
> I mean "portable over N CL implementations".

You're limiting yourself in unnecessary ways. But sure, you can use
locks also on a per-slot basis.

>> 2) Having said that, in many cases you want thread safety over several
>> memory locations, not just one. For that, it's a good idea to use locks
>> (nowadays pretty much a non-brainer, since locks have become relatively
>> cheap). However, you need to have application knowledge to understand
>> what to cover. Sometimes it's a good idea to have a single lock for a
>> struct instance, but sometimes you need to ensure thread safety over
>> even larger numbers of memory locations. It's not possible to tell
>> without knowing more about your application.
>
> I thought I had described the application in my original post fully:
> given the defined GET-READ-ONLY, GET-READ-WRITE, and SET-READ-WRITE,
> would three bordeaux threads calling these functions concurrently (on the
> same structure instance) yield undefined behavior or not?

That's not an application. It appears that you are looking for generic
ways to make single-slot accesses safe, but that's the wrong focus,
because in many cases, you have to provide thread safety for much
coarser granularities. Simple example: Imagine a struct where one slot
points to a list of items, and another slot maintains how many items are
in that list. Already in that simple case, per-slot locks don't help
you, because you need to ensure that both slots are always updated
consistently. In /actual/ applications, you will encounter even more
coarse-grained thread safety requirements.

There is a reason why programming languages provide locks (or
transactions, for example) as separate constructs.

Java, for example, provides synchronization on a per-object basis, and
doesn't go more fine-grained by default - and that already doesn't work
that well either in the general case.

Pascal

--
My website: http:/...
Common Lisp Document Repository: http://cdr.eu...
Closer to MOP & ContextL: http://common-lisp.net/proje...
The views expressed are my own, and not those of my employer.

Max Rottenkolber

3/27/2015 3:36:00 PM

On Fri, 27 Mar 2015 08:18:29 +0100, Pascal Costanza wrote:

> [...] It appears that you are looking for generic
> ways to make single-slot accesses safe, but that's the wrong focus,
> because in many cases, you have to provide thread safety for much
> coarser granularities. Simple example: Imagine a struct where one slot
> points to a list of items, and another slot maintains how many items are
> in that list. Already in that simple case, per-slot locks don't help
> you, because you need to ensure that both slots are always updated
> consistently. In /actual/ applications, you will encounter even more
> coarse-grained thread safety requirements.

The single slot was a minimal example. I really wanted to specifically
know if my proposed example had well-defined portable behavior or not (if
I understand you correctly, you believe it is so). Whether I lock
operations on one slot or many slots isn't the issue here, rather I was
wondering if its legal at all to store the lock (and other read-only
slots) in the same structure a the locked slots and access them
concurrently.

I see how you instinctively think about more advanced MP uses, which is
awesome because I know that you're an expert on the field. I might get
back to optimal usage, once I have my correct and functional lame-case
right. :)

Cheers,
max

Max Rottenkolber

3/27/2015 3:45:00 PM

On Thu, 26 Mar 2015 23:44:55 +0100, Siebe de Vos wrote:

> - T1 locks a MY-STRUCT, writes to READ-WRITE, unlocks, T2 locks and
> still might see the old value.

Wouldn't that mean that I can't use BORDEAUX-THREADS to synchronize data
at all? In the end the whole point on the lock was to assure that threads
have a consistent view of a value stored in a location.

I have for instance used JPL-QUEUES' synchronized-queue class to
synchronize threads in the past. That couldn't work portably if your
statement above was true.

comp.lang.lisp

Portable concurrent structure access using BORDEAUX-THREADS?

Max Rottenkolber

George Neuner

Max Rottenkolber

Kaz Kylheku

Pascal Costanza

Max Rottenkolber

Kaz Kylheku

Siebe de Vos

Pascal Costanza

Max Rottenkolber

Max Rottenkolber

x Login to ForumsZone