Asp Forum - Here is what's important about reader-writer algorithms...

Ramine

12/6/2014 10:52:00 PM

Hello,

Here is what's important about reader-writer algorithms...

I have thought more about this subject and i think i have mastered to
a certain level the subject of synchronization and reader-writer
algorithms...

You have to be carefull my dear programmers and hardware engineers,
cause if you take a look at the PThread reader-writer lock , it
is atomicaly incrementing a shared variable on the reader side,
so it makes it inefficient , cause on x86 for example when you
atomicaly increment a variable by writing for example "lock add"
in assembler or using atimics, this "lock add" or atomics will put a
full memory barrier that is costly, i have measured a full memory
barrier on x86 and it takes around over 300 CPU cycles and that's too
costly , cause if we add to this a time to transfer cache line(s)
between cores it will be costly and this "lock add" or atomics will
belgong to the Serial part of the Amdah's law , so this "lock add" or
atomics will not scale , and that's the weakness with the Pthread
reader-writer lock, it makes it really inefficient, so what have done
Dmitry Vyukov with his distributed reader-writer algorithm here:

http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-wr...

Dmitry Vyukov has used a distributed mechanism , this distributed
mechanism allows for example the "lock add" in assembler or the atomics
of the reader side of the RWLock used by the distributed algorithm to be
run by only a group of threads belonging to the same core, so this
has made the "lock add" or atomics a part of the parallel part of the
Amdahl's law, so his distributed algorithm has allowed the "lock add" or
atomics to effectivly scale, but there is still a weakness with this
distributed algorithm of Dmitry Vyukov , because his algorithm
is slow cause this "lock add" or atomics on the reader side of the
RWLock used by the distributed algorithm will issue a full memory
barrier on every call to the "lock add" or atomics, and this full
memory barrier is expensive , i have benchmarked it and i have
found that it is so expensive that it takes over 16 or 32 cores to make
it scale as Seqlock with 4 cores, cause this full memory barrier
is part of the parallel part of the Amdahl's law on the distributed
algorithm, so i think that the Dmitry Vyukov distributed reader-writer
lock is inefficient up to 32 cores compared to the Seqlock,
now what about Seqlock? Seqlock doesn't use atomics on the reader-side
so it is very fast and it is scalable on read-mostly scenarios, it is as
fast and as scalable as RCU on read-mostly scenarios, but the weakness
with Seqlock is that Seqlock can starve and livelock when
a greater percentage of writers are used, so this is why i have invented
my scalable distributed sequential lock because it is
as fast and as scalable as Seqlock and it avoids the weakness of
Seqlock because it doesn't starve or livelock when a greater percentage
of writers are used.

Hope you have understood what i have just explained to you, because
it is very important.

You can download my scalable distributed sequential lock version 1.1 from:

https://sites.google.com/site/aminer68/scalable-distributed-seque...

Thank you,
Amien Moulay Ramdane.

1 Answer

Ramine

12/6/2014 10:57:00 PM

On 12/6/2014 2:51 PM, Ramine wrote:
> Hello,
>
>
> Here is what's important about reader-writer algorithms...
>
> I have thought more about this subject and i think i have mastered to
> a certain level the subject of synchronization and reader-writer
> algorithms...
>
> You have to be carefull my dear programmers and hardware engineers,
> cause if you take a look at the PThread reader-writer lock , it
> is atomicaly incrementing a shared variable on the reader side,
> so it makes it inefficient , cause on x86 for example when you
> atomicaly increment a variable by writing for example "lock add"
> in assembler or using atimics, this "lock add" or atomics will put a
> full memory barrier that is costly, i have measured a full memory
> barrier on x86 and it takes around over 300 CPU cycles and that's too
> costly , cause if we add to this a time to transfer cache line(s)
> between cores it will be costly and this "lock add" or atomics will
> belgong to the Serial part of the Amdah's law , so this "lock add" or
> atomics will not scale , and that's the weakness with the Pthread
> reader-writer lock, it makes it really inefficient, so what have done
> Dmitry Vyukov with his distributed reader-writer algorithm here:
>
> http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-wr...
>
>
>
> Dmitry Vyukov has used a distributed mechanism , this distributed
> mechanism allows for example the "lock add" in assembler or the atomics
> of the reader side of the RWLock used by the distributed algorithm to be
> run by only a group of threads belonging to the same core, so this
> has made the "lock add" or atomics a part of the parallel part of the
> Amdahl's law, so his distributed algorithm has allowed the "lock add" or
> atomics to effectivly scale, but there is still a weakness with this
> distributed algorithm of Dmitry Vyukov , because his algorithm
> is slow cause this "lock add" or atomics on the reader side of the
> RWLock used by the distributed algorithm will issue a full memory
> barrier on every call to the "lock add" or atomics, and this full
> memory barrier is expensive , i have benchmarked it and i have
> found that it is so expensive that it takes over 16 or 32 cores to make
> it scale as Seqlock with 4 cores,

I have done this benchmark for small to medium reader sections.

>cause this full memory barrier
> is part of the parallel part of the Amdahl's law on the distributed
> algorithm, so i think that the Dmitry Vyukov distributed reader-writer
> lock is inefficient up to 32 cores compared to the Seqlock,
> now what about Seqlock? Seqlock doesn't use atomics on the reader-side
> so it is very fast and it is scalable on read-mostly scenarios, it is as
> fast and as scalable as RCU on read-mostly scenarios, but the weakness
> with Seqlock is that Seqlock can starve and livelock when
> a greater percentage of writers are used, so this is why i have invented
> my scalable distributed sequential lock because it is
> as fast and as scalable as Seqlock and it avoids the weakness of
> Seqlock because it doesn't starve or livelock when a greater percentage
> of writers are used.
>
>
> Hope you have understood what i have just explained to you, because
> it is very important.
>
> You can download my scalable distributed sequential lock version 1.1 from:
>
> https://sites.google.com/site/aminer68/scalable-distributed-seque...
>
>
> Thank you,
> Amien Moulay Ramdane.

comp.programming.threads

Here is what's important about reader-writer algorithms...

Ramine

Ramine

x Login to ForumsZone