comp.programming

Parallelism and synchronization

aminer

6/10/2014 9:57:00 AM

Hello...


As you may have noticed, my dear Delphi programmers, I am working on
parallelism and synchronization. My latest scalable lock is called MLock.
It is a fast node-based lock that is as efficient as the MCS lock, but it
has a simpler interface than the MCS lock. Why should you use my scalable
MLock or my scalable AMLock (a scalable array-based lock)? Because the
Windows critical section is not starvation-free and uses a backoff
mechanism, which is not so good, and the ticket spinlock of the
OmniThreadLibrary is not scalable. That is why I have invented my new
scalable algorithms...
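
To illustrate why a ticket spinlock is usually considered not scalable, here is a minimal sketch written in C++ rather than Delphi. It is a generic textbook ticket lock, not the OmniThreadLibrary code, and the names TicketLock, enter, leave and ticket_lock_demo are hypothetical. The point to notice is that every waiting thread polls the same now_serving word, so each release invalidates that cache line on every waiting core.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Generic ticket spinlock (illustrative only, not OmniThreadLibrary's code).
// It is FIFO-fair, but all waiters spin on the single shared now_serving word.
class TicketLock {
    std::atomic<unsigned> next_ticket{0};
    std::atomic<unsigned> now_serving{0};
public:
    void enter() {
        // Take a ticket; tickets are served in increasing order.
        unsigned my = next_ticket.fetch_add(1, std::memory_order_relaxed);
        // Every waiter polls the same location: this is the scalability problem.
        while (now_serving.load(std::memory_order_acquire) != my) {
            std::this_thread::yield();
        }
    }
    void leave() {
        // Only the lock holder writes now_serving, so a plain increment is enough.
        now_serving.fetch_add(1, std::memory_order_release);
    }
};

// Hypothetical demo helper: several threads increment a plain counter
// under the lock and the final value is returned.
long ticket_lock_demo(int threads, int iterations) {
    TicketLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.enter();
                ++counter;
                lock.leave();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Queue-based locks such as MCS avoid this by giving each waiter its own location to spin on.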

"A bigger problem with the scalable MCS lock is its API. It requires a
second structure to be passed in addition to the address of the lock.
The algorithm uses this second structure to store the information which
describes the queue of threads waiting for the lock. Unfortunately, most
code written using spinlocks doesn't have this extra information, so the
fact that the MCS algorithm isn't a drop-in replacement to a standard
spin lock is a problem.

An IBM working group found a way to improve the MCS algorithm to remove
the need to pass the extra structure as a parameter. Instead, on-stack
information was used instead. The result is the K42 lock algorithm:

Unfortunately, the K42 algorithm has another problem. It appears that it
may be patented by IBM. Thus it cannot be used either. (Without perhaps
paying royalties to IBM.)"
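
The API issue described in the quotation can be seen in a minimal MCS lock sketch, written here in C++ rather than Delphi. This is the classic MCS queue lock as commonly described, not my MLock and not the K42 variant; the names McsNode, McsLock, acquire, release and mcs_lock_demo are hypothetical. Notice that both acquire and release take the caller's queue node as a second argument, which is what prevents a drop-in replacement of an ordinary spinlock.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Per-thread queue node that the caller must supply (the "second structure").
struct McsNode {
    std::atomic<McsNode*> next{nullptr};
    std::atomic<bool> locked{false};
};

class McsLock {
    std::atomic<McsNode*> tail{nullptr};
public:
    void acquire(McsNode& me) {
        me.next.store(nullptr, std::memory_order_relaxed);
        me.locked.store(true, std::memory_order_relaxed);
        // Append our node to the queue of waiters.
        McsNode* prev = tail.exchange(&me, std::memory_order_acq_rel);
        if (prev != nullptr) {
            prev->next.store(&me, std::memory_order_release);
            // Spin only on our own node, not on a shared word.
            while (me.locked.load(std::memory_order_acquire)) {
                std::this_thread::yield();
            }
        }
    }
    void release(McsNode& me) {
        McsNode* succ = me.next.load(std::memory_order_acquire);
        if (succ == nullptr) {
            // No visible successor: try to mark the queue empty.
            McsNode* expected = &me;
            if (tail.compare_exchange_strong(expected, nullptr,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire)) {
                return;
            }
            // A successor is in the middle of linking itself; wait for it.
            while ((succ = me.next.load(std::memory_order_acquire)) == nullptr) {
                std::this_thread::yield();
            }
        }
        // Hand the lock to the successor by clearing its private flag.
        succ->locked.store(false, std::memory_order_release);
    }
};

// Hypothetical demo helper: each thread keeps its queue node on its own stack.
long mcs_lock_demo(int threads, int iterations) {
    McsLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            McsNode node;  // the extra structure every call site has to provide
            for (int i = 0; i < iterations; ++i) {
                lock.acquire(node);
                ++counter;
                lock.release(node);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```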

So you have to know that my new scalable MLock doesn't require a local
"queue node" to be passed in as a parameter, as the MCS and CLH locks
do. My scalable MLock doesn't require any parameter to be passed: just
call the Enter() and Leave() methods and that's all.

- This scalable Lock was discovered by Amine Moulay Ramdane

- It has the same space requirement as the scalable MCS lock

- Doesn't require a local "queue node" to be passed in as a parameter,
as the MCS and CLH locks do.

- Spins only on local locations on a cache-coherent machine

- And my scalable MLock is fast.

You can download my scalable MLock from:

https://sites.google.com/site/aminer68/scal...

Other than that, look at my scalable RWLock here:

https://sites.google.com/site/aminer68/scala...

You have to know that the OmniThreadLibrary MREW synchronization doesn't
scale either, because it uses an expensive atomic operation on the reader
side. The same is true of the pthread reader-writer lock; read this:

https://www.efficios.com/pub/rcu/urc...

I have read the following IEEE paper about RCU (Read-Copy Update). As
you will notice, the authors test their RCU implementation against the
pthread reader-writer lock, and the pthread reader-writer lock doesn't
scale well because its reader side is expensive...

Here is the paper:

https://www.efficios.com/pub/rcu/urc...

But as you will notice, quiescent-state-based reclamation (QSBR) and RCU
scale very well because their reader-side functions scale very well. But
don't worry, you don't need RCU, because my scalable RWLocks also scale
very well in read-mostly scenarios. Why do my scalable RWLocks scale so
well? Read the following about my LW_RWLock algorithm:

"Notice carefully that my RWLock is scalable because each element of the
FCount1^ array resides in a separate cache line, hence when I am
incrementing with LockedExchangeAdd(FCount1^[myid].fcount1,1) it
scales. Notice also that I am using the following:
myid:=GetCurrentProcessorNumber, so I am putting the processor number
inside the myid variable."

Read this:

http://pages.videotron.com/aminer/rw...

So, as you have noticed in my algorithm, threads that have the same
"myid" increment "FCount1^[myid].fcount1" in the same local cache, so
there are no cache-line transfers between the cores on the reader side
of my scalable RWLock algorithms. This is why my RWLock algorithms scale
very well in read-mostly scenarios.
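
The idea of one reader counter per cache line can be sketched as follows, in C++ rather than Delphi. This is only a minimal illustration under stated assumptions, not the LW_RWLock source: the 64-byte cache line, the 16 slots, and the names PaddedCounter, DistributedRWLock and rwlock_demo are all hypothetical, and a hash of the thread id stands in for GetCurrentProcessorNumber because standard C++ has no portable equivalent.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size
constexpr std::size_t kSlots = 16;      // assumed number of reader slots

// Each counter occupies its own cache line, so readers using different
// slots never write to the same line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> count{0};
};

class DistributedRWLock {
    PaddedCounter readers[kSlots];
    alignas(kCacheLine) std::atomic<bool> writer{false};

    // Stand-in for the processor number: a hash of the thread id.
    static std::size_t my_slot() {
        return std::hash<std::thread::id>{}(std::this_thread::get_id()) % kSlots;
    }
public:
    // Returns the slot used, so read_leave decrements the same counter.
    std::size_t read_enter() {
        std::size_t slot = my_slot();
        for (;;) {
            readers[slot].count.fetch_add(1);  // touches only this slot's line
            if (!writer.load()) return slot;   // no writer: read access granted
            readers[slot].count.fetch_sub(1);  // back off and let the writer in
            while (writer.load()) std::this_thread::yield();
        }
    }
    void read_leave(std::size_t slot) { readers[slot].count.fetch_sub(1); }

    void write_enter() {
        bool expected = false;
        while (!writer.compare_exchange_weak(expected, true)) {
            expected = false;
            std::this_thread::yield();
        }
        // The writer pays the cost: it waits for every slot to drain.
        for (auto& r : readers)
            while (r.count.load() != 0) std::this_thread::yield();
    }
    void write_leave() { writer.store(false); }
};

// Hypothetical demo helper: writers keep a == b, readers check that invariant.
// Returns the final value of a, or -1 if a reader ever saw a != b.
long rwlock_demo(int reader_threads, int writer_threads, int iterations) {
    DistributedRWLock lock;
    long a = 0, b = 0;
    std::atomic<bool> torn{false};
    std::vector<std::thread> pool;
    for (int w = 0; w < writer_threads; ++w)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.write_enter();
                ++a; ++b;
                lock.write_leave();
            }
        });
    for (int r = 0; r < reader_threads; ++r)
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::size_t slot = lock.read_enter();
                if (a != b) torn.store(true);
                lock.read_leave(slot);
            }
        });
    for (auto& t : pool) t.join();
    return torn.load() ? -1 : a;
}
```

In this sketch the reader returns its slot index because a thread can move to another core between entering and leaving, and the decrement has to hit the counter that was incremented. The trade-off is the usual one for this design: readers are cheap, while a writer has to visit every slot.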

I think that the scalable RWLock algorithms I have invented are as
good and as scalable as both quiescent-state-based reclamation (QSBR)
and RCU (Read-Copy Update) in the above IEEE paper.

And I think optimistic synchronization with hardware or software
transactional memory will not outperform my scalable RWLock algorithms,
because they are used in scenarios of frequent reads and infrequent
writes, so my scalable RWLock algorithms are still very good and useful.

So I hope that you will be happy with my RWLock algorithms...

You can download my scalable RWLocks from:

https://sites.google.com/site/aminer68/scala...


And of course it is freeware.


Thank you,
Amine Moulay Ramdane.