[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.programming.threads

About my scalable MLock...

Ramine

7/31/2015 7:57:00 PM

Hello,


So in this post i will explain to you what is exactly my invention that
is my scalable MLock algorithm..

If you look the Enter() method of my scalable MLock algorithm you will
read this(that's easy to undertand the code in Object Pascal) , and read
my explanation after:

You can download the source code here:

https://sites.google.com/site/aminer68/scal...


==
procedure TMLock.Enter;
var prev,fcount1:PListEntry;
k:integer;
Buffer1:pointer;

begin
Buffer1 := AllocMem(SizeOf(TListEntry) + Alignment);
FCount1 := PListEntry((int(Buffer1) + Alignment - 1)
and not (Alignment - 1));

fcount1^.Data:=0;
fcount1^.Next:=nil;
fcount1^.mem:=buffer1;

long(prev) := LockedExchange(long(m_head), long(fcount1));

prev.next := fcount1;

if (flag=1)
then
begin
if (m_tail=prev)
then
if CAS(flag,1,0)
then
begin
fcount1^.Data:=-1;
exit;
end;
end;
k:=1;
repeat

if fcount1^.data=1
then
begin
freemem(fcount1^.mem);
break;
end
else if fcount1^.data=2
then
begin
fcount1^.Data:=-1;
break
end;
if (flag=1)
then
begin
if (m_tail.next.next=fcount1)
then
if CAS(flag,1,0)
then
begin
fcount1^.Data:=-1;
break;
end;
end;
inc(k);
if (k mod 140)=0
then
{$IFDEF FPC}
ThreadSwitch;
{$ENDIF}
{$IFDEF Delphi}
sleep(0);
{$ENDIF}
asm pause end;
until false;

end;
===



First i am allocating a Buffer1 struct and using fcount1 that is a
pointer to a 64 bytes aligned record and i am assigning zero to
fcount1^.Data cause the threads that enter will have to wait for there
turn and for fcount1^.Data to become 1 to enter the scalable Lock except
for the first thread that enters the lock, the first thread that enter
the lock will find the "flag" variable equal to 1 and notice with me
that the CAS(flag,1,0) will succeed for the first thread that enters the
Enter() method...

Look that i am using this inside the source code:

long(prev) := LockedExchange(long(m_head), long(fcount1));

prev.next := fcount1;

this is a waitfree queue and the push() and the pop() inside my scalable
MLock algorithm are waitfree...

So notice with me that the threads on the Enter() method will enter a
repeat-until loop waiting for there turn to enter, and notice
that the CAS will be touched rarely , i have benchmarked and collected
some empiric statistics and i have noticed that the CAS inside the
repeat-unil loop inside the Enter() method will be touched only 1% of
the time, so the threads will wait 99% of the time for fcount1^.data to
become 1 or fcount1^.data to become 2 , and notice that fcount1^.data
inside the Enter() method is not generating cache-lines transfers for
each thread waiting in the Enter() method, cause fcount1^.data is only
touched by its correponding threads on its local cache, this is why my
MLock algorithm is scalable, cause it minimizes efficiently the
cache-coherence traffic.

For the Leave() method it is waitfree and it is easy to prove that, just
look at the source code and you will notice that even though there is a
repeat-until loop inside the Leave() method , my algorithm will not loop
more than 2 times inside the Leave() method, so it makes my algorithm
waitfree and my algorithm can be used on parallel and realtime safety
critical systems.

And also you will notice easily that my scalable MLock is FIFO fair , so
it's starvation-free.

This is a node based Lock that is scalable, FIFO fair and starvation-free.

- Discovered by Amine Moulay Ramdane

- This lock is scalable

- It has the same space requirement as the scalable MCS lock

- Doesn't require a local "queue node" to be passed in as a parameter as
is doing the MCS and CLH locks.

- Spins only on local locations on a cache-coherent machine

- And it's fast.



Please read also this:

A bigger problem with the MCS lock is its API. It requires a second
structure to be passed in addition to the address of the lock. The
algorithm uses this second structure to store the information which
describes the queue of threads waiting for the lock. Unfortunately, most
code written using spinlocks doesn't have this extra information, so the
fact that the MCS algorithm isn't a drop-in replacement to a standard
spin lock is a problem.

An IBM working group found a way to improve the MCS algorithm to remove
the need to pass the extra structure as a parameter. Instead, on-stack
information was used instead. The result is the K42 lock algorithm:

Unfortunately, the K42 algorithm has another problem. It appears that it
may be patented by IBM. Thus it cannot be used either. (Without perhaps
paying royalties to IBM.)

So you have to know that my scalable MLock doesn't require a local
"queue node" to be passed in as a parameter as is doing the MCS and CLH
locks, my scalable MLock doesn't require any parameter to be passed,
just call the Enter() and Leave() method and that's all.


You can download my scalable node based Lock that is FIFO fair and
waitfree from:


https://sites.google.com/site/aminer68/scal...



Thank you,
Amine Moulay Ramdane.