
comp.programming.threads

About my new algorithm

Ramine

12/5/2014 3:57:00 AM


Hello,


Chris M. Thomasson wrote about my new algorithm:

>Sorry, but you already lost to RCU. Big time. That atomic
>RMR and associated nasty memory barrier simply destroys
>performance when compared to RCU read-side overhead.
>I am not sure that you actually understand RCU fully.

Chris M. Thomasson, I think you don't correctly understand my new
algorithm, because the "lock add" inside RLock() on the reader side
is executed only by the group of threads belonging to the same core.
The RLock() called from the reader side belongs to a distributed
algorithm, so it is very cheap, which is why my new algorithm scales
very well.



Thank you,
Amine Moulay Ramdane.


5 Answers

Ramine

12/5/2014 4:07:00 AM



Look inside the source code: the RLock() called from the reader side
belongs to the DRWLOCK library, which is a distributed reader-writer
mutex. This is why it is very cheap and why it scales very well...
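
For readers following along, here is a minimal C++ sketch of what a
distributed reader-writer lock can look like. It is not the DRWLOCK
source; the names DistributedRWLock, ReaderSlot and kSlots, and the
fixed slot count, are assumptions made for illustration. Each reader
increments only the counter of its own slot (an atomic increment, which
typically compiles to a LOCK-prefixed instruction on x86), while a
writer raises a flag and waits for every slot to drain.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical slot count; a real lock would derive it from the core count.
constexpr std::size_t kSlots = 4;

// One reader counter per slot, padded to its own cache line so that
// readers using different slots do not contend on the same line.
struct alignas(64) ReaderSlot {
    std::atomic<int> readers{0};
};

class DistributedRWLock {
public:
    void rlock(std::size_t slot) {
        ReaderSlot& s = slots_[slot % kSlots];
        for (;;) {
            s.readers.fetch_add(1);          // atomic RMW on the local slot only
            if (!writer_.load()) return;     // no writer: read lock acquired
            s.readers.fetch_sub(1);          // writer active: back off and wait
            while (writer_.load()) std::this_thread::yield();
        }
    }
    void runlock(std::size_t slot) { slots_[slot % kSlots].readers.fetch_sub(1); }

    void wlock() {
        bool expected = false;
        while (!writer_.compare_exchange_weak(expected, true)) {
            expected = false;                // another writer holds the flag
            std::this_thread::yield();
        }
        for (ReaderSlot& s : slots_)         // wait for every slot to drain
            while (s.readers.load() != 0) std::this_thread::yield();
    }
    void wunlock() { writer_.store(false); }

private:
    ReaderSlot slots_[kSlots];
    std::atomic<bool> writer_{false};
};

// Demo: one writer keeps two fields equal; readers check they never diverge.
bool run_rwlock_demo() {
    DistributedRWLock lock;
    long a = 0, b = 0;
    std::atomic<bool> ok{true};
    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < 4; ++t) {
        threads.emplace_back([&, t] {
            for (int i = 0; i < 20000; ++i) {
                lock.rlock(t);
                if (a != b) ok.store(false);
                lock.runlock(t);
            }
        });
    }
    threads.emplace_back([&] {
        for (int i = 0; i < 2000; ++i) {
            lock.wlock();
            ++a;
            ++b;
            lock.wunlock();
        }
    });
    for (auto& th : threads) th.join();
    return ok.load() && a == 2000 && b == 2000;
}
```

Even in this sketch the reader path still performs an atomic
read-modify-write on every RLock(); distributing the counters reduces
cache-line sharing between cores but does not remove that instruction.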

Please look inside my algorithm's source code again to understand it
better by downloading it from:

https://sites.google.com/site/aminer68/scalable-distributed-seque...



Thank you,
Amine Moulay Ramdane.






Chris M. Thomasson

12/5/2014 7:33:00 PM


> "Ramine" wrote in message news:m5qvr9$o4v$3@dont-email.me...

> Hello,

> Chris M. Thomasson wrote about my new algorithm:

> >Sorry, but you already lost to RCU. Big time. That atomic
> >RMR and associated nasty memory barrier simply destroys
> >performance when compared to RCU read-side overhead.
> >I am not sure that you actually understand RCU fully.

> Chris M. Thomasson, I think you don't correctly understand my new
> algorithm, because the "lock add" inside RLock() on the reader side
> is executed only by the group of threads belonging to the same core.
> The RLock() called from the reader side belongs to a distributed
> algorithm, so it is very cheap, which is why my new algorithm scales
> very well.

Comparing RCU read-side with anything that uses an atomic rmw and/or
memory barrier is just plain foolish. That LOCK ADD destroys performance
when compared to RCU's basically zero overhead reads. End of story.

I have been working with RCU for a long time now, and know what I
am talking about. Trust me.
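
To illustrate what "basically zero overhead reads" refers to, here is a
minimal C++ sketch of the RCU read/publish pattern. It is not a full
RCU: grace-period detection is omitted, and retired versions are simply
kept alive until the cell is destroyed. The names RcuCell and Config
are assumptions made for illustration, and a single writer is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <vector>

struct Config {
    int a;
    int b;
};

class RcuCell {
public:
    RcuCell(int a, int b) { update(a, b); }

    // Reader: one pointer load followed by ordinary dereferences.
    // No atomic read-modify-write is executed on this path.
    int read_sum() const {
        const Config* c = current_.load(std::memory_order_acquire);
        return c->a + c->b;
    }

    // Writer (single writer assumed): build a complete new version,
    // then publish it with a release store. Old versions stay in
    // versions_ as a stand-in for deferred reclamation.
    void update(int a, int b) {
        versions_.push_back(std::make_unique<Config>(Config{a, b}));
        current_.store(versions_.back().get(), std::memory_order_release);
    }

private:
    std::vector<std::unique_ptr<Config>> versions_;
    std::atomic<const Config*> current_{nullptr};
};
```

A reader that raced with update() would see either the old version or
the new one in full, never a mix, because each version is immutable
once published.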

Drazen Kacar

12/5/2014 10:49:00 PM


Chris M. Thomasson wrote:

> Comparing RCU read-side with anything that uses an atomic rmw and/or
> memory barrier is just plain foolish. That LOCK ADD destroys performance
> when compared to RCU's basically zero overhead reads. End of story.

I don't quite understand the "or" part of your "and/or memory barrier"
statement. If there was a different algorithm which used only a memory
barrier, but not an RMW, would that be so much worse than RCU?

Another thing I don't quite understand is in Wikipedia's RCU article (at
http://en.wikipedia.org/wiki/Read-c..., section Simple
implementation)

The article says:

[RCU's] read-side overhead is precisely zero, as smp_read_barrier_depends()
is an empty macro on all but DEC Alpha CPUs;[19] such memory barriers are
not needed on modern CPUs.

I was under the impression that you'd need a memory barrier on Alpha
because it does more aggressive memory access reordering than other
processors. But then, if that's correct, that means that modern CPUs
(whichever they are) still don't have features that old Alpha CPUs have,
and not the other way round.

--
.-. .-. Yes, I am an agent of Satan, but my duties are largely
(_ \ / _) ceremonial.
|
| dave@fly.srk.fer.hr

Chris M. Thomasson

12/6/2014 9:46:00 PM


> "Drazen Kacar" wrote in message
> news:slrnm84di2.je0.dave@fly.srk.fer.hr...

> > Chris M. Thomasson wrote:

> > Comparing RCU read-side with anything that uses an atomic rmw and/or
> > memory barrier is just plain foolish. That LOCK ADD destroys performance
> > when compared to RCU's basically zero overhead reads. End of story.

> I don't quite understand the "or" part of your "and/or memory barrier"
> statement. If there was a different algorithm which used only a memory
> barrier, but not an RMW, would that be so much worse than RCU?

Yes. Any time you can reduce the number of memory barriers and/or
atomics, the better. I have tested RCU versus the same RCU with a single
MFENCE instruction on the reader side. The one without the membar
slaughtered it wrt reads-per-second per-thread.
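
A comparison of this kind can be set up with a small harness like the
C++ sketch below. It is not the original test; the names read_plain,
read_fenced and time_ns are assumptions made for illustration. The
sequentially consistent fence typically compiles to MFENCE or an
equivalent locked instruction on x86. No timings are claimed here,
since they depend on the machine and compiler.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

std::atomic<int> g_value{1};

// Reader loop with no barrier: each iteration is a plain atomic load.
std::uint64_t read_plain(int iterations) {
    std::uint64_t sum = 0;
    for (int i = 0; i < iterations; ++i)
        sum += static_cast<std::uint64_t>(g_value.load(std::memory_order_relaxed));
    return sum;
}

// Same loop, with one full memory barrier added per read.
std::uint64_t read_fenced(int iterations) {
    std::uint64_t sum = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic_thread_fence(std::memory_order_seq_cst);
        sum += static_cast<std::uint64_t>(g_value.load(std::memory_order_relaxed));
    }
    return sum;
}

// Wall-clock time of one call, in nanoseconds.
template <class F>
std::int64_t time_ns(F f, int iterations) {
    auto start = std::chrono::steady_clock::now();
    volatile std::uint64_t sink = f(iterations);  // keep the result alive
    (void)sink;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns(read_plain, n) and time_ns(read_fenced, n) with the same
n, ideally from several threads at once, gives a rough reads-per-second
figure for each variant.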



> Another thing I don't quite understand is in Wikipedia's RCU article (at
> http://en.wikipedia.org/wiki/Read-c..., section Simple
> implementation)

> The article says:

> [RCU's] read-side overhead is precisely zero, as smp_read_barrier_depends()
> is an empty macro on all but DEC Alpha CPUs;[19] such memory barriers are
> not needed on modern CPUs.

> I was under the impression that you'd need a memory barrier on Alpha
> because it does more aggressive memory access reordering than other
> processors. But then, if that's correct, that means that modern CPUs
> (whichever they are) still don't have features that old Alpha CPUs have,
> and not the other way round.

Yup. On a DEC Alpha, RCU needs a membar because the Alpha does not honor
the ordering of data-dependent loads. RCU relies on that data-dependency
ordering being there.

Chris M. Thomasson

12/6/2014 9:56:00 PM




> "Chris M. Thomasson" wrote in message
> news:m5vteu$dp3$1@speranza.aioe.org...

> > "Drazen Kacar" wrote in message
> > news:slrnm84di2.je0.dave@fly.srk.fer.hr...

[...]

> Yup. On a DEC Alpha, RCU needs a membar because the Alpha does not honor
> the ordering of data-dependent loads. RCU relies on that data-dependency
> ordering being there.

AFAICT, this is why `std::memory_order_consume' exists.
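
As a minimal sketch of that idea, the C++ below publishes a node with a
release store and reads it through a consume load, so that the
dereference is ordered by its data dependency on the loaded pointer.
The names Node, g_head, publish, wait_and_read and run_consume_demo are
assumptions made for illustration. Note that current compilers generally
implement memory_order_consume as memory_order_acquire.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Node {
    int payload;
};

std::atomic<Node*> g_head{nullptr};

// Writer: initialize the node, then publish the pointer.
void publish(Node* n, int value) {
    n->payload = value;
    g_head.store(n, std::memory_order_release);
}

// Reader: the load of n->payload carries a dependency on the pointer
// returned by the consume load, so it is ordered after the publication.
int wait_and_read() {
    Node* n = nullptr;
    while ((n = g_head.load(std::memory_order_consume)) == nullptr)
        std::this_thread::yield();
    return n->payload;
}

// Demo: a reader thread waits for the node that the main thread publishes.
int run_consume_demo() {
    static Node node{0};
    g_head.store(nullptr);
    int seen = 0;
    std::thread reader([&] { seen = wait_and_read(); });
    publish(&node, 42);
    reader.join();
    return seen;
}
```

On hardware that orders dependent loads, the reader side of this
pattern needs no barrier instruction; on a DEC Alpha the consume load
would have to include one.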