Asp Forum
Home
|
Login
|
Register
|
Search
Forums
>
comp.programming
Efficiency of my parallel algorithm
Ramine
1/13/2015 10:33:00 PM
Hello,
I have spook before about my Parallel Conjugate gradient linear system
solver library and i have said that it is using a probabilistic
mechanism that is very efficient and that render my parallel algorithm
into a scalable parallel algorithm on NUMA architecture, so now i will
prove to you what i am saying:
If you look at my parallel algorithm it is dividing the array by 250
elements, and if you look carefully i am using two functions that
consumes the greater part of all the CPU, it is the atsub() and asub(),
and inside those functions i am using a probabilistic mechanism so that
to render my algorithm scalable on NUMA architecture, what i am
doing is scrambling the array parts using a probabilistic function
and what i have noticed that this probabilitic mechanism is very
efficient, to proove to you what i am saying , please look at the
following simulation that i have done using a variable that
contains the number of NUMA nodes, and what i have noticed that my
simulation is giving almost a perfect scalability on NUMA architecture,
for example let us give to the "NUMA_nodes" variable a value of 4,
and to our array a value of 250, since i am dividing all the array by
250, the simulation will give a number of contention points of a
quarter of the array, so if i am using 16 cores , in the the worst case
it will scale 4X throughput on NUMA architecture, because since
i am using an array of 250 and there is a quarter of the array of
contention points , so from the Amdahl'law this will give a scalability
of 4X throughput on four NUMA nodes, and this will give almost a perfect
scalability on more and more NUMA nodes, so my parallel algorithm is
scalable on NUMA architecture,
Here the simulation that i have done, please run it and you will notice
yourself that my parallel algorithm is scalable on NUMA architecture.
Here it is:
---
program test;
uses math;
var tab,tab1,tab2,tab3:array of integer;
a,n1,k,i,n2,tmp,j,numa_nodes:integer;
begin
a:=250;
Numa_nodes:=4;
setlength(tab2,a);
for i:=0 to a-1
do
begin
tab2[i]:=i mod numa_nodes;
end;
setlength(tab,a);
randomize;
for k:=0 to a-1
do tab[k]:=k;
n2:=a-1;
for k:=0 to a-1
do
begin
n1:=random(n2);
tmp:=tab[k];
tab[k]:=tab[n1];
tab[n1]:=tmp;
end;
setlength(tab1,a);
randomize;
for k:=0 to a-1
do tab1[k]:=k;
n2:=a-1;
for k:=0 to a-1
do
begin
n1:=random(n2);
tmp:=tab1[k];
tab1[k]:=tab1[n1];
tab1[n1]:=tmp;
end;
for i:=0 to a-1
do
if tab2[tab[i]]=tab2[tab1[i]] then
begin
inc(j);
writeln('A contention at: ',i);
end;
writeln('Number of contention points: ',j);
setlength(tab,0);
setlength(tab1,0);
setlength(tab2,0);
end.
---
You can download my Parallel Conjugate gradient solver library from:
https://sites.google.com/site/aminer68/scalable-parallel-implementation-of-conjugate-gradient-linear-system-solver-library-that-is-numa-aware-and-c...
Thank you for your time.
Amine Moulay Ramdane.
Servizio di avviso nuovi messaggi
Ricevi direttamente nella tua mail i nuovi messaggi per
Efficiency of my parallel algorithm
Inserendo la tua e-mail nella casella sotto, riceverai un avviso tramite posta elettronica ogni volta che il motore di ricerca troverà un nuovo messaggio per te
Il servizio è completamente GRATUITO!
x
Login to ForumsZone
Login with Google
Login with E-Mail & Password