Asp Forum - Concurrent (using threads) slower than sequential - doubt

Carlos Ortega

10/6/2008 3:40:00 AM

Hi Folks.
While starting to study the benefits of using threads in Ruby, I tried
to solve the following problem:

I have 3 text files ( numbers0.txt, numbers1.txt, numbers2.txt ),
each containing a very large list of numbers.
I read and sum each file in a different thread,
then add up the subtotals to produce the final result.

Here is the code.
===================

require 'thread'
m_threads = []

print "INITIAL TIME := ", initial_time = Time.now, "\n"
3.times do |i|
  # One thread per file
  m_threads[i] = Thread.new do
    total_per_thread = 0
    case i
    when 0 then path = "C:\\numbers0.txt"
    when 1 then path = "C:\\numbers1.txt"
    when 2 then path = "C:\\numbers2.txt"
    end
    File.open( path, "r" ) do |m_file|
      while line = m_file.gets
        total_per_thread = line.to_i + total_per_thread
      end
      # Store the subtotal in a thread-local variable for the main thread
      Thread.current[:INDEX] = total_per_thread
    end
  end
end

result = 0
m_threads.each{ |t| t.join; result = t[:INDEX] + result; }

print "FINAL TIME := ", final_time = Time.now, "\n"
print "TOTAL TIME := ", total_time = final_time-initial_time, "\n"
print "Total := ", result, "\n"

=======================================
Output (CONCURRENT - Using Threads):

INITIAL TIME := Sun Oct 05 22:07:26 -0500 2008

FINAL TIME := Sun Oct 05 22:07:38 -0500 2008
TOTAL TIME := 11.485
Total := 1150000000
========================================

I verified that each thread did its job, and the result is correct.
I also solved the same problem with a sequential program that uses no
threads at all.
Here is the code:

print "INITIAL Time := ", initial_time = Time.now, "\n"

paths = [ "C:\\numbers0.txt", "C:\\numbers1.txt", "C:\\numbers2.txt" ]
result = 0
for m_path in paths
  # Read-only mode is enough here; "r+" would also request write access.
  File.open( m_path, "r" ) do |m_file|
    while line = m_file.gets
      result = line.to_i + result
    end
  end
end

print "FINAL time := ", final_time = Time.now, "\n"
print "TOTAL time := ", total_time = final_time - initial_time, "\n"
print "Total := ", result, "\n"

=======================================
Output (SEQUENTIAL - No Threads):

INITIAL TIME := Sun Oct 05 22:34:47 -0500 2008
FINAL TIME := Sun Oct 05 22:34:57 -0500 2008
TOTAL TIME := 10.656
Total := 1150000000

=======================================
As you can see, the thread-based program runs slower.
I thought that using threads would make it faster, but it didn't... Why
is it slower?

Any help will be much appreciated.
--
Posted via http://www.ruby-....

5 Answers

Charles Oliver Nutter

10/6/2008 5:18:00 AM

Carlos Ortega wrote:
> As you see, the thread based program run slower.
> I thought that by using threads it will be faster, but it didn't....Why
> is it slower?

You may want to try with JRuby, which actually uses native threads. On a
multi-core system, it should improve performance.

- Charlie

Erik Veenstra

10/6/2008 8:58:00 PM

If you are on Linux, you might want to have a look at the gem
"forkandreturn" [1]. ForkAndReturn handles each element in an
enumeration in a separate process [2].

gegroet,
Erik V. - http://www.erikve...

[1] http://www.erikve...forkandreturn/doc/index.html

[2] ...if you're on a multicore machine. Oops. Will be fixed in
the next release.

----------------------------------------------------------------

$ cat count1.rb
files = ["numbers0.txt", "numbers1.txt", "numbers2.txt"]
result = 0

files.collect do |file|
  res = 0

  File.open(file) do |file|
    file.each do |line|
      res += line.to_i
    end
  end

  res
end.each do |res|
  result += res
end

p result

----------------------------------------------------------------

$ diff -ur count[12].rb | clean_diff
+require "forkandreturn"
+
files = ["numbers0.txt", "numbers1.txt", "numbers2.txt"]
result = 0

-files.collect do |file|
+files.concurrent_collect do |file|
res = 0

File.open(file) do |file|

----------------------------------------------------------------

$ time ruby count1.rb
81627450482688

real 0m15.309s
user 0m15.201s
sys 0m0.076s

----------------------------------------------------------------

$ time ruby count2.rb
81627450482688

real 0m8.976s <=== Multicore!
user 0m17.177s <=== Multicore!
sys 0m0.204s

----------------------------------------------------------------

$ uname -a
Linux laptop 2.6.24-19-generic #1 SMP Wed Aug 20 22:56:21 UTC 2008 i686 GNU/Linux

----------------------------------------------------------------

$ ruby --version
ruby 1.8.6 (2008-06-20 patchlevel 230) [i686-linux]

----------------------------------------------------------------

$ gem list | grep -ie forkandreturn
forkandreturn (0.2.0)

----------------------------------------------------------------
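
For readers who want to see the idea behind the timings above without installing a gem, here is a minimal sketch of "one process per file" using only Kernel#fork, IO.pipe and Marshal from core Ruby. It is not how forkandreturn is implemented; the helper names sum_file and parallel_sums are made up for this illustration, and fork is only available on POSIX systems (so not on the Windows setup from the original question).

```ruby
# Sketch only: sum each file in its own child process and send the
# subtotal back to the parent over a pipe. Requires a platform with fork.

# Hypothetical helper: add up the integers in one file, one number per line.
def sum_file(path)
  total = 0
  File.foreach(path) { |line| total += line.to_i }
  total
end

# Hypothetical helper: run sum_file for every path in a separate process
# and return the subtotals in the same order as the paths.
def parallel_sums(paths)
  children = paths.map do |path|
    reader, writer = IO.pipe
    pid = fork do
      reader.close                      # the child only writes
      writer.write(Marshal.dump(sum_file(path)))
      writer.close                      # flush the data before leaving
      exit!(0)                          # skip at_exit handlers inherited from the parent
    end
    writer.close                        # the parent only reads
    [pid, reader]
  end

  children.map do |pid, reader|
    subtotal = Marshal.load(reader.read)  # read until the child closes its end
    reader.close
    Process.wait(pid)
    subtotal
  end
end
```

With the three files from this thread, the grand total would be `parallel_sums(["numbers0.txt", "numbers1.txt", "numbers2.txt"]).inject(0) { |sum, n| sum + n }`. Because each child is a separate OS process, the kernel can schedule them on different cores, which is what the "real" versus "user" times above suggest.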

Erik Veenstra

10/6/2008 10:07:00 PM

> [2] ...if you're on a multicore machine. Oops. Will be fixed in
> the next release.

It's released...

gegroet,
Erik V.

Carlos Ortega

10/7/2008 3:26:00 AM

Erik Veenstra wrote:
>> [2] ...if you're on a multicore machine. Oops. Will be fixed in
>> the next release.
>
> It's released...
>
> gegroet,
> Erik V.

Thank you Erik and Robert...

I will try on both environments.

Regards
Carlos
--
Posted via http://www.ruby-....

Prashant Srinivasan

10/8/2008 12:10:00 AM

Carlos, that sounds about right. I did some similar tests early this
year [1]. Basically, your problem is that Ruby runs on one kernel
thread/LWP irrespective of how many userland threads you create. It's
also expensive to switch between threads (the cost varies depending on
which hardware platform you're running on), so these two factors combine
to make it slower for you when you use threads.
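
To observe this effect in isolation, a small sketch along the following lines may help. It is not from the original posts: the method names and the in-memory data sets are assumptions made for illustration, chosen so that disk speed does not influence the timing. It runs the same CPU-bound summing work once sequentially and once with one thread per data set, and prints both wall-clock times.

```ruby
require 'benchmark'

# Hypothetical CPU-bound stand-in for "parse and add every line of a file".
def sum_numbers(lines)
  total = 0
  lines.each { |line| total += line.to_i }
  total
end

# Process the data sets one after another.
def sequential_total(data_sets)
  data_sets.map { |lines| sum_numbers(lines) }.inject(0) { |sum, n| sum + n }
end

# Process each data set in its own thread. Thread#value joins the thread
# and returns the result of its block.
def threaded_total(data_sets)
  threads = data_sets.map { |lines| Thread.new(lines) { |l| sum_numbers(l) } }
  threads.map { |t| t.value }.inject(0) { |sum, n| sum + n }
end

# Three in-memory "files" of number strings (sizes chosen arbitrarily).
data_sets = Array.new(3) { (1..200_000).map { |n| n.to_s } }

seq = thr = nil
seq_time = Benchmark.realtime { seq = sequential_total(data_sets) }
thr_time = Benchmark.realtime { thr = threaded_total(data_sets) }

puts "sequential: #{seq} in #{'%.3f' % seq_time}s"
puts "threaded:   #{thr} in #{'%.3f' % thr_time}s"
```

On an interpreter that runs Ruby code on only one kernel thread at a time, the threaded version should not be expected to beat the sequential one for this kind of work; the exact numbers will vary by machine and Ruby implementation.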

JRuby was almost as bad until JRuby 1.1.1, after which it started
doing better with threads (this was due to a bug fix by Charles [2]).
It's now much better at scaling with threads than MRI, but
still quite poor in absolute terms [3]: its scalability on an
embarrassingly threaded program eroded 54% going from 1 to 2 threads
and became worse after that. (*Caveat:* My numbers are old, they're
from March, and things may have gotten much better since!)

[1] http://blogs.sun.co.../resource/files/jruby-ruby-comp...
[2] Ref to Charles' entry
http://blog.headius.com/2008/04/shared-data-considered-ha...
[3] http://blogs.sun.co.../resource/files/jruby-t...

-ps


--
Prashant Srinivasan
F/OSS Enthusiast
Sun Microsystems, Inc.
http://blogs.sun.co...
GnuPG key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=...

comp.lang.ruby
