comp.lang.ruby

very slow IO (STDIN.gets and puts) on Linux, ruby 1.8.2_pre3

MiG

3/10/2005 8:05:00 PM


Why is Ruby 2x slower in IO than php or bash?


data.dat is an 80 MB file with 5,000,000 lines. I use Linux with 2GB RAM
(tested on another PC with a similar result).

--------------------

test.php:
#!/usr/bin/php
<? while (fgets(STDIN)); ?>

$ time ./test.php < data.dat
./test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total

--------------------

test.rb:
#!/usr/bin/ruby
while gets
end

$ time ./test.rb < data.dat
./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total



9 Answers

Florian Gross

3/10/2005 8:48:00 PM

MiG wrote:

> Why is Ruby 2x slower in IO than php or bash?
>
> data.dat is an 80 MB file with 5,000,000 lines. I use Linux with 2GB RAM
> (tested on another PC with a similar result).
>
> --------------------
>
> test.php:
> #!/usr/bin/php
> <? while (fgets(STDIN)); ?>
>
> $ time ./test.php < data.dat
> ./test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total
>
> --------------------
>
> test.rb:
> #!/usr/bin/ruby
> while gets
> end
>
> $ time ./test.rb < data.dat
> ./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total

Perhaps also try io.read() -- I think it will be faster.
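
A rough sketch of that suggestion, for anyone who wants to compare the two approaches. The sample data and both method names are made up for illustration (the original data.dat is not available), and StringIO stands in for STDIN so the script is self-contained. It only shows how such a comparison could be set up; timings will vary by machine and Ruby version.

```ruby
require 'benchmark'
require 'stringio'

# Hypothetical sample input; the thread's data.dat is not available here.
SAMPLE = "some line of text\n" * 200_000

# Read line by line, creating one String per line (like `while gets; end`).
def count_lines_with_gets(io)
  count = 0
  count += 1 while io.gets
  count
end

# Read everything in one call, then count the newline characters.
def count_lines_with_read(io)
  io.read.count("\n")
end

Benchmark.bm(6) do |bm|
  bm.report('gets') { count_lines_with_gets(StringIO.new(SAMPLE)) }
  bm.report('read') { count_lines_with_read(StringIO.new(SAMPLE)) }
end
```

Note that read pulls the whole input into memory at once, which is fine for an 80 MB file on a 2GB machine but may not suit much larger inputs.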

Ben Giddings

3/10/2005 9:15:00 PM

MiG wrote:
> Why is Ruby 2x slower in IO than php or bash?
>
>
> data.dat is an 80 MB file with 5,000,000 lines. I use Linux with 2GB RAM
> (tested on another PC with a similar result).
>
> --------------------
>
> test.php:
> #!/usr/bin/php
> <? while (fgets(STDIN)); ?>
>
> $ time ./test.php < data.dat
> ./test.php < data.dat 5,59s user 0,19s system 88% cpu 6,516 total
>
> --------------------
>
> test.rb:
> #!/usr/bin/ruby
> while gets
> end
>
> $ time ./test.rb < data.dat
> ./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total
>

English is so much worse than Japanese! When I try to count to one
million in English it takes me 3.42 days, but when I try it in Japanese,
it only takes me 3.12 days!

Obviously, that means English is the worse language. Why does English
suck so bad?!?

-----

In other words: your benchmark is really dumb. That isn't practical
code, and trying to draw any conclusions from it is silly. For Ruby to
be considered fast, how much time should it take to read and discard a
line of text 5 kagillion times? Btw, I found a way to optimize your code:

deleteme.rb
#!/usr/bin/ruby
exit(0)

ben% time ruby deleteme.rb
ruby deleteme.rb 0.00s user 0.00s system 102% cpu 0.006 total

I'm still working on getting it to run in less than 0.004 total.

Ben


Florian Frank

3/10/2005 10:52:00 PM

MiG wrote:

>test.rb:
>#!/usr/bin/ruby
>while gets
>end
>
>$ time ./test.rb < data.dat
>./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total
>
>
>
>
Well, Ruby assigns the line string to $_, if you use gets that way. So
Ruby has to construct an object for every line. Perhaps PHP doesn't do that?
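
A small sketch of the $_ behaviour described above. The two helper method names are hypothetical, and StringIO stands in for STDIN. It shows that gets stores the line it read in $_, while each_line leaves $_ alone. Either way a String is still created for every line, so this alone may not explain the timing difference.

```ruby
require 'stringio'

# Returns the value of $_ after a single call to gets.
def last_line_after_gets(io)
  io.gets # reads one line and also assigns it to $_
  $_
end

# Returns the value of $_ after iterating over every line with each_line.
def last_line_after_each_line(io)
  io.each_line { |line| line } # yields each line without assigning $_
  $_
end

p last_line_after_gets(StringIO.new("first\nsecond\n"))
p last_line_after_each_line(StringIO.new("first\nsecond\n"))
```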

--
Florian Frank



MiG

3/11/2005 7:55:00 AM


1. I have NOTHING against Ruby, it is my best language.
2. Is it wrong to ask?
3. My dumb benchmark: I used real data. If you have 2GB of free RAM and
use an 80MB file, is that wrong? It's the same as having 1MB of RAM and
using a smaller file. I used the real data I have, that's all. It behaves
the same way with smaller files.
4. Thank you for the excellent humour.

MiG

>English is so much worse than Japanese! When I try to count to one
>million in English it takes me 3.42 days, but when I try it in Japanese,
>it only takes me 3.12 days!
>
>Obviously, that means English is the worse language. Why does English
>suck so bad?!?
>
>-----
>
>In other words: your benchmark is really dumb. That isn't practical
>code, and trying to draw any conclusions from it is silly. For Ruby to
>be considered fast, how much time should it take to read and discard a
>line of text 5 kagillion times? Btw, I found a way to optimize your code:
>
>deleteme.rb
>#!/usr/bin/ruby
>exit(0)
>
>ben% time ruby deleteme.rb
>ruby deleteme.rb 0.00s user 0.00s system 102% cpu 0.006 total
>
>I'm still working on getting it to run in less than 0.004 total.
>
>Ben
>



MiG

3/11/2005 7:57:00 AM


So the solution is maybe to use getc and parse lines on my own...

MiG

On 10/3/2005, "Florian Frank" <flori@nixe.ping.de> wrote:

>MiG wrote:
>
>>test.rb:
>>#!/usr/bin/ruby
>>while gets
>>end
>>
>>$ time ./test.rb < data.dat
>>./test.rb < data.dat 11,51s user 0,31s system 86% cpu 13,598 total
>>
>>
>>
>>
>Well, Ruby assigns the line string to $_, if you use gets that way. So
>Ruby has to construct an object for every line. Perhaps PHP doesn't do that?
>
>--
>Florian Frank
>
>



gabriele renzi

3/11/2005 8:21:00 AM

MiG wrote:
> So the solution is maybe to use getc and parse lines on my own...
>

or maybe use one of the standard methods for iterating over lines, such as
open('file').each do |x| stuff(x) end
this would not set $_ (I don't think it slows down things that much, but
who knows).
Once you have stuff() in place you can re-check if there is a difference.
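
A minimal sketch of that approach, using File.foreach so the file is closed automatically. Here stuff is only a placeholder (it returns the line length), process_file is a hypothetical helper name, and a temporary file stands in for the real data file.

```ruby
require 'tempfile'

# Hypothetical placeholder for the real per-line work.
def stuff(line)
  line.length
end

# Sums the results of stuff() over every line of the file at +path+.
def process_file(path)
  total = 0
  File.foreach(path) { |line| total += stuff(line) }
  total
end

# Demo on a small temporary file standing in for data.dat.
Tempfile.create('lines') do |f|
  f.write("alpha\nbeta\n")
  f.flush
  puts process_file(f.path)
end
```

With real work inside stuff, the timing can be re-checked against the equivalent PHP loop.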

Navindra Umanee

3/11/2005 8:24:00 AM

MiG <mig@1984.cz> wrote:
> So the solution is maybe to use getc and parse lines on my own...

Maybe you're missing the point.

The two programs aren't doing the same amount of work; your benchmarks
aren't equivalent. If you change the PHP benchmark slightly, you'll
likely see PHP is just as slow as Ruby.

[navindra@dot /tmp]$ time php -r 'while (fgets(STDIN));' < FILE
8.421u 2.334s 0:26.53 40.5% 0+0k 0+0io 2pf+0w
[navindra@dot /tmp]$ time ruby -e 'while gets;end' < FILE
11.676u 2.586s 0:39.44 36.1% 0+0k 0+0io 11pf+0w
[navindra@dot /tmp]$ time php -r 'while ($blah=fgets(STDIN));' < FILE
10.680u 2.372s 0:37.83 34.4% 0+0k 0+0io 10pf+0w

Cheers,
Navin.



Tom Willis

3/11/2005 2:06:00 PM

On Fri, 11 Mar 2005 17:23:50 +0900, Navindra Umanee
<navindra@cs.mcgill.ca> wrote:
> MiG <mig@1984.cz> wrote:
> > So the solution is maybe to use getc and parse lines on my own...
>
> Maybe you're missing the point.
>
> The two programs aren't doing the same amount of work; your benchmarks
> aren't equivalent. If you change the PHP benchmark slightly, you'll
> likely see PHP is just as slow as Ruby.
>
> [navindra@dot /tmp]$ time php -r 'while (fgets(STDIN));' < FILE
> 8.421u 2.334s 0:26.53 40.5% 0+0k 0+0io 2pf+0w
> [navindra@dot /tmp]$ time ruby -e 'while gets;end' < FILE
> 11.676u 2.586s 0:39.44 36.1% 0+0k 0+0io 11pf+0w
> [navindra@dot /tmp]$ time php -r 'while ($blah=fgets(STDIN));' < FILE
> 10.680u 2.372s 0:37.83 34.4% 0+0k 0+0io 10pf+0w
>
> Cheers,
> Navin.
>
>

Here are my results on a 14.5 MB file; Ruby wins.

twillis:~$ time ruby -e 'while gets;end'< HL7Audit.csv

real 0m1.481s
user 0m0.924s
sys 0m0.095s
twillis:~$ time php -r 'while($blah=fgets(STDIN));'< HL7Audit.csv

real 0m2.327s
user 0m1.001s
sys 0m0.083s


--
Thomas G. Willis
http://paperbac...


Ben Giddings

3/11/2005 5:12:00 PM

MiG wrote:
> 1. I have NOTHING against Ruby, it is my best language.
> 2. Is it wrong to ask?
> 3. My dumb benchmark: I used real data. If you have 2GB of free RAM and
> use an 80MB file, is that wrong? It's the same as having 1MB of RAM and
> using a smaller file. I used the real data I have, that's all. It behaves
> the same way with smaller files.
> 4. Thank you for the excellent humour.

I'm glad you see the humour. I was a little harsh, but I was having a
bad day, sorry.

Really, the benchmark isn't meaningful. You need to do something
with the data you're reading. It doesn't matter if it's an 80MB file or
a 10 byte file. If you're simply reading the data and discarding it,
you aren't doing anything. For the measurement to be meaningful, you
actually need to *do something*.

Would you expect these two applications to take the same amount of time:

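#!/bin/env ruby

1000.times do
  # do nothing
end

------

#!/bin/env ruby

1000.times do
  # rand with no argument returns a Float in [0.0, 1.0);
  # rand(1.0) is treated like rand(1), which always returns 0
  num = Math.sin(rand)
  if num < 0.0
    num += 1.0
  else
    num -= 1.0
  end
end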


Both programs are essentially equivalent. Neither actually *does*
anything. If the second one ran slower, could you really draw any
conclusions about the speed of Ruby's math operations?

In fact, it may be that Ruby's IO is slower than other languages. If
Ruby were even close to the speed of C I'd be stunned. Ruby has to
construct an object with every line it reads. C just stuffs things
blindly into an array. The problem is that your sample doesn't test
Ruby's IO capabilities. In the end, your sample code does absolutely
nothing.

If you want to benchmark Ruby's IO, try doing something like writing a
program to concatenate a number of files, or even just to copy a file.
Open one file for writing, and then open a file for reading, read
something from the input file, write to the output file.
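
A minimal sketch of such a copy benchmark, under stated assumptions: copy_by_line is a hypothetical helper name, the input is a generated temporary file rather than real data, and the timing comes from the standard Benchmark module. It is one way to set up the measurement, not a definitive test of Ruby's IO.

```ruby
require 'benchmark'
require 'tempfile'

# Copies src_path to dst_path one line at a time.
# Returns the number of lines copied.
def copy_by_line(src_path, dst_path)
  lines = 0
  File.open(dst_path, 'w') do |out|
    File.foreach(src_path) do |line|
      out.write(line)
      lines += 1
    end
  end
  lines
end

# Generated sample input; replace with a real file for a meaningful run.
Tempfile.create('src') do |src|
  src.write("some line of text\n" * 100_000)
  src.flush
  Tempfile.create('dst') do |dst|
    elapsed = Benchmark.realtime { copy_by_line(src.path, dst.path) }
    puts format('copied %d bytes in %.3fs', File.size(dst.path), elapsed)
  end
end
```

Because every line is both read and written, the work cannot be skipped, so the same task in PHP makes a fairer comparison.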

In any case, until the slowness of Ruby's IO proves to be a problem in
actual use, why do you care how it fares on a benchmark?

Ben