Asp Forum - Help me understand why the Ruby block is slower than without

Alan Burch

3/10/2006 10:58:00 PM

I just wrote my first Ruby script. I'm an experienced C and Perl
programmer, so please, if it looks too much like those languages and not
like Ruby, let me know. I've got a 100K word list (Linux dictionary) on
my Mac and am opening it, then looking for any words that are exactly 10
letters long with no letters repeating (so 'profligate\n', with
length == 11, is a match). After I wrote my first version I did some
playing. I first saw that the array class mixed in Enumerable and that I
could use the to_a call from there, but a quick check using -r profile
showed that my original call to split was a much quicker way to convert
from a string to an array. I then tried putting the File.open in a block
and found that this was much slower, even if I subtract out the time for
the open, which I assume is an error in how the profile is counting
total time.

Here's the faster version:

f = File.open("./words")
begin
  while f.gets
    if $_.length == 11
      ar = $_.split(//)
      if ar.uniq! == nil
        print "#{ar.to_s}"
      end
    end
  end
rescue EOFError
  f.close
end

And here's the slower block version:

File.open("./words") { |f|
  while f.gets
    if $_.length == 11
      ar = $_.split(//)
      if ar.uniq! == nil
        print "#{ar.to_s}"
      end
    end
  end
}

Again, the words file is just a list of about 100K unique words from the
dict command or similar on *nix....

Any critique welcome and enlightenment is encouraged.
Thanks!

--
Posted via http://www.ruby-....

34 Answers

William James

3/10/2006 11:27:00 PM

Alan Burch wrote:
> I just wrote my first Ruby script. I'm an experienced C and perl
> programmer, so please, if it looks too much like these languages and not
> Ruby, let me know. I've got a 100K word list (Linux dictionary) on my
> Mac and am opening it then looking for any words that are exactly 10
> letters long with no letters repeating ('profligate\n') == 11 is a
> match. After I wrote my first version I did some playing. I first saw
> that the array class mixed in enumerable and that I could use the to_a
> call from there, but a quick check using -r profile showed that my
> original call to split was a much quicker way to convert from a string
> to an array. I then tried putting the File.open in a block and found
> that this was much slower, even if I subtract out the time for the open,
> which I assume is an error in how the profile is counting total time.
>
> Here's the faster version:
>
> f = File.open("./words")
> begin
> while f.gets
> if $_.length == 11
> ar = $_.split(//)
> if ar.uniq! == nil
> print "#{ar.to_s}"
> end
> end
> end
> rescue EOFError
> f.close
> end
>
> And here's the slower block version:
>
> File.open("./words") { |f|
> while f.gets
> if $_.length == 11
> ar = $_.split(//)
> if ar.uniq! == nil
> print "#{ar.to_s}"
> end
> end
> end
> }
>
> Again, the words file is just a list of about 100K unique words from the
> dict command or similar on *nix....
>
> Any critique welcome and enlightenment is encouraged.
> Thanks!

File.open("wordlist") { |f|
  while w = f.gets
    puts w if w.size==11 && w.split(//).uniq.size == 11
  end
}

Alan Burch

3/10/2006 11:39:00 PM

> File.open("wordlist") { |f|
> while w = f.gets
> puts w if w.size==11 && w.split(//).uniq.size == 11
> end
> }

Ok, factor of 10 faster, and more Ruby like, much and many Thanks!
Others, any comments on the block slow down?
AB

--
Posted via http://www.ruby-....

Alan Burch

3/11/2006 12:09:00 AM

>
> Ok, factor of 10 faster, and more Ruby like, much and many Thanks!
> Others, any comments on the block slow down?
> AB

I mis-spoke. Not a factor of 10 faster, just marginally. I had
"wordlist" in my directory as a list of the unique 10 letter words.
I do like the code better still, but without the block, it's still much
faster. Also using uniq! rather than size is quicker than taking the
size twice.

My current fastest script:

f = File.open("./words")
begin
  while w = f.gets
    puts w if w.size == 11 && w.split(//).uniq! == nil
  end
rescue EOFError
  f.close
end

Not measurably faster than the first one, but seems better and more Ruby
like to me.

--
Posted via http://www.ruby-....
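
The uniq! test in the post above relies on uniq! returning nil when it
finds nothing to remove. A minimal sketch of that behaviour, assuming a
current Ruby; the helper name no_repeats? is hypothetical and not from
the thread:

```ruby
# uniq returns a new array every time; uniq! edits the array in place
# and returns nil when there was no duplicate to remove.
def no_repeats?(word)
  word.split(//).uniq!.nil?
end

p "profligate".split(//).uniq!   # nil, every letter was already unique
p "bookkeeper".split(//).uniq!   # the shortened array, so not nil
p no_repeats?("profligate")
p no_repeats?("bookkeeper")
```

Because the nil return doubles as the "no duplicates" signal, the check
needs only one pass and no second call to size.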

Mark Devlin

3/11/2006 12:59:00 AM

Alan Burch wrote:

> I mis-spoke. Not a factor of 10 faster, just marginally. I had
> "wordlist" in my directory as a list of the unique 10 letter words.
> I do like the code better still, but with out the block, it's still much
> faster. Also using uniq! rather than size is quicker than taking the
> size twice.

Solely for my own amusement, since I'm still trying to teach myself Ruby...

File.open("./words").read.split.collect! {|x| x if x.length == 10 &&
x.split(//).uniq! == nil}.compact!.each {|x| puts x }

--
Posted via http://www.ruby-....

George Ogata

3/11/2006 1:27:00 AM

Alan Burch <orotone@gmail.com> writes:

>> File.open("wordlist") { |f|
>> while w = f.gets
>> puts w if w.size==11 && w.split(//).uniq.size == 11
>> end
>> }
>
> Ok, factor of 10 faster, and more Ruby like, much and many Thanks!
> Others, any comments on the block slow down?

I don't see much of a slowdown.

----------------------------------------------------------------------

g@crash:~/tmp$ cat read-slow.rb
File.open("./words") { |f|
  while f.gets
    if $_.length == 11
      ar = $_.split(//)
      if ar.uniq! == nil
        print "#{ar.to_s}"
      end
    end
  end
}
g@crash:~/tmp$ /usr/bin/time ruby read-slow.rb > out-slow
2.56user 0.01system 0:02.64elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+550minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-slow.rb > out-slow
2.55user 0.01system 0:02.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+550minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-slow.rb > out-slow
2.54user 0.01system 0:02.56elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+550minor)pagefaults 0swaps
g@crash:~/tmp$ cat read-fast.rb
f = File.open("./words")
begin
  while f.gets
    if $_.length == 11
      ar = $_.split(//)
      if ar.uniq! == nil
        print "#{ar.to_s}"
      end
    end
  end
rescue EOFError
  f.close
end
g@crash:~/tmp$ /usr/bin/time ruby read-fast.rb > out-fast
2.51user 0.01system 0:02.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+544minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-fast.rb > out-fast
2.50user 0.01system 0:02.56elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+544minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-fast.rb > out-fast
2.51user 0.01system 0:02.53elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+544minor)pagefaults 0swaps

----------------------------------------------------------------------

There's a bit of a slowdown, but note that in your "fast" algo, the
stream is never closed, since IO#gets never throws EOFError; it returns
nil at end of file, so the rescue clause never runs. Do `ri IO#gets'
for the method's documentation. :-)
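
A minimal sketch of the point about IO#gets, assuming a throwaway temp
file in place of ./words. The helper names are hypothetical and not from
the thread; the sketch only shows that gets returns nil at end of file
and that ensure (or the block form of File.open) is what closes the
handle:

```ruby
require "tempfile"

# Non-block form. IO#gets returns nil at end of file instead of raising
# EOFError, so the loop ends on its own and ensure closes the handle.
def count_lines_with_ensure(path)
  f = File.open(path)
  count = 0
  extra = nil
  begin
    count += 1 while f.gets
    extra = f.gets            # still nil at EOF, no exception raised
  ensure
    f.close                   # runs on normal exit and on exceptions
  end
  [count, extra, f.closed?]
end

# Block form. File.open closes the handle when the block returns.
def count_lines_with_block(path)
  handle = nil
  count = 0
  File.open(path) do |io|
    handle = io
    count += 1 while io.gets
  end
  [count, handle.closed?]
end

# Hypothetical sample data standing in for the real word list.
Tempfile.create("words") do |tmp|
  tmp.write("profligate\nbookkeeper\n")
  tmp.flush
  p count_lines_with_ensure(tmp.path)
  p count_lines_with_block(tmp.path)
end
```

Swapping rescue EOFError for ensure keeps the non-block version but
makes the close actually happen.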

Another speedup: replace:

w.split(//).uniq.size == 11

with:

w !~ /(.).*\1/

It's faster since there's less intermediate diddlage, but
theoretically it shouldn't scale as well. You'd have to increase your
"11" quite a lot to notice it though I think.

More shell dump.

----------------------------------------------------------------------

g@crash:~/tmp$ cat read-one.rb
File.open("words") { |f|
  while w = f.gets
    puts w if w.size==11 && w.split(//).uniq.size == 11
  end
}
g@crash:~/tmp$ /usr/bin/time ruby read-one.rb > out-one
2.54user 0.02system 0:02.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+548minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-one.rb > out-one
2.54user 0.01system 0:02.56elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+548minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-one.rb > out-one
2.55user 0.01system 0:02.58elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+548minor)pagefaults 0swaps
g@crash:~/tmp$ cat read-two.rb
File.open("words") { |f|
  while w = f.gets
    puts w if w.size==11 && w !~ /(.).*\1/
  end
}
g@crash:~/tmp$ /usr/bin/time ruby read-two.rb > out-two
1.23user 0.01system 0:01.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+713minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-two.rb > out-two
1.27user 0.01system 0:01.29elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+713minor)pagefaults 0swaps
g@crash:~/tmp$ /usr/bin/time ruby read-two.rb > out-two
1.27user 0.02system 0:01.30elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+713minor)pagefaults 0swaps
g@crash:~/tmp$
g@crash:~/tmp$
g@crash:~/tmp$ diff out-one out-two
g@crash:~/tmp$

----------------------------------------------------------------------
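
The two duplicate-letter checks compared in the post above can be put
side by side. This is a sketch with hypothetical method names and a
made-up sample list, assuming each line holds one word and ends in a
single newline, as in the thread:

```ruby
# Check 1: split into characters and compare the number of distinct
# characters with the total.
def unique_by_split?(line)
  chars = line.split(//)
  chars.uniq.size == chars.size
end

# Check 2: the regex from the post. (.) captures one character, .* skips
# ahead, and \1 requires the captured character to appear again, so a
# match means some character repeats. "." never matches the trailing
# newline, so the newline cannot count as a repeat.
def unique_by_regex?(line)
  line !~ /(.).*\1/
end

samples = ["profligate\n", "bookkeeper\n", "background\n", "abcdefghia\n"]
samples.each do |w|
  puts "#{w.chomp}: split=#{unique_by_split?(w)} regex=#{unique_by_regex?(w)}"
end
```

The regex avoids building an array per word, which fits the timings in
the shell dump; its backtracking is why it could scale worse on much
longer lines.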

Alan Burch

3/11/2006 1:33:00 AM

Mark Devlin wrote:

>
> Solely for my own amusement, since I'm still trying teach myself Ruby...
>
> File.open("./words").read.split.collect! {|x| x if x.length == 10 &&
> x.split(//).uniq! == nil}.compact!.each {|x| puts x }

Mark:
Thanks for doing it this way. I had thought about trying to read it in
and split it up, but didn't know how to do it as I've only read a bit of
the Pickaxe book. On my two-processor G5 with 4 GB of memory, your
version is about 50% slower than the fastest method above: 18.69 vs
12.52 seconds. I intend to look a bit closer at your code and see if I
can see another way to speed it up.

Gary:
Thanks for your input also. I saw the redundancy when William James
gave me input, but I really don't fully understand arrays vs strings in
Ruby yet, nor the differences between print, puts and other types of
output. I'll read through the Pickaxe a bit more right now.

Others:
Any input as to why it runs slower inside the file block? Have I
overlooked something simple?

--
Posted via http://www.ruby-....
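
On the print vs puts question raised above, a small sketch may help. It
assumes Ruby 1.9 or later and writes to an in-memory StringIO so the
exact bytes are visible; the helper name captured is hypothetical:

```ruby
require "stringio"

# Runs the block against an in-memory IO and returns what was written.
def captured
  out = StringIO.new
  yield out
  out.string
end

p captured { |o| o.print("abc") }        # "abc", print adds nothing
p captured { |o| o.puts("abc") }         # "abc\n", puts adds a newline
p captured { |o| o.puts("abc\n") }       # "abc\n", no second newline
p captured { |o| o.puts(["a", "b"]) }    # "a\nb\n", one element per line
p captured { |o| o.print(["a", "b"]) }   # "[\"a\", \"b\"]", via Array#to_s
```

The last line matters for the scripts in this thread: in the Ruby 1.8
of the time, Array#to_s joined the elements, so print "#{ar.to_s}"
printed the word itself. On 1.9 and later it prints the inspected
array, and ar.join would be needed to get the word back.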

dblack

3/11/2006 2:38:00 AM

Alan Burch

3/11/2006 3:11:00 AM

George Ogata wrote:

> Another speedup: replace:
>
> w.split(//).uniq.size == 11
>
> with:
>
> w !~ /(.).*\1/
>
> It's faster since there's less intermediate diddlage, but
> theoretically it shouldn't scale as well. You'd have to increase your
> "11" quite a lot to notice it though I think.
>

George:
Much thanks, I think that you've proved what I suspected, that Ruby is
counting the time wrong with the profile (ruby -r profile script.rb) as
when I subtract the profile time for the File.open block it's only a bit
slower than the faster call. I appreciate all the help and will try to
ask a more difficult question next time.
I've always been fairly strong with regexes, but I'd have never thought
to use one here. Thanks for that as well.

David:
Thanks for chiming in, I'll check out your links as well.

Alan

--
Posted via http://www.ruby-....
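
One way to cross-check the profiler's figures is to time both forms of
File.open directly with a monotonic clock. The sketch below is only an
illustration under stated assumptions: the method names, the generated
sample file and its size are all hypothetical, and the absolute numbers
will differ from machine to machine.

```ruby
require "tempfile"

# The thread's filter, with the file opened without a block.
def count_without_block(path)
  f = File.open(path)
  count = 0
  begin
    while (w = f.gets)
      count += 1 if w.size == 11 && w !~ /(.).*\1/
    end
  ensure
    f.close
  end
  count
end

# The same filter, with the file opened in block form.
def count_with_block(path)
  count = 0
  File.open(path) do |f|
    while (w = f.gets)
      count += 1 if w.size == 11 && w !~ /(.).*\1/
    end
  end
  count
end

# Wall-clock seconds taken by the given block.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical stand-in for ./words: three words repeated many times.
Tempfile.create("words") do |tmp|
  20_000.times { tmp.write("profligate\nbookkeeper\ncat\n") }
  tmp.flush
  plain = elapsed { count_without_block(tmp.path) }
  block = elapsed { count_with_block(tmp.path) }
  puts format("without block: %.4fs  with block: %.4fs", plain, block)
end
```

Running each form several times, as in George's /usr/bin/time runs,
gives a fairer picture than a single measurement.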

Bill Kelly

3/11/2006 4:41:00 AM

Hi,

From: "Mark Devlin" <OnlyMostlyDead@gmail.com>
>
> Solely for my own amusement, since I'm still trying teach myself Ruby...
>
> File.open("./words").read.split.collect! {|x| x if x.length == 10 &&
> x.split(//).uniq! == nil}.compact!.each {|x| puts x }

One detail here is the file handle is not being closed. A few alternatives
that close the file:

# open with block
File.open("./words"){|f| f.read.split.collect! {|x| x if x.length == 10 && x.split(//).uniq! == nil}.compact.each {|x| puts x } }

# File.read method
File.read("./words").split.collect! {|x| x if x.length == 10 && x.split(//).uniq! == nil}.compact.each {|x| puts x }

# IO.readlines method
IO.readlines("./words").collect! {|x| x if x.length == 11 && x.split(//).uniq! == nil}.compact.each {|x| puts x }

Note: I used length 11 because readlines keeps linefeeds. I also changed
all of them to the non-bang form of compact, as compact! returns nil if it
didn't do any work. (I.e. if all words in the input satisfied the criteria,
compact! would have returned nil, and we'd have gotten a NoMethodError:
undefined method `each' for nil:NilClass.)
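
A short sketch of the compact vs compact! difference described above,
using a made-up two-word list in which every word passes the filter.
The method unique_letter_words is hypothetical and shows select as one
alternative that never produces nils in the first place:

```ruby
words = ["profligate", "background"]

# Every word passes, so collect leaves no nils behind.
kept = words.collect { |w| w if w.split(//).uniq.size == w.size }

p kept.compact          # a new array, always safe to chain on
p kept.dup.compact!     # nil, because there was nothing to remove

# select states the filter directly, so no compact step is needed.
def unique_letter_words(list)
  list.select { |w| w.split(//).uniq.size == w.size }
end

unique_letter_words(words).each { |w| puts w }
```

Chaining .each onto the nil from compact! is what would raise the
NoMethodError mentioned in the note.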

Regards,

Bill

Stephen Waits

3/11/2006 5:04:00 AM

On Mar 10, 2006, at 4:08 PM, Alan Burch wrote:

> My current fastest script:
>
> f= File.open("./words")
> begin
> while w = f.gets
> puts w if w.size == 11 && w.split(//).uniq! == nil
> end
> rescue EOFError
> f.close
> end
>
> Not measurably faster than the first one, but seems better and more
> Ruby
> like to me.

I'm curious why you see it so? Personally, seems less Ruby-like to me.

--Steve

comp.lang.ruby
