Robert Klemme
6/12/2007 11:12:00 AM
On 12.06.2007 11:52, Michal Suchanek wrote:
> On 11/06/07, Robert Klemme <shortcutter@googlemail.com> wrote:
>> On 11.06.2007 07:41, Anthony Martinez wrote:
>> > So, I was tinkering with ways to build a hash out of transforming an
>> > array, knowing the standard/idiomatic
>> >
>> > id_list = [:test1, :test2, :test3]
>> > id_list.inject({}) { |a,e| a[e]=e.object_id ; a }
>> >
>> > I also decided to try something like this:
>> >
>> > Hash[ *id_list.collect { |e| [e,e.object_id]}.flatten]
>> >
>> > and further (attempt to) optimize it via
>> > Hash[ *id_list.collect { |e| [e,e.object_id]}.flatten!]
>> > and
>> > Hash[ *id_list.collect! { |e| [e,e.object_id]}.flatten!]
>> >
>> > Running this via Benchmark#bmbm gives pretty interesting, and to me,
>> > unexpected, results (on a 3.2 GHz P4, 1GB of RAM, FC5 with ruby 1.8.4)
>> >
>> > require 'benchmark'
>> > id_list = (1..1_000_000).to_a
>> > Benchmark::bmbm do |x|
>> > x.report("inject") { id_list.inject({}) { |a,e| a[e] =
>> e.object_id ; a} }
>> > x.report("non-bang") { Hash[ *id_list.collect { |e|
>> [e,e.object_id]}.flatten] }
>> > x.report("bang") { Hash[ *id_list.collect { |e|
>> [e,e.object_id]}.flatten!] }
>> > x.report("two-bang") { Hash[ *id_list.collect! { |e|
>> [e,e.object_id]}.flatten!] }
>> > end
>> >
>> > Rehearsal --------------------------------------------
>> > inject 16.083333 0.033333 16.116667 ( 9.670747)
>> > non-bang 1657.050000 1.800000 1658.850000 (995.425642)
>> > bang 1593.716667 0.016667 1593.733333 (956.334565)
>> > two-bang 1604.816667 1.350000 1606.166667 (963.803356)
>> > -------------------------------- total: 4874.866667sec
>> >
>> > user system total real
>> > inject 5.183333 0.000000 5.183333 ( 3.102379)
>> > non-bang zsh: segmentation fault ruby
>> >
>> > Ow?
>> >
>> > Also, I just thought of a similar way to accomplish the same thing:
>> >
>> > x.report("zip") { Hash[ *id_list.zip(id_list.collect {|e|
>> e.object_id})] }
>> >
>> > Array#collect! won't work right with this, of course, but it seems to
>> > have equally-bad performance. Is Array#inject just optimized for this,
>> > or something?
>>
>> The reason why you are seeing this (performance as well as timing) is
>> most likely caused by the different approach. When you use #inject you
>> just create one copy of the Array (the Hash). When you use #collect you
>> create at least one additional copy of the large array plus a ton of two
>> element arrays. That's way less efficient considering memory usage and
>> GC. You'll probably see much different results if your input array is
>> much shorter (try with 10 or 100 elements).
>>
>
> I also get a segfault (with 1.8.5 on Gentoo):
>
> ruby -w test.rb
> Rehearsal --------------------------------------------
> inject 7.210000 0.870000 8.080000 ( 13.011414)
> non-bang 4620.790000 179.730000 4800.520000 (6029.976497)
> bang 4586.140000 200.530000 4786.670000 (5970.267190)
> two-bang 4599.560000 268.080000 4867.640000 (6035.687979)
> ------------------------------- total: 14462.910000sec
>
> user system total real
> inject 10.740000 1.990000 12.730000 ( 15.500874)
> non-bang Segmentation fault
> hramrach@hp-tc2110:11(0) 06120053 16]~ $ ruby -v
> ruby 1.8.5 (2006-12-04 patchlevel 2) [i686-linux]
>
> Of course, it would be better to test with 1.8.6 but it takes quite
> some time to retest ;-)
>
> Thanks
>
> Michal
I believe it's the star operator:
13:10:22 [Temp]: ruby -e 'Hash[*Array.new(ARGV.shift.to_i) ]' 1000000
13:10:35 [Temp]: ruby -e 'Hash[*Array.new(ARGV.shift.to_i) ]' 10000000
-e:1: [BUG] Segmentation fault
ruby 1.8.6 (2007-03-13) [i386-cygwin]
Aborted (core dumped)
13:10:39 [Temp]: ruby -e 'def f(*a) end; f(*Array.new(ARGV.shift.to_i))'
1000000
13:11:09 [Temp]: ruby -e 'def f(*a) end; f(*Array.new(ARGV.shift.to_i))'
10000000
-e:1: [BUG] Segmentation fault
ruby 1.8.6 (2007-03-13) [i386-cygwin]
Aborted (core dumped)
13:11:14 [Temp]: uname -a
CYGWIN_NT-5.2-WOW64 PDBXPWSRK38 1.5.24(0.156/4/2) 2007-01-31 10:57 i686
Cygwin
13:11:36 [Temp]: ruby --version
ruby 1.8.6 (2007-03-13 patchlevel 0) [i386-cygwin]
Kind regards
robert