comp.lang.ruby

map and join or inject?

konsu

8/26/2006 6:15:00 AM

hello,

given an array of strings A, i need to map a given function F to each
element of A and concatenate the results.

which way is more memory efficient:

1. A.inject("") { |t,x| t + F(x) }
2. (A.map { |x| F(x) }).join

thanks
konstantin

4 Answers

Erik Veenstra

8/26/2006 6:31:00 AM

----------------------------------------------------------------

$ vi test.rb ; cat test.rb
GC.disable

1000.times do
(1..1000).map{|s| s.to_s}.join("")
#(1..1000).inject(""){|r, s| s.to_s; r}
end

count = 0
ObjectSpace.each_object do
count += 1
end
p count

$ ruby test.rb
1003392

----------------------------------------------------------------

$ vi test.rb ; cat test.rb
GC.disable

1000.times do
#(1..1000).map{|s| s.to_s}.join("")
(1..1000).inject(""){|r, s| s.to_s; r}
end

count = 0
ObjectSpace.each_object do
count += 1
end
p count

$ ruby test.rb
2001392

----------------------------------------------------------------

gegroet,
Erik V. - http://www.erikve...
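A note on the listing above: the inject line evaluates s.to_s but returns r unchanged, so it never concatenates anything. The sketch below is one way to rerun the comparison with real concatenation. It assumes CRuby 2.2 or later, where GC.stat(:total_allocated_objects) is a cumulative counter, so the GC can stay enabled. The helper name `allocations` and the use of `upcase` as a stand-in for F are illustrative assumptions, not part of the original test.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# GC.stat(:total_allocated_objects) only ever increases, so the difference
# is unaffected by garbage collection happening in between.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Input is built up front so its strings are not counted.
strings = (1..1000).map { |n| n.to_s }

# upcase stands in for F; it returns a new String on every call.
with_join   = allocations { strings.map { |s| s.upcase }.join }
with_plus   = allocations { strings.inject("") { |t, s| t + s.upcase } }
with_append = allocations { strings.inject(String.new) { |t, s| t << s.upcase } }

puts "map + join:     #{with_join}"
puts "inject with +:  #{with_plus}"
puts "inject with <<: #{with_append}"
```

Exact counts depend on the Ruby version, so none are shown here. Object counts also say nothing about the size of each object, which matters for the memory question.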


konsu

8/26/2006 6:03:00 PM

thanks, i did not know that one can count objects using ObjectSpace.

it looks like map+join uses fewer objects than inject, even though map
creates a whole new array. i would have expected inject to use fewer
objects. strange...

konstantin


brent.rowland@gmail.com

8/27/2006 5:45:00 AM

ako... wrote:
> thanks, i did not know that one can count objects using ObjectSpace.
>
> it looks like map+join use fewer objects than inject. even though map
> creates a whole new array. i would expect inject to use fewer objects.
> strange...
>
> konstantin

It's true that map appears to create fewer objects, which may have some
advantages, but the situation isn't entirely one-sided. Now consider
the case where the result of each function consumes a whole megabyte of
memory. Using map and join, you get 1000 one-megabyte strings in an
array, so none of them can be freed. You're already a gig into your
memory before you join them, and the joined result takes another gig.

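If, as in the example, you have garbage collection turned off, nothing
can be freed in either scenario: map and join will be 2 gigs (plus
overhead) into your memory+swap, and inject with + will be further in
still, since every intermediate string stays alive. However, with the
garbage collector enabled and using inject, the result of each
iteration could be freed once it has been appended, keeping your
maximum memory demand lower.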

Of course, the complete answer would have to factor in the
implementations of the string concatenations and of the join method,
the number of memory reallocations needed by each, and the degree of
heap fragmentation incurred.

Personally, I would use some kind of memory or file stream with
inject--so that not only can the intermediate result objects be freed
but also so memory reallocations can be minimized.
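A minimal sketch of the memory-stream idea, assuming StringIO from the standard library as the stream. The method `f` and the sample array `a` are hypothetical stand-ins for F and A from the original question.

```ruby
require "stringio"

# Hypothetical stand-in for F from the original question.
def f(x)
  x.upcase
end

a = %w[foo bar baz]

# StringIO#<< writes its argument and returns the StringIO itself,
# so the stream can be threaded through inject as the accumulator.
buffer = a.inject(StringIO.new) do |io, x|
  io << f(x)
end

result = buffer.string
puts result
```

Each value returned by f is no longer referenced once it has been written, so the garbage collector is free to reclaim it. Any object that responds to << and returns itself, such as an open File, could take the place of the StringIO in the same loop.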

In the end, Erik's approach, using metrics rather than speculation,
will be most effective.

Brent Rowland

James Gray

8/27/2006 4:14:00 PM

On Aug 26, 2006, at 10:05 PM, ako... wrote:

> it looks like map+join use fewer objects than inject. even though map
> creates a whole new array. i would expect inject to use fewer objects.
> strange...

It has to do with the block you gave inject() and the definition of
String#+. Watch:

>> ("a".."j").inject(String.new) do |res, let|
?> new_str = res + let
>> puts "#{new_str.object_id}: #{new_str.inspect}"
>> new_str
>> end
1669064: "a"
1668834: "ab"
1668624: "abc"
1668424: "abcd"
1668304: "abcde"
1668144: "abcdef"
1668064: "abcdefg"
1667854: "abcdefgh"
1667764: "abcdefghi"
1667534: "abcdefghij"
=> "abcdefghij"

Each of those intermediate Strings is a new object, which is right.
That's what String#+ does:

$ ri -T String#+
--------------------------------------------------------------- String#+
str + other_str => new_str
------------------------------------------------------------------------
Concatenation---Returns a new String containing other_str
concatenated to str.

"Hello from " + self.to_s #=> "Hello from main"

The keywords there being "new String." String does have an append
method though:

$ ri -T 'String#<<'
-------------------------------------------------------------- String#<<
str << fixnum => str
str.concat(fixnum) => str
str << obj => str
str.concat(obj) => str
------------------------------------------------------------------------
Append---Concatenates the given object to str. If the object is a
Fixnum between 0 and 255, it is converted to a character before
concatenation.

a = "hello "
a << "world" #=> "hello world"
a.concat(33) #=> "hello world!"

And if we use that, we can get down to fewer objects than map(),
because we don't need the Array:

>> ("a".."j").inject(String.new) do |res, let|
?> same_str = res << let
>> puts "#{same_str.object_id}: #{same_str.inspect}"
>> same_str
>> end
938158: "a"
938158: "ab"
938158: "abc"
938158: "abcd"
938158: "abcde"
938158: "abcdef"
938158: "abcdefg"
938158: "abcdefgh"
938158: "abcdefghi"
938158: "abcdefghij"
=> "abcdefghij"

Hope that helps.

James Edward Gray II
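
Applied to the original question, the append approach might look like the sketch below. The method `f` and the array `a` are hypothetical stand-ins for F and A. String.new is used for the accumulator instead of a "" literal so the code still works when string literals are frozen. The each_with_object variant is an addition that was not discussed in the thread and needs Ruby 1.9 or later.

```ruby
# Hypothetical stand-in for F from the original question.
def f(x)
  x.reverse
end

a = %w[abc def ghi]

# Option 2 from the question: build an intermediate Array, then join it.
joined = a.map { |x| f(x) }.join

# inject with <<: one accumulator String, mutated in place. The block
# must return the accumulator, which << does.
appended = a.inject(String.new) { |t, x| t << f(x) }

# each_with_object passes the same accumulator to every iteration,
# so the block's return value does not matter.
collected = a.each_with_object(String.new) { |x, t| t << f(x) }

puts joined, appended, collected
```

All three produce the same string. They differ only in which intermediate objects they create along the way.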