comp.lang.ruby

Question: Time efficiency of Array <<

Peter Suk

4/23/2005 12:32:00 AM

Forgive the newbie-ish question. I have been playing around with
Array, and discovered the << operator:

irb(main):001:0> array = [1, 2, 3, 4]
=> [1, 2, 3, 4]
irb(main):002:0> array2 = array << 5
=> [1, 2, 3, 4, 5]
irb(main):003:0> array
=> [1, 2, 3, 4, 5]
irb(main):004:0> array2
=> [1, 2, 3, 4, 5]
irb(main):005:0>

I am curious about the time & space complexity of n << operations to an
array. Is it O(n^2) or is it O(n)? Is there a doubling of allocated
space going on behind the scenes?

--Peter

--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.



19 Answers

Eric Hodel

4/23/2005 12:44:00 AM


On 22 Apr 2005, at 17:31, Peter Suk wrote:

> [...]
> I am curious about the time & space complexity of n << operations to
> an array. Is it O(n^2) or is it O(n)? Is there a doubling of
> allocated space going on behind the scenes?

Take a look at rb_ary_store in array.c...

It looks like an Array grows by half of its current capacity when an
index is larger than the current capacity, but by no less than
ARY_DEFAULT_SIZE (16 elements).
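
A little Ruby sketch of that policy (my reading of the C, not the code
itself; the names are made up) shows why n appends cost only O(n)
element copies overall:

def next_capacity(capa)
  # grow by half the current capacity, but by no less than
  # ARY_DEFAULT_SIZE (16 slots) -- a paraphrase, not array.c itself
  capa + [capa / 2, 16].max
end

capa   = 0
copies = 0
n      = 1_000_000
(1..n).each do |len|
  if len > capa
    copies += capa              # a realloc copies everything stored so far
    capa    = next_capacity(capa)
  end
end
puts "capacity #{capa}, #{copies} copies for #{n} appends"
# copies stays within a small constant factor of n, so amortized O(n)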


--
Eric Hodel - drbrain@segment7.net - http://se...
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04


ES

4/23/2005 2:48:00 AM



On 23/4/2005, "Ara.T.Howard@noaa.gov" <Ara.T.Howard@noaa.gov> wrote:
>On Sat, 23 Apr 2005, Eric Hodel wrote:
>
>> [...]
>>
>> Take a look at rb_ary_store in array.c...
>>
>> It looks like an Array grows by half of its current capacity when an
>> index is larger than the current capacity, but by no less than
>> ARY_DEFAULT_SIZE (16 elements).
>
>so i guess it's O(n), since that's the worst case. note that one can avoid
>the reallocations via
>
> a = Array::new(mb = 2 ** 20)
> a.size.times{|i| a[i] = compute(i)} # compute stands in for real work
>
>unless storing nil is required...
>
>interestingly real-world time with largish values is better for array than a
>container with O(log n) performance:

I think it is a bit of a stretch to call the following test
'real-world' :) That being said, Array does perform well.

Remember that an rb-tree is O(log n) for insertion, deletion
_and_ search, and it can be treated as sparse.

> harp:~ > cat a.rb
> require 'rbtree'
> def time label
>   a = Time::now.to_f
>   fork{ yield } and Process::wait
>   b = Time::now.to_f
>   puts "#{ label } @ #{ b - a }"
> end
>
> n = 2 ** 20
>
> time('rbtree2array') do
>   rbtree = RBTree::new
>   n.times{|i| rbtree[i] = rand}
>   array = rbtree.values
> end
>
> time('array <<') do
>   array = []
>   n.times{|i| array << rand}
> end
>
> harp:~ > ruby a.rb
> rbtree2array @ 8.43025994300842
> array << @ 1.26344704627991

As a sidenote, you can use Benchmark for convenient timing:

require 'benchmark'
Benchmark.bm do |bench|
  # setup
  # ...
  bench.report { test1 }  # each report times its block
  bench.report { test2 }
  # ...
end
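
Applied to the test above, it might look like this (a sketch; assumes
the rbtree gem, and run times will vary):

require 'benchmark'
require 'rbtree'   # assumes the rbtree gem is installed

n = 2 ** 20

Benchmark.bm(12) do |bench|
  bench.report('rbtree2array') do
    rbtree = RBTree.new
    n.times{|i| rbtree[i] = rand}
    rbtree.values
  end
  bench.report('array <<') do
    array = []
    n.times{ array << rand }
  end
end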

>who knew. perhaps function call overhead ends up being responsible for the
>difference.

In your test, it most likely suffers from allocating
memory piecemeal rather than in chunks. Not sure how
effectively one could preallocate for a tree.
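
Ara's preallocation point for Array is easy to poke at, though
(a sketch, untested here):

require 'benchmark'

n = 2 ** 20
Benchmark.bm(10) do |bench|
  bench.report('push')     { a = [];           n.times{|i| a << i  } }
  bench.report('prealloc') { a = Array.new(n); n.times{|i| a[i] = i} }
end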

>-a

E

--
template<typename duck>
void quack(duck& d) { d.quack(); }




William Morgan

4/23/2005 4:03:00 AM


Excerpts from Ara.T.Howard@noaa.gov's mail of 22 Apr 2005 (EDT):
> yes that's true too. i've done tests in the past comparing bdb,
> gperf, and other hashing type look ups against a bsearch and found
> them to be quite a bit slower. on closer analysis it turned out that,
> although hashing is O(1), the call stack is so much deeper for each
> search as compared to a non-recursive bsearch that it predominates - i
> was surprised.

Just goes to show that theoretic bounds on worst case performance are
sometimes just that....
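
If you wanted to poke at those constants from Ruby itself, something
like this sketch would do (Array#bsearch comes from Rubies far newer
than this thread, and it measures Ruby dispatch rather than bdb or
gperf):

require 'benchmark'

n      = 2 ** 20
sorted = Array.new(n){|i| i * 2}             # sorted keys for bsearch
hash   = {}
sorted.each_with_index{|k, i| hash[k] = i}   # same keys in a Hash

Benchmark.bm(8) do |bench|
  bench.report('hash')    { n.times{|i| hash[i * 2]} }
  bench.report('bsearch') { n.times{|i| sorted.bsearch{|x| x >= i * 2}} }
end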

--
William <wmorgan-ruby-talk@masanjin.net>


Peter Suk

4/23/2005 4:03:00 AM



On Apr 22, 2005, at 9:48 PM, Saynatkari wrote:

> I think it is a bit of a stretch to call the following test
> 'real-world' :) That being said, Array does perform well.
>
> Remember that an rb-tree is O(log n) for insertion, deletion
> _and_ search, and it can be treated as sparse.

The key here is "real world." Array's trick of growing its allocation
geometrically (doubling, or the grow-by-half rule Eric described) gives
us an amortized time of O(2n) = O(n) for n appends, which beats n
insertions into a structure like a binary tree at O(n*log n).

One of the handful of truly useful things I learned from Algorithms.

A good summary for the Array trick:

http://www.eli.sdsu.edu/courses/fall96/cs660/notes/amortize.../amortizedAnalysis.html

Some more:

http://www.cs.duke.edu/~mlittman/courses/Archive/cps130-97.../lect10/node14.html


You can think of it as the "reverse Zeno's paradox."

1 + 2 + 4 + 8 + ....
2^0 + 2^1 + 2^2 + 2^3 + ... + 2^(n-1) = 2^n - 1

Or visually:

X
XX
XXXX
XXXXXXXX
XXXXXXXXXXXXXXXX

You can always stack all of the smaller levels into a row the size of
the last level and still have one empty space left over.
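
Summing the copy work under doubling makes that concrete (a sketch):

copies = 0
capa   = 1
n      = 2 ** 20
(1..n).each do |len|
  if len > capa
    copies += capa      # doubling copies everything already stored
    capa   *= 2
  end
end
puts "#{copies} copies for #{n} appends"
# => 1048575 copies: about one copy per element, the O(2n) above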

So, Ruby's Array = Smalltalk's OrderedCollection.

--Peter


--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.



Peter Suk

4/23/2005 4:22:00 AM


On Apr 22, 2005, at 11:02 PM, William Morgan wrote:

> Excerpts from Ara.T.Howard@noaa.gov's mail of 22 Apr 2005 (EDT):
>> [...]
>
> Just goes to show that theoretic bounds on worst case performance are
> sometimes just that....

It's actually that the upper bounds are too loose in the naive
analysis. Amortized worst-case bounds on the doubling re-allocation
are O(n) for n insertions, or the same O(1) per insertion as hashing.
Then, it should become clear that the constant is bigger for hashing.
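
One way to see the amortized O(1) per insertion is that the average
cost of << stays flat as n grows (a sketch; absolute timings will
vary by machine):

require 'benchmark'

[2 ** 16, 2 ** 18, 2 ** 20].each do |n|
  t = Benchmark.realtime{ a = []; n.times{|i| a << i} }
  puts "n = #{n}: #{(t / n * 1e9).round} ns per <<"
end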

--Peter

--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.



William Morgan

4/23/2005 1:04:00 PM


Excerpts from Peter Suk's mail of 23 Apr 2005 (EDT):
> >Excerpts from Ara.T.Howard@noaa.gov's mail of 22 Apr 2005 (EDT):
> >>[...]
> >
> >Just goes to show that theoretic bounds on worst case performance are
> >sometimes just that....
>
> It's actually that the upper bounds are too loose in the naive
> analysis. Amortized worst-case bounds on the doubling re-allocation
> are O(n) for n insertions, or the same O(1) per insertion as hashing.
> Then, it should become clear that the constant is bigger for hashing.

In this case he's talking about lookups and call stack depth.

--
William <wmorgan-ruby-talk@masanjin.net>


Paulo Sérgio

4/23/2005 7:14:00 PM


Hi all,

My question is very "simple": how do I define an abstract class (or
method) in Ruby?

--
Paulo S Medeiros - B.Sc Student - DCC/COPPE/UFRJ - Brazil



Christoph R.

4/23/2005 7:27:00 PM


Paulo Sérgio schrieb:

> Hi all,
>
> My question is very "simple" ... how to define an abstract class (or
> method) in ruby?
>
How about:

a) just define a module

b)

class MyAbstractClass
  private_class_method :new
end

or c)

class MyAbstractClass
  def self.new
    raise TypeError, "I am an abstract class"
  end
end
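
For option b), a quick check that concrete subclasses stay usable
(a sketch; Concrete is a hypothetical subclass):

class Concrete < MyAbstractClass     # hypothetical subclass
  public_class_method :new           # re-expose new in the subclass
end

MyAbstractClass.new rescue puts $!.class  # NoMethodError: new is private
p Concrete.new                            # instantiates normally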


/Christoph