[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

hash key performance

Matthias Wächter

10/2/2007 10:24:00 AM

I am just wondering why different hash keys result in such a
significantly different performance.

I have written 11 simple test cases, all using the following template:

num=ARGV[0].to_i

var=:key

a={:key => "test"}

num.times {
b=a[:key][0]
}

Beside this test_symbol.rb variant, I replace :key in the example
above with false (test_false.rb), an integer value 1 (test_int.rb),
with nil (test_nil.rb), with "id" (test_string.rb). In six other
variants I replace the key with var, and set var to a symbol
(test_var_symbol.rb), false (test_var_false.rb), an integer value
(test_var_int.rb), nil (test_var_nil.rb), "id" (test_var_string.rb),
and "id".frozen (test_var_frozen.rb).


This is my performance chart using time (dual-core CPU, called with
num=30_000_000). I ran it once (dropped timing values), then three
consecutive times and added the times (which is the sort key as well):

test_int.rb 37.029
test_var_int.rb 37.659
test_symbol.rb 37.773
test_var_symbol.rb 37.785
test_var_frozen.rb 37.852
test_var_string.rb 40.515
test_var_false.rb 45.468
test_false.rb 45.687
test_var_nil.rb 45.894
test_nil.rb 46.068
test_string.rb 53.032


No surprise is that repeated lookup of a string as a key results in
the worst performance, and using a Fixnum gives best performance.

More interesting is a bulk of tests that relate to using symbols and
variables for int, symbol and frozen string, all within timing
accuracy. A noticable, but still small gap can be seen for a
nonfrozen string in a variable.

But here it comes: Using false and, even worse, nil (in either
literal or variable form) is way slower (a good 20%) than using
integers and symbols, even if they follow the same "immutable
object" philosophy of fixed object_ids.

Why is that? What's so special about false and nil to have such a
degraded performance as a hash key?

- Matthias

3 Answers

Marcin Mielzynski

10/2/2007 2:50:00 PM

0

Matthias Wächter wrote:

> But here it comes: Using false and, even worse, nil (in either
> literal or variable form) is way slower (a good 20%) than using
> integers and symbols, even if they follow the same "immutable
> object" philosophy of fixed object_ids.
>
> Why is that? What's so special about false and nil to have such a
> degraded performance as a hash key?
>

hash.c:rb_any_hash is the answer to your question (funcall('hash') is
discarded only for Fixnums, Symbols and Strings. Note that String
numbers would be way better if it's hash was cached until first
destructive operation.

lopex

Matthias Wächter

10/3/2007 6:46:00 AM

0

On 02.10.2007 16:55, Marcin MielżyÅ?ski wrote:
> Matthias Wächter wrote:
>
>> But here it comes: Using false and, even worse, nil (in either
>> literal or variable form) is way slower (a good 20%) than using
>> integers and symbols, even if they follow the same "immutable
>> object" philosophy of fixed object_ids.
>>
>> Why is that? What's so special about false and nil to have such a
>> degraded performance as a hash key?
>>
>
> hash.c:rb_any_hash is the answer to your question (funcall('hash') is
> discarded only for Fixnums, Symbols and Strings.

So the first two cases of "switch (TYPE(a))" should be enhanced with
checks for nil, true and false like this:

switch (TYPE(a)) {
case T_FIXNUM:
case T_SYMBOL:
+ case T_NIL:
+ case T_FALSE:
+ case T_TRUE:
return (int)a;
break;


Objections?

- Matthias

Ryan Davis

10/3/2007 8:19:00 PM

0


On Oct 2, 2007, at 23:46 , Matthias Wächter wrote:

> On 02.10.2007 16:55, Marcin Mielzynski wrote:
>> Matthias Wächter wrote:
>>
>>> But here it comes: Using false and, even worse, nil (in either
>>> literal or variable form) is way slower (a good 20%) than using
>>> integers and symbols, even if they follow the same "immutable
>>> object" philosophy of fixed object_ids.
>>>
>>> Why is that? What's so special about false and nil to have such a
>>> degraded performance as a hash key?
>>>
>>
>> hash.c:rb_any_hash is the answer to your question (funcall('hash') is
>> discarded only for Fixnums, Symbols and Strings.
>
> So the first two cases of "switch (TYPE(a))" should be enhanced with
> checks for nil, true and false like this:
>
> switch (TYPE(a)) {
> case T_FIXNUM:
> case T_SYMBOL:
> + case T_NIL:
> + case T_FALSE:
> + case T_TRUE:
> return (int)a;
> break;

Send that and your original timings to ruby-core@ and/or file a bug
on rubyforge in the ruby project.