[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.python

Dictionary Keys question

FireNWater

1/30/2008 10:48:00 PM

I'm curious why the different outputs of this code. If I make the
dictionary with letters as the keys, they are not listed in the
dictionary in alphabetical order, but if I use the integers then the
keys are in numerical order.

I know that the order of the keys is not important in a dictionary,
but I was just curious about what causes the differences. Thanks!!

list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
list2 = [1,2,3,4,5,6,7,8]

Dictionary = dict(zip(list1, list2))
print Dictionary

Dictionary1 = dict(zip(list2, list1))
print Dictionary1
11 Answers

Christian Heimes

1/30/2008 11:00:00 PM

0

FireNWater wrote:
> I'm curious why the different outputs of this code. If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences. Thanks!!

Currently the order of dict keys depend on the hash of the object. Try
hash(1) and hash('a').

Dustan

1/30/2008 11:09:00 PM

0

On Jan 30, 4:47 pm, FireNWater <kho...@gmail.com> wrote:
> I'm curious why the different outputs of this code. If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences. Thanks!!
>
> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
> list2 = [1,2,3,4,5,6,7,8]
>
> Dictionary = dict(zip(list1, list2))
> print Dictionary
>
> Dictionary1 = dict(zip(list2, list1))
> print Dictionary1

The underlying order is a result, in part, of the key's hash codes*.
Integers are hash coded by their integer values, therefore, they
appear in numeric order. Strings, however, use an algorithm that
ensures as unique hash codes as possible. Notice the difference:

>>> map(hash,
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
1, 2, 3, 4, 5, 6, 7, 8])
[-468864544, -340864157, -212863774, -84863387, 43136996, 171137383,
299137766, 427138153, 1, 2, 3, 4, 5, 6, 7, 8]

* emphasis on the "in part". Other factors include the amount of
memory space available, order added, the current size of the
underlying hash table, etc.

Berteun Damman

1/30/2008 11:09:00 PM

0

On Wed, 30 Jan 2008 14:47:36 -0800 (PST), FireNWater <khoard@gmail.com> wrote:
> I'm curious why the different outputs of this code. If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences. Thanks!!

I don't know the exact way Python's hash function works, but I can take
a guess. I'm sorry if I explain something you already know.

A hash is for quickly looking up data. Yet, you don't want to waste too
much memory. So there is a limit number of spaces allocated, in which to
store objects. This number of spaces can be thought of as a list. Then,
if you put something into the dict, Python computes the 'hash' of this
object, which basically forms the index in the list where to store it.
So every object should be mapped onto some index within the list. (If
you retrieve it, the hash is computed again, and the value on that index
is looked up, like list indexing, these are fast operations.)

Say, if you have 100 spaces, and someone puts in integer in the list,
the hashfunction used might be % 100. So the first 100 integers would
always be placed at consecutive places. For strings however, a more
complicated hash-function would be used, which takes into account more
characters, so strings don't end up in order.

For integers, if you put in integers that are spread very widely apart,
they won't end up in order either (see the mod 100 example, 104 will
come before 10).

If you replace the list2 in your example by:
list2 = [10000 * x for x in range(1,9)]

You will see that this one doesn't end up in order either. So, there's
no exception for integers when it comes to the order, yet the particular
properties of the hash function will cause sequential integers to end up
in order under some circumstances.

Berteun

PS:
What happens if two values map onto the same space is of course an
obvious question, and the trick is choosing your hashfunction so this
occurs not very often on average. If it happens there are several
strategies. Wikipedia probably has an explanation of how hash-functions
can work in such a case.

Gabriel Genellina

1/30/2008 11:10:00 PM

0

En Wed, 30 Jan 2008 20:47:36 -0200, FireNWater <khoard@gmail.com> escribió:

> I'm curious why the different outputs of this code. If I make the
> dictionary with letters as the keys, they are not listed in the
> dictionary in alphabetical order, but if I use the integers then the
> keys are in numerical order.
>
> I know that the order of the keys is not important in a dictionary,
> but I was just curious about what causes the differences. Thanks!!
>
> list1 = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
> list2 = [1,2,3,4,5,6,7,8]
>
> Dictionary = dict(zip(list1, list2))
> print Dictionary
>
> Dictionary1 = dict(zip(list2, list1))
> print Dictionary1

Dictionaries use the hash value of the keys to distribute them in
"buckets". Compare these:

for key in list1: print key, hash(key)
for key in list2: print key, hash(key)

--
Gabriel Genellina

FireNWater

1/31/2008 1:02:00 AM

0

On Jan 30, 3:09 pm, Berteun Damman <berteun@NO_SPAMdds.nl> wrote:
> On Wed, 30 Jan 2008 14:47:36 -0800 (PST), FireNWater <kho...@gmail.com> wrote:
> > I'm curious why the different outputs of this code. If I make the
> > dictionary with letters as the keys, they are not listed in the
> > dictionary in alphabetical order, but if I use the integers then the
> > keys are in numerical order.
>
> > I know that the order of the keys is not important in a dictionary,
> > but I was just curious about what causes the differences. Thanks!!
>
> I don't know the exact way Python's hash function works, but I can take
> a guess. I'm sorry if I explain something you already know.
>
> A hash is for quickly looking up data. Yet, you don't want to waste too
> much memory. So there is a limit number of spaces allocated, in which to
> store objects. This number of spaces can be thought of as a list. Then,
> if you put something into the dict, Python computes the 'hash' of this
> object, which basically forms the index in the list where to store it.
> So every object should be mapped onto some index within the list. (If
> you retrieve it, the hash is computed again, and the value on that index
> is looked up, like list indexing, these are fast operations.)
>
> Say, if you have 100 spaces, and someone puts in integer in the list,
> the hashfunction used might be % 100. So the first 100 integers would
> always be placed at consecutive places. For strings however, a more
> complicated hash-function would be used, which takes into account more
> characters, so strings don't end up in order.
>
> For integers, if you put in integers that are spread very widely apart,
> they won't end up in order either (see the mod 100 example, 104 will
> come before 10).
>
> If you replace the list2 in your example by:
> list2 = [10000 * x for x in range(1,9)]
>
> You will see that this one doesn't end up in order either. So, there's
> no exception for integers when it comes to the order, yet the particular
> properties of the hash function will cause sequential integers to end up
> in order under some circumstances.
>
> Berteun
>
> PS:
> What happens if two values map onto the same space is of course an
> obvious question, and the trick is choosing your hashfunction so this
> occurs not very often on average. If it happens there are several
> strategies. Wikipedia probably has an explanation of how hash-functions
> can work in such a case.

Thank you for the explanation. . . I think I now have a (foggy)
understanding of hash tables. It seems to be a way to create order
(an index) out of disorder (random numbers or characters) behind the
scenes. .

Richard Szopa

1/31/2008 1:19:00 AM

0

On Jan 31, 12:08 am, Dustan <DustanGro...@gmail.com> wrote:

> The underlying order is a result, in part, of the key's hash codes*.
> Integers are hash coded by their integer values, therefore, they
> appear in numeric order. Strings, however, use an algorithm that
> ensures as unique hash codes as possible. Notice the difference:
>
> >>> map(hash,
>
> ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h',
> 1, 2, 3, 4, 5, 6, 7, 8])
> [-468864544, -340864157, -212863774, -84863387, 43136996, 171137383,
> 299137766, 427138153, 1, 2, 3, 4, 5, 6, 7, 8]
>
> * emphasis on the "in part". Other factors include the amount of
> memory space available, order added, the current size of the
> underlying hash table, etc.

BTW, can anybody explain me how is the hash function implemented in
Python?

Cheers,

-- Richard

Marc 'BlackJack' Rintsch

1/31/2008 7:43:00 AM

0

On Wed, 30 Jan 2008 17:19:13 -0800, Ryszard Szopa wrote:

> BTW, can anybody explain me how is the hash function implemented in
> Python?

It calls the `__hash__()` method on the object that was given as argument.
So there's not *the* hash function, but every type implements its own.

Fallback is the hash of the identity of an object:

In [415]: class A(object): pass
.....:

In [416]: a = A()

In [417]: hash(a)
Out[417]: 161029068

In [418]: hash(id(a))
Out[418]: 161029068

Ciao,
Marc 'BlackJack' Rintsch

Dustan

1/31/2008 12:39:00 PM

0

On Jan 30, 7:02 pm, FireNWater <kho...@gmail.com> wrote:
> Thank you for the explanation. . . I think I now have a (foggy)
> understanding of hash tables. It seems to be a way to create order
> (an index) out of disorder (random numbers or characters) behind the
> scenes. .

The key thing to realize is, quite simply, don't rely on order in a
dictionary. If you do, bad things can happen. The underlying
implementation is not important to know.

But if you really do want to know, let me correct you here, and give a
perhaps clearer explanation (if not, there's no need to read any
further):

The 'order' that your speaking of is not implemented by the hash
*table*, per se, but rather by the hash function, which returns an
integer (the hash code).

The hash table takes the hash code and calculates where in its list to
place the object (as explained before, using modulo to shrink the
integer into the range of the list). If multiple items end up in the
same list, they are placed into a kind of linked list, with each node
containing an object and pointing to the next. Of course, if too many
objects end up in the same bucket, the efficiency of finding an object
in the hash table reduces to that of a linked list, so hash functions
are generally implemented to ensure a unique number (or as unique as
possible) to every object.

Python dictionaries are hash tables that automatically grow as space
is needed. While integers in the range of the list will never change
location unless the list shrinks, larger hash codes can move around
quite apparently randomly. Space available is also a factor, as others
have found out on this list. The order of a dictionary *is*
determined, but factors involved in deciding that order may appear
surprisingly mundane, and certainly variable across runs of your
program.

Ben Finney

1/31/2008 1:36:00 PM

0

Dustan <DustanGroups@gmail.com> writes:

> On Jan 30, 7:02 pm, FireNWater <kho...@gmail.com> wrote:
> > Thank you for the explanation. . . I think I now have a (foggy)
> > understanding of hash tables. It seems to be a way to create order
> > (an index) out of disorder (random numbers or characters) behind
> > the scenes. .
>
> The key thing to realize is, quite simply, don't rely on order in a
> dictionary.

The poster to which you replied is using "order" as contrasted with
"disorder". Clearly dictionaries *do* have order that can be relied
upon.

I think you're using "order" in the colloquial manner; more accurate
would be to say "don't rely on *sequence* in a dictionary".

--
\ "Our task must be to free ourselves from our prison by widening |
`\ our circle of compassion to embrace all humanity and the whole |
_o__) of nature in its beauty." â??Albert Einstein |
Ben Finney

FireNWater

1/31/2008 8:16:00 PM

0

On Jan 31, 4:39 am, Dustan <DustanGro...@gmail.com> wrote:
> On Jan 30, 7:02 pm, FireNWater <kho...@gmail.com> wrote:
>
> > Thank you for the explanation. . . I think I now have a (foggy)
> > understanding of hash tables. It seems to be a way to create order

>
> The 'order' that your speaking of is not implemented by the hash
> *table*, per se, but rather by the hash function, which returns an
> integer (the hash code).
>
Dustan,

Yes, I think I understand it now. . . the order I was referring to is
provided "behind the scenes" by the return values of the hash
function.

Thanks to everyone with their explanations!!