Eric Sosman
3/29/2011 11:32:00 AM
On 3/29/2011 12:49 AM, aki wrote:
> Hi folks ,
>
> i am working on hastable nowdays and looking ways to optimize my
> code . Please comment on my below observations and let me know if you
> have any suggestions.
>
> I have an existing implementation of chained hash table (A), which
> uses singly link list for chaining.
>
> typedef struct hnode HNODE;
> struct hnode{
> HNODE *link;
> };
>
> typedef struct {
> unsigned long (*hash)(const void *);
> int (*compare)(const void *, const void *);
> const void *(*keyof)(const HNODE *);
> HNODE **table;
> unsigned int buckets;
> unsigned int count;
> } HASH;
>
>
> The existing implementation caters a huge amount of data.
> Well the deletion or search of a hash node based on key is not a quick
> operation . As this would require
> calculating the index using hash func , and then searching the linked
> list on that index for the node based on values.
But the linked list should be short; that's the reason for
hashing in the first place. Yes, you could be unlucky and have
all the keys hash to just a few buckets -- but if that's the case
micro-optimizing is not going to help. Not enough, anyhow.
> I would like to optimize the search and deletion .
Question: Have you actually measured the performance and found
it to be too slow, or are you just guessing that it might be? Do
not optimize until you have measured. Even if you are *sure* that
optimization will be needed, measure first so you can find out how
much improvement you get -- or detect any disimprovements!
> so i thought to modify my existing hash table (A)implemetaion .
> 1 . using a doubly link list for chaining .
This won't make things any faster, since you'll still need to
search the list. It might actually slow you down a little, since
there's more work involved in maintaining two links than one, and
since doubling the HNODE size means only half as many can fit in
the various memory caches.
> 2 . using a separate data stucture , preferably an other hash table
> (B)
> which would store a address of node inserted in hashtable (A) .
> Nodes in B would be inserted parallel to A .
>
> Nodes in A would be deleted using B , and the searching of node would
> also done using B .
> Nodes in B would be deleted parallel to A .
Doesn't this just make things worse? You'll face exactly the
same problem with (B) that you now face with (A), namely, locating
and/or deleting an item given its key. Except now, after you've
done all the (B) work, you still have to go back and fuss with (A).
Where's the benefit?
> Presently A has hash node structure as :
>
>
>
> I would like to perform the modification in hnode as below
>
> struct hnode{
> HNODE *link_pre;
> HNODE *link_post;
> };
>
> Would this give me the desired outcome.
I don't see how it could, but perhaps I have not understood
what you intend to do. Whatever your intent, though, measure!
--
Eric Sosman
esosman@ieee-dot-org.invalid