Asp Forum - hashtable: tree based implementation

Taoufik Dachraoui

6/10/2015 6:25:00 PM

Hi

Sometime ago (2012) I implemented a hashtable algorithm using a tree

it is fast (and faster than ccl hashtables when using make-hash-table in a loop)

For example, parsing an arithmetic expression with 1,000,000
integers:
1. using ccl hashtable
? (test-calc 1000000)
(CALC8 (GEN-EXPR N))
took 120,577 milliseconds (120.577 seconds) to run.
16,112 milliseconds ( 16.112 seconds, 13.36%) of which was spent in GC.
During that period, and with 2 available CPU cores,
73,890 milliseconds ( 73.890 seconds) were spent in user mode
46,282 milliseconds ( 46.282 seconds) were spent in system mode
6,768,400,432 bytes of memory allocated.
7,280 minor page faults, 0 major page faults, 0 swaps.
(T 6953416 999999)

2. using my hashtable tree based implementation
? (test-calc 1000000)
(CALC8 (GEN-EXPR N))
took 36,012 milliseconds (36.012 seconds) to run.
6,053 milliseconds ( 6.053 seconds, 16.81%) of which was spent in GC.
During that period, and with 2 available CPU cores,
35,170 milliseconds (35.170 seconds) were spent in user mode
645 milliseconds ( 0.645 seconds) were spent in system mode
2,677,967,216 bytes of memory allocated.
7,072 minor page faults, 0 major page faults, 0 swaps.
(T 6950988 999999)

With the tree based hashtable the CACL8 is 3 times faster

I would like to add the tree base hashtable to quicklisp (if possible), so anyone can use it
and improve it (the locking for multithreaded use is not yet implemented, we need to implement
#_bitlock and #_bitfree for maximum performance)

how to add the tree based hashtable to quicklisp? if someone can help i appreciate it
very much

Kind regards
Taoufik

59 Answers

Nicolas Neuss

6/11/2015 3:41:00 PM

Taoufik Dachraoui <dachraoui.taoufik@gmail.com> writes:

> 2. using my hashtable tree based implementation
> ? (test-calc 1000000)
> (CALC8 (GEN-EXPR N))
> took 36,012 milliseconds (36.012 seconds) to run.
> 6,053 milliseconds ( 6.053 seconds, 16.81%) of which was spent in GC.
> During that period, and with 2 available CPU cores,
> 35,170 milliseconds (35.170 seconds) were spent in user mode
> 645 milliseconds ( 0.645 seconds) were spent in system mode
> 2,677,967,216 bytes of memory allocated.
> 7,072 minor page faults, 0 major page faults, 0 swaps.
> (T 6950988 999999)
>
> With the tree based hashtable the CACL8 is 3 times faster

What do CALC8 and GEN-EXPR do?

> I would like to add the tree base hashtable to quicklisp (if
> possible), so anyone can use it and improve it (the locking for
> multithreaded use is not yet implemented, we need to implement
> #_bitlock and #_bitfree for maximum performance)
>
> how to add the tree based hashtable to quicklisp? if someone can help
> i appreciate it very much

There are already binary-tree implementations in Quicklisp, e.g. trees.
That is, it would be nice if you version would introduce some
improvement.

Nicolas

Taoufik Dachraoui

6/11/2015 5:08:00 PM

Taoufik Dachraoui

6/11/2015 5:22:00 PM

Taoufik Dachraoui

6/11/2015 5:33:00 PM

Pascal J. Bourguignon

6/11/2015 6:17:00 PM

Taoufik Dachraoui <dachraoui.taoufik@gmail.com> writes:

> On Wednesday, June 10, 2015 at 8:25:29 PM UTC+2, Taoufik Dachraoui wrote:
>> Hi
>>
>> Sometime ago (2012) I implemented a hashtable algorithm using a tree
>>
>> it is fast (and faster than ccl hashtables when using make-hash-table in a loop)
>>
>> For example, parsing an arithmetic expression with 1,000,000
>> integers:
>> 1. using ccl hashtable
>> ? (test-calc 1000000)
>> (CALC8 (GEN-EXPR N))
>> took 120,577 milliseconds (120.577 seconds) to run.
>> 16,112 milliseconds ( 16.112 seconds, 13.36%) of which was spent in GC.
>> During that period, and with 2 available CPU cores,
>> 73,890 milliseconds ( 73.890 seconds) were spent in user mode
>> 46,282 milliseconds ( 46.282 seconds) were spent in system mode
>> 6,768,400,432 bytes of memory allocated.
>> 7,280 minor page faults, 0 major page faults, 0 swaps.
>> (T 6953416 999999)
>>
>> 2. using my hashtable tree based implementation
>> ? (test-calc 1000000)
>> (CALC8 (GEN-EXPR N))
>> took 36,012 milliseconds (36.012 seconds) to run.
>> 6,053 milliseconds ( 6.053 seconds, 16.81%) of which was spent in GC.
>> During that period, and with 2 available CPU cores,
>> 35,170 milliseconds (35.170 seconds) were spent in user mode
>> 645 milliseconds ( 0.645 seconds) were spent in system mode
>> 2,677,967,216 bytes of memory allocated.
>> 7,072 minor page faults, 0 major page faults, 0 swaps.
>> (T 6950988 999999)
>>
>>
>> With the tree based hashtable the CACL8 is 3 times faster
>>
>> I would like to add the tree base hashtable to quicklisp (if possible), so anyone can use it
>> and improve it (the locking for multithreaded use is not yet implemented, we need to implement
>> #_bitlock and #_bitfree for maximum performance)
>>
>> how to add the tree based hashtable to quicklisp? if someone can help i appreciate it
>> very much
>>
>> Kind regards
>> Taoufik
>
> I just looked at quicklisp:trees, and I found it is not implementing a
> hashtable, it is a general purpose binary search tree; they do not
> hash keys to store values; so it should not be faster then CL
> hashtables
>
> The hashtable I implemented is using a tree to store the values based
> on the hash code of the key; the implementation is fast (as fast as
> the CL hashtable) and it is faster if you have a make-hashtable in a
> loop; an empty hashtable in my implementation is just 2 conses.

clhs hash-table:

The intent is that this mapping be implemented by a hashing
mechanism, such as that described in Section 6.4 ``Hashing'' of The
Art of Computer Programming, Volume 3 (pp506-549). In spite of this
intent, no conforming implementation is required to use any
particular technique to implement the mapping.

--
__Pascal Bourguignon__ http://www.informat...
â??The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.â? -- Carl Bass CEO Autodesk

Pascal J. Bourguignon

6/11/2015 6:41:00 PM

Taoufik Dachraoui <dachraoui.taoufik@gmail.com> writes:

> In CL the test function is one of eq, eql, equal and equalp; I do not
> know why do we have this constraint, if someone can comment on
> this. In the implementation of the tree based hashtable there is no
> such constraint, the test function can be any predicate with 2
> arguments.

There's a reason why there's a limited set of possible test functions.

These tests functions must define equivalence relationships. Two
equivalent objects will have to hash to the same value, so that they're
stored into the same collision list, and thus so we may retrieve them
with the test function.

This implies that you cannot use SXHASH for this purpose!

Since (clhs SXHASH):

(equal x y) implies (= (sxhash x) (sxhash y)).

It is insufficient for EQUALP. And also insufficient to use
user-defined test functions. Implementations have to define a hash
function for EQUALP, such as
(equalp x y) => (= (equalp-hash x) (equalp-hash y)).

Also, since SXHASH is as large as EQUAL, using it for EQL and EQ would
be suboptimal: it would put in the same collision list keys that could
have had different hash values with more specific hash function. An
implementation would be well advised to use more specific hash functions
for those tests.

If we wanted to allow arbitrary user defined test functions for hash
tables, we'd have to require from the user to provide also hash
functions (and they're not so easy to write).

The reason why SXHASH is of no use for equalp may be because EQUALP as a
test function for hash tables is a late addition; originally only EQ, EQL
and EQUAL were considered. cf. issue #191 HASH-TABLE-TESTS

When you have a tree to store keys, you don't need a hash function
because you can essentially put each key in its own bin (you essentially
insert a new bin when you insert an element in the tree). But for that
you need to have an order function, not an equivalence function.

If you take only an equivalence function, then you have to use sxhash to
obtain a key with a natural order (<), so that you may build a tree.
But this is silly and inefficient, since now you have a tree of bins,
and you put collision lists in the bins, instead of just having a tree
of ordered keys.

On the other hand, given that the standard doesn't prescribe a hash
table algorithm for hash tables, it could have allowed arbitrary order
or equivalence functions and when given a function out of a known set
(probably an implementation dependent superset of (EQ EQL EQUAL
EQUALP)), the implementation would use a tree or some other data
structure than hash tables. But this is also something that can be done
easily in a library wrapping various data structures (cf. for example
com.informatimago.common-lisp.cesarum.dictionary), so there was also
little point in including it in the standard. Such restrictions in the
language specifications also allow to write small implementations or
implementations targetting small systems.

--
__Pascal Bourguignon__ http://www.informat...
â??The factory of the future will have only two employees, a man and a
dog. The man will be there to feed the dog. The dog will be there to
keep the man from touching the equipment.â? -- Carl Bass CEO Autodesk

Taoufik Dachraoui

6/11/2015 7:35:00 PM

Taoufik Dachraoui

6/11/2015 7:55:00 PM

Taoufik Dachraoui

6/11/2015 9:03:00 PM

On Wednesday, June 10, 2015 at 8:25:29 PM UTC+2, Taoufik Dachraoui wrote:
> Hi
>
> Sometime ago (2012) I implemented a hashtable algorithm using a tree
>
> it is fast (and faster than ccl hashtables when using make-hash-table in a loop)
>
> For example, parsing an arithmetic expression with 1,000,000
> integers:
> 1. using ccl hashtable
> ? (test-calc 1000000)
> (CALC8 (GEN-EXPR N))
> took 120,577 milliseconds (120.577 seconds) to run.
> 16,112 milliseconds ( 16.112 seconds, 13.36%) of which was spent in GC.
> During that period, and with 2 available CPU cores,
> 73,890 milliseconds ( 73.890 seconds) were spent in user mode
> 46,282 milliseconds ( 46.282 seconds) were spent in system mode
> 6,768,400,432 bytes of memory allocated.
> 7,280 minor page faults, 0 major page faults, 0 swaps.
> (T 6953416 999999)
>
> 2. using my hashtable tree based implementation
> ? (test-calc 1000000)
> (CALC8 (GEN-EXPR N))
> took 36,012 milliseconds (36.012 seconds) to run.
> 6,053 milliseconds ( 6.053 seconds, 16.81%) of which was spent in GC.
> During that period, and with 2 available CPU cores,
> 35,170 milliseconds (35.170 seconds) were spent in user mode
> 645 milliseconds ( 0.645 seconds) were spent in system mode
> 2,677,967,216 bytes of memory allocated.
> 7,072 minor page faults, 0 major page faults, 0 swaps.
> (T 6950988 999999)
>
>
> With the tree based hashtable the CACL8 is 3 times faster
>
> I would like to add the tree base hashtable to quicklisp (if possible), so anyone can use it
> and improve it (the locking for multithreaded use is not yet implemented, we need to implement
> #_bitlock and #_bitfree for maximum performance)
>
> how to add the tree based hashtable to quicklisp? if someone can help i appreciate it
> very much
>
> Kind regards
> Taoufik

Hi

The following is an example of using the tree-based hashtable implementation, and showing the average length of the collision lists (also the performance compared to CL hashtable)

;;; gen-obj returns a random list of integers
? (gen-obj)
(665 42 280 313 827 368 38 574 563 715 683 633 218 50 760 834 641 359 469 651 333 114 979 36 478 573 54 372 865 286 174 611 585 399 943 490 332 517 338 645 4 675 663 901 692 385 971 397 322 158 879 264 567 681 72 928 731 748 212 251 208 350 868 99 340 89 577 903 238 506 411 44 516 94 866 331 570 650 179)
? (gen-obj)
(229 963 766 708 931 392 987 567 807 404 576 242 969 22 841 227 221 676 163 141 342 945 153 912 20 741 246 60 427 360 412 423)
? (gen-obj)
(283)

? (setf h (make-hashtable))
T
? (time (dotimes (i 100000) (setf (getkey (gen-obj) h) t)))
(DOTIMES (I 100000) (SETF (GETKEY (GEN-OBJ) H) T))
took 2,423 milliseconds (2.423 seconds) to run.
1,792 milliseconds (1.792 seconds, 73.96%) of which was spent in GC.
During that period, and with 2 available CPU cores,
2,360 milliseconds (2.360 seconds) were spent in user mode
54 milliseconds (0.054 seconds) were spent in system mode
56,386,160 bytes of memory allocated.
10,821 minor page faults, 0 major page faults, 0 swaps.
NIL
? (length (hashvalues h))
100000
? (length (hashvalues *hashcodes*)) ; different number of hash codes used to create 100000 keys
99402
? (* 1.0 *average-collisions*)
1.0002214

? (setf h (make-hash-table))
#<HASH-TABLE :TEST EQL size 0/60 #x12A3441E>
? (time (dotimes (i 100000) (setf (gethash (gen-obj) h) t)))
(DOTIMES (I 100000) (SETF (GETHASH (GEN-OBJ) H) T))
took 4,225 milliseconds (4.225 seconds) to run.
1,764 milliseconds (1.764 seconds, 41.75%) of which was spent in GC.
During that period, and with 2 available CPU cores,
4,153 milliseconds (4.153 seconds) were spent in user mode
70 milliseconds (0.070 seconds) were spent in system mode
146,951,992 bytes of memory allocated.
6,340 minor page faults, 0 major page faults, 0 swaps.

Do not know how to determine the average length of collision lists in CL HT

- collision lists are of length 1 in average
- tree-based implementation is faster than CL hashtable
- maphash is faster than hashvalues (I need to optimize it further)

? (time (progn (hashvalues h) t))
(PROGN (HASHVALUES H) T)
took 41 milliseconds (0.041 seconds) to run.
24 milliseconds (0.024 seconds, 58.54%) of which was spent in GC.
During that period, and with 2 available CPU cores,
39 milliseconds (0.039 seconds) were spent in user mode
2 milliseconds (0.002 seconds) were spent in system mode
1,946,304 bytes of memory allocated.
282 minor page faults, 0 major page faults, 0 swaps.
T

? (time (maphash #'(lambda (k v) (declare (ignore k v))) h))
(MAPHASH #'(LAMBDA (K V) (DECLARE (IGNORE K V))) H)
took 13 milliseconds (0.013 seconds) to run.
8 milliseconds (0.008 seconds, 61.54%) of which was spent in GC.
During that period, and with 2 available CPU cores,
12 milliseconds (0.012 seconds) were spent in user mode
0 milliseconds (0.000 seconds) were spent in system mode
800,016 bytes of memory allocated.
218 minor page faults, 0 major page faults, 0 swaps.

The performance difference is even greater if you use make-hash-table within a loop

But, I should say that the implementation is very basic, it may not be comparable to cl ht;

I wish to implement weak-hash-tables in near future (I do not have any idea how to do it for now)

-Taoufik

Taoufik Dachraoui

6/11/2015 9:08:00 PM

comp.lang.lisp

hashtable: tree based implementation

Taoufik Dachraoui

Nicolas Neuss

Taoufik Dachraoui

Taoufik Dachraoui

Taoufik Dachraoui

Pascal J. Bourguignon

Pascal J. Bourguignon

Taoufik Dachraoui

Taoufik Dachraoui

Taoufik Dachraoui

Taoufik Dachraoui

x Login to ForumsZone