Asp Forum - Re: Understanding Lisp Hash-Tables (a little better

Wenwei Peng

4/22/2015 2:07:00 AM

? 2002?12?29???? UTC+8??6:08:19,J. Romano??:
> Hi,
>
> In the Fall of 2001, I was a Graduate Teaching Assistant for a
> class at my University. Although Lisp was a requirement for the
> class, few people knew it as well as they knew C++ or Java.
>
> When the instructor explained how to make hash tables in Lisp, many
> students were confused, partly due to the fact that many of them
> didn't even know what a hash table was (and when to use it).
> Therefore, I created the following write-up to help the students in my
> class understand how to create and use hash tables in Lisp, as well as
> how they could be used.
>
> I'm posting the write-up here on the chance that somebody somewhere
> might find it useful. Keep in mind that my goal in composing the
> write-up was to use simple, understandable wording (instead of aiming
> for completeness using technical terms). Anybody who has basic
> knowldege of Lisp and knowledge of arrays in either C++ or Java should
> be able to understand my write-up.
>
> Feel free to distribute my write-up, especially for educational
> purposes. I would ask, however, that my write-up not be changed
> unless it is to correct an error.
>
> Finally, here is my write-up:
>
> ------------------------------------------------------
>
> A Guide to Understanding Lisp Hash-Tables (a little better)
>
> by Jean-Luc Romano
>
> There is normally some confusion when students try to learn about hash
> tables in any language. Therefore, this file has been made in the
> hopes that much of the confusion can be cleared up.
>
>
> A LITTLE BIT OF C++ CODE
>
> Before we begin discussing hash-tables, let's examine some of the
> differences between C++ and Lisp (if you've never used C++, don't
> worry -- the code used here will be understood by any Java
> programmer).
>
> In C++, if we want to set variable myVar to 5, we write:
> myVar = 5;
> In Lisp, we write:
> (setf myVar 5)
>
> If we want myVar to be on the right-hand side, in C++ we write:
> a = myVar;
> In Lisp, we write:
> (setf a myVar)
>
> Notice that instead of using the equal sign, Lisp uses the "setf"
> function.
>
>
> C++ ARRAYS
>
> Almost every programmer knows about arrays. In C++ the syntax is very
> simple. To assign a value to the first element of an array in C++, we
> write:
>
> A[0] = 5;
>
> The following lines are also valid, provided that the array index is
> within range:
>
> A[3] = 55;
> A[42] = 13;
>
> But there are certain constraints to remember when assigning (or
> referencing) an array:
> a. The array index must be in the proper range
> b. The array index must always be an integer
>
> These two rules make sense to us. How can you have a floating point
> number as an index to an array? If that were allowed, then any array
> would have to have an infinite number of elements and therefore use an
> infinite amount of space. The very idea is too bizarre to even
> consider.
>
> Or is it? What if I wanted to write:
> A[2.71] = 6;
> Or maybe something with an array index that is a string, like:
> A["hello"] = 2;
> Is this even possible?
>
> This IS possible, ever since the introduction of C++ Standard Template
> Libraries (STL). In these libraries, there is a class type called
> "map" which allows a user to create a special kind of array that does
> not have to use an integer as the array index. The following line:
>
> map<string,float> A;
>
> creates a variable A that behaves like an array, but whose array index
> must be a string and whose values must all be floating point numbers.
> Now the following line is perfectly valid:
> A["pi"] = 3.14;
> and the expression:
> float b = 2 * A["pi"];
> will set b equal to 6.28 .
>
> This explanation of the variable A is one way of looking at a hash
> table (it is by no means the only way, however). A hash table stores
> a value based on a "key" which, in C++, looks like an array index.
> Unlike an array, however, the key can be a string, float, character,
> integer, or even a user-defined type.
>
> So what is the difference between an array and a hash table? Here are
> a few:
> a. a hash table will allow any index (or key) to
> be used, not just a positive integer
> b. since arrays have pre-allocated memory, their
> access time is much quicker than hash tables
> c. hash tables have to worry about how to
> organize and retreive the values internally;
> however, the programmer does not have to worry
> about this (the code that manages this has
> already been written!)
>
> In case you may be wondering why this might be useful, consider
> storing someone's age in a hash table, with his/her name as a key (or
> index):
>
> Age["Joe"] = 24;
> Age["Ann"] = 18;
> cout << "Joe's age is " << Age["Joe"] << endl;
> cout << "Ann's age is " << Age["Ann"] << endl;
>
>
> HASH TABLES IN LISP
>
> So how do we create hash tables in Lisp? Very easily. We use the
> function:
> (make-hash-table :test #'equal)
>
> Note that this function RETURNS a hash table, not declares one.
> Therefore, we must place its return value into a variable that we want
> to use as a hash-table:
> (setf a (make-hash-table :test #'equal) )
>
> Now, how do we set/retreive the value at key "pi", for instance? The
> answer is not quite as intuitive as the C++ method, but it is still
> rather simple:
> (gethash "pi" a)
>
> Therefore, (gethash "pi" a) is Lisp's equivalent code for C++'s
> A["pi"] .
>
> Two things to note:
> a. "gethash", unlike "make-hash-table", does not
> contain any hyphens
> b. unlike C++, the key (in this case "pi") comes
> before the hash table's name (here, ''a''
> comes after "pi")
>
> So how do we set a value to a particular key? Just like we set a
> value to any other variable. If we wanted to set 3.14 to the variable
> pi, we would write:
> (setf pi 3.14)
> But if we want to set it to A["pi"], we substitute (gethash "pi" a)
> for pi:
> (setf (gethash "pi" a) 3.14)
>
> Now, how do we read a value that was set to a particular key? Simple:
> we just use (gethash "pi" a) just like any old variable in Lisp.
> Instead of
> (setf newVar pi)
> we write:
> (setf newVar (gethash "pi" a) )
>
> There are a few things to remember about hash tables in Lisp:
>
> a. because Lisp allows any variable to bind
> to any variable (that is, a variable
> assigned to an integer can also be assigned
> to a float, string, or a list), neither the
> key nor the value types have to be specified
> in Lisp, even though they do have to be
> specified in C++. This has both advantages
> and disadvantages, but at least it is less
> stuff to remember for the Lisp programmer.
>
> b. if two values are written to the same key,
> the old value will be replaced (or
> "clobbered") by the new value. For example,
> the lines:
> (setf (gethash "pi" a) 3.14)
> (setf (gethash "pi" a) 7)
> will set A["pi"] (that is, (gethash "pi" a) )
> to 7 .
>
> c. if no value has been set for a particular key,
> its value will be NIL.
>
>
> Well, that's about it for this file. I hope you find it helpful!
>
> ------------------------------------------------------

Title: The core of the core of the big data solutions -- Map
Author: pengwenwei
Email: wenwei19710430
Language: c++
Platform: Windows, linux
Technology: Perfect hash algorithm
Level: Advanced
Description: Map algorithm with high performance
Section MFC c++ map stl
SubSection c++ algorithm
License: (GPLv3)

Download demo project - 1070 Kb
Download source - 1070 Kb

Introduction:
For the c++ program, map is used everywhere.And bottleneck of program performance is often the performance of map.Especially in the case of large data,and the business association closely and unable to realize the data distribution and parallel processing condition.So the performance of map becomes the key technology.

In the work experience with telecommunications industry and the information security industry, I was dealing with the big bottom data,especially the most complex information security industry data,all can’t do without map.

For example, IP table, MAC table, telephone number list, domain name resolution table, ID number table query, the Trojan horse virus characteristic code of cloud killing etc..

The map of STL library using binary chop, its has the worst performance.Google Hash map has the optimal performance and memory at present, but it has repeated collision probability.Now the big data rarely use a collision probability map,especially relating to fees, can’t be wrong.

Now I put my algorithms out here,there are three kinds of map,after the build is Hash map.We can test the comparison,my algorithm has the zero probability of collision,but its performance is also better than the hash algorithm, even its ordinary performance has no much difference with Google.

My algorithm is perfect hash algorithm,its key index and the principle of compression algorithm is out of the ordinary,the most important is a completely different structure,so the key index compression is fundamentally different.The most direct benefit for program is that for the original map need ten servers for solutions but now I only need one server.
Declare: the code can not be used for commercial purposes, if for commercial applications,you can contact me with QQ 75293192.
Download:
https://sourceforge.net/projects/pwwhas...

Applications:
First,modern warfare can’t be without the mass of information query, if the query of enemy target information slows down a second, it could lead to the delaying fighter, leading to failure of the entire war. Information retrieval is inseparable from the map, if military products use pwwhashMap instead of the traditional map,you must be the winner.

Scond,the performance of the router determines the surfing speed, just replace open source router code map for pwwHashMap, its speed can increase ten times.
There are many tables to query and set in the router DHCP ptotocol,such as IP,Mac ,and all these are completed by map.But until now,all map are using STL liabrary,its performance is very low,and using the Hash map has error probability,so it can only use multi router packet dispersion treatment.If using pwwHashMap, you can save at least ten sets of equipment.

Third,Hadoop is recognized as the big data solutions at present,and its most fundamental thing is super heavy use of the map,instead of SQL and table.Hadoop assumes the huge amounts of data so that the data is completely unable to move, people must carry on the data analysis in the local.But as long as the open source Hadoop code of the map changes into pwwHashMap, the performance will increase hundredfold without any problems.

Background to this article that may be useful such as an introduction to the basic ideas presented:
http://blog.csdn.net/chixinmuzi/article/detai...

comp.lang.lisp

Re: Understanding Lisp Hash-Tables (a little better

Wenwei Peng

x Login to ForumsZone