Copyright (C) 2000-2012 |
GNU Info (slib.info)HashingHashing ------- `(require 'hash)' These hashing functions are for use in quickly classifying objects. Hash tables use these functions. - Function: hashq obj k - Function: hashv obj k - Function: hash obj k Returns an exact non-negative integer less than K. For each non-negative integer less than K there are arguments OBJ for which the hashing functions applied to OBJ and K returns that integer. For `hashq', `(eq? obj1 obj2)' implies `(= (hashq obj1 k) (hashq obj2))'. For `hashv', `(eqv? obj1 obj2)' implies `(= (hashv obj1 k) (hashv obj2))'. For `hash', `(equal? obj1 obj2)' implies `(= (hash obj1 k) (hash obj2))'. `hash', `hashv', and `hashq' return in time bounded by a constant. Notice that items having the same `hash' implies the items have the same `hashv' implies the items have the same `hashq'. `(require 'sierpinski)' - Function: make-sierpinski-indexer max-coordinate Returns a procedure (eg hash-function) of 2 numeric arguments which preserves _nearness_ in its mapping from NxN to N. MAX-COORDINATE is the maximum coordinate (a positive integer) of a population of points. The returned procedures is a function that takes the x and y coordinates of a point, (non-negative integers) and returns an integer corresponding to the relative position of that point along a Sierpinski curve. (You can think of this as computing a (pseudo-) inverse of the Sierpinski spacefilling curve.) Example use: Make an indexer (hash-function) for integer points lying in square of integer grid points [0,99]x[0,99]: (define space-key (make-sierpinski-indexer 100)) Now let's compute the index of some points: (space-key 24 78) => 9206 (space-key 23 80) => 9172 Note that locations (24, 78) and (23, 80) are near in index and therefore, because the Sierpinski spacefilling curve is continuous, we know they must also be near in the plane. Nearness in the plane does not, however, necessarily correspond to nearness in index, although it _tends_ to be so. Example applications: * Sort points by Sierpinski index to get heuristic solution to _travelling salesman problem_. For details of performance, see L. Platzman and J. Bartholdi, "Spacefilling curves and the Euclidean travelling salesman problem", JACM 36(4):719-737 (October 1989) and references therein. * Use Sierpinski index as key by which to store 2-dimensional data in a 1-dimensional data structure (such as a table). Then locations that are near each other in 2-d space will tend to be near each other in 1-d data structure; and locations that are near in 1-d data structure will be near in 2-d space. This can significantly speed retrieval from secondary storage because contiguous regions in the plane will tend to correspond to contiguous regions in secondary storage. (This is a standard technique for managing CAD/CAM or geographic data.) `(require 'soundex)' - Function: soundex name Computes the _soundex_ hash of NAME. Returns a string of an initial letter and up to three digits between 0 and 6. Soundex supposedly has the property that names that sound similar in normal English pronunciation tend to map to the same key. Soundex was a classic algorithm used for manual filing of personal records before the advent of computers. It performs adequately for English names but has trouble with other languages. See Knuth, Vol. 3 `Sorting and searching', pp 391-2 To manage unusual inputs, `soundex' omits all non-alphabetic characters. Consequently, in this implementation: (soundex <string of blanks>) => "" (soundex "") => "" Examples from Knuth: (map soundex '("Euler" "Gauss" "Hilbert" "Knuth" "Lloyd" "Lukasiewicz")) => ("E460" "G200" "H416" "K530" "L300" "L222") (map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant" "Ladd" "Lissajous")) => ("E460" "G200" "H416" "K530" "L300" "L222") Some cases in which the algorithm fails (Knuth): (map soundex '("Rogers" "Rodgers")) => ("R262" "R326") (map soundex '("Sinclair" "St. Clair")) => ("S524" "S324") (map soundex '("Tchebysheff" "Chebyshev")) => ("T212" "C121") automatically generated by info2www version 1.2.2.9 |