GNU Info

Info Node: (slib.info)Hashing

(slib.info)Hashing


Next: Object Prev: Hash Tables Up: Data Structures
Enter node , (file) or (file)node

Hashing
-------

  `(require 'hash)'

  These hashing functions are for use in quickly classifying objects.
Hash tables use these functions.

 - Function: hashq obj k
 - Function: hashv obj k
 - Function: hash obj k
     Returns an exact non-negative integer less than K.  For each
     non-negative integer less than K there are arguments OBJ for which
     the hashing functions applied to OBJ and K returns that integer.

     For `hashq', `(eq? obj1 obj2)' implies `(= (hashq obj1 k) (hashq
     obj2))'.

     For `hashv', `(eqv? obj1 obj2)' implies `(= (hashv obj1 k) (hashv
     obj2))'.

     For `hash', `(equal? obj1 obj2)' implies `(= (hash obj1 k) (hash
     obj2))'.

     `hash', `hashv', and `hashq' return in time bounded by a constant.
     Notice that items having the same `hash' implies the items have
     the same `hashv' implies the items have the same `hashq'.

  `(require 'sierpinski)'

 - Function: make-sierpinski-indexer max-coordinate
     Returns a procedure (eg hash-function) of 2 numeric arguments which
     preserves _nearness_ in its mapping from NxN to N.

     MAX-COORDINATE is the maximum coordinate (a positive integer) of a
     population of points.  The returned procedures is a function that
     takes the x and y coordinates of a point, (non-negative integers)
     and returns an integer corresponding to the relative position of
     that point along a Sierpinski curve.  (You can think of this as
     computing a (pseudo-) inverse of the Sierpinski spacefilling
     curve.)

     Example use: Make an indexer (hash-function) for integer points
     lying in square of integer grid points [0,99]x[0,99]:
          (define space-key (make-sierpinski-indexer 100))
     Now let's compute the index of some points:
          (space-key 24 78)               => 9206
          (space-key 23 80)               => 9172

     Note that locations (24, 78) and (23, 80) are near in index and
     therefore, because the Sierpinski spacefilling curve is
     continuous, we know they must also be near in the plane.  Nearness
     in the plane does not, however, necessarily correspond to nearness
     in index, although it _tends_ to be so.

     Example applications:
        * Sort points by Sierpinski index to get heuristic solution to
          _travelling salesman problem_.  For details of performance,
          see L. Platzman and J. Bartholdi, "Spacefilling curves and the
          Euclidean travelling salesman problem", JACM 36(4):719-737
          (October 1989) and references therein.

        * Use Sierpinski index as key by which to store 2-dimensional
          data in a 1-dimensional data structure (such as a table).
          Then locations that are near each other in 2-d space will
          tend to be near each other in 1-d data structure; and
          locations that are near in 1-d data structure will be near in
          2-d space.  This can significantly speed retrieval from
          secondary storage because contiguous regions in the plane
          will tend to correspond to contiguous regions in secondary
          storage.  (This is a standard technique for managing CAD/CAM
          or geographic data.)


  `(require 'soundex)'

 - Function: soundex name
     Computes the _soundex_ hash of NAME.  Returns a string of an
     initial letter and up to three digits between 0 and 6.  Soundex
     supposedly has the property that names that sound similar in normal
     English pronunciation tend to map to the same key.

     Soundex was a classic algorithm used for manual filing of personal
     records before the advent of computers.  It performs adequately for
     English names but has trouble with other languages.

     See Knuth, Vol. 3 `Sorting and searching', pp 391-2

     To manage unusual inputs, `soundex' omits all non-alphabetic
     characters.  Consequently, in this implementation:

          (soundex <string of blanks>)    => ""
          (soundex "")                    => ""

     Examples from Knuth:

          (map soundex '("Euler" "Gauss" "Hilbert" "Knuth"
                                 "Lloyd" "Lukasiewicz"))
                  => ("E460" "G200" "H416" "K530" "L300" "L222")
          
          (map soundex '("Ellery" "Ghosh" "Heilbronn" "Kant"
                                  "Ladd" "Lissajous"))
                  => ("E460" "G200" "H416" "K530" "L300" "L222")

     Some cases in which the algorithm fails (Knuth):

          (map soundex '("Rogers" "Rodgers"))     => ("R262" "R326")
          
          (map soundex '("Sinclair" "St. Clair")) => ("S524" "S324")
          
          (map soundex '("Tchebysheff" "Chebyshev")) => ("T212" "C121")


automatically generated by info2www version 1.2.2.9