GNU Info

Info Node: (python2.1-lib.info)difflib

(python2.1-lib.info)difflib


Next: fpformat Prev: struct Up: String Services
Enter node , (file) or (file)node

Helpers for computing deltas
============================

Helpers for computing differences between objects.  This module was
written by Tim Peters <tim.one@home.com>.
This manual section was written by Tim Peters <tim.one@home.com>.
_Added in Python version 2.1_

`get_close_matches(word, possibilities[, n[, cutoff]])'
     Return a list of the best "good enough" matches.  WORD is a
     sequence for which close matches are desired (typically a string),
     and POSSIBILITIES is a list of sequences against which to match
     WORD (typically a list of strings).

     Optional argument N (default `3') is the maximum number of close
     matches to return; N must be greater than `0'.

     Optional argument CUTOFF (default `0.6') is a float in the range
     [0, 1].  Possibilities that don't score at least that similar to
     WORD are ignored.

     The best (no more than N) matches among the possibilities are
     returned in a list, sorted by similarity score, most similar first.

          >>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
          ['apple', 'ape']
          >>> import keyword
          >>> get_close_matches('wheel', keyword.kwlist)
          ['while']
          >>> get_close_matches('apple', keyword.kwlist)
          []
          >>> get_close_matches('accept', keyword.kwlist)
          ['except']

`SequenceMatcher(...)'
     This is a flexible class for comparing pairs of sequences of any
     type, so long as the sequence elements are hashable.  The basic
     algorithm predates, and is a little fancier than, an algorithm
     published in the late 1980's by Ratcliff and Obershelp under the
     hyperbolic name "gestalt pattern matching."  The idea is to find
     the longest contiguous matching subsequence that contains no
     "junk" elements (the Ratcliff and Obershelp algorithm doesn't
     address junk).  The same idea is then applied recursively to the
     pieces of the sequences to the left and to the right of the
     matching subsequence.  This does not yield minimal edit sequences,
     but does tend to yield matches that "look right" to people.

     *Timing:* The basic Ratcliff-Obershelp algorithm is cubic time in
     the worst case and quadratic time in the expected case.
     `SequenceMatcher' is quadratic time for the worst case and has
     expected-case behavior dependent in a complicated way on how many
     elements the sequences have in common; best case time is linear.

See also:
     `Pattern Matching: The Gestalt Approach'{Discussion of a similar
     algorithm by John W. Ratcliff and D. E. Metzener. This was
     published in in July, 1988.}

SequenceMatcher Objects
Examples 2

automatically generated by info2www version 1.2.2.9