Copyright (C) 2000-2012 |
GNU Info (slib.info)Sequence ComparisonSequence Comparison ------------------- `(require 'diff)' This package implements the algorithm: S. Wu, E. Myers, U. Manber, and W. Miller, "An O(NP) Sequence Comparison Algorithm," Information Processing Letters 35, 6 (1990), 317-323. <http://www.cs.arizona.edu/people/gene/vita.html> If the items being sequenced are text lines, then the computed edit-list is equivalent to the output of the "diff" utility program. If the items being sequenced are words, then it is like the lesser known "spiff" program. The values returned by `diff:edit-length' can be used to gauge the degree of match between two sequences. I believe that this algorithm is currently the fastest for these tasks, but genome sequencing applications fuel extensive research in this area. - Function: diff:longest-common-subsequence array1 array2 =? - Function: diff:longest-common-subsequence array1 array2 ARRAY1 and ARRAY2 are one-dimensional arrays. The procedure =? is used to compare sequence tokens for equality. =? defaults to `eqv?'. `diff:longest-common-subsequence' returns a one-dimensional array of length `(quotient (- (+ len1 len2) (fp:edit-length ARRAY1 ARRAY2)) 2)' holding the longest sequence common to both ARRAYs. - Function: diff:edits array1 array2 =? - Function: diff:edits array1 array2 ARRAY1 and ARRAY2 are one-dimensional arrays. The procedure =? is used to compare sequence tokens for equality. =? defaults to `eqv?'. `diff:edits' returns a list of length `(fp:edit-length ARRAY1 ARRAY2)' composed of a shortest sequence of edits transformaing ARRAY1 to ARRAY2. Each edit is a list of an integer and a symbol: (J insert) Inserts `(array-ref ARRAY1 J)' into the sequence. (K delete) Deletes `(array-ref ARRAY2 K)' from the sequence. - Function: diff:edit-length array1 array2 =? - Function: diff:edit-length array1 array2 ARRAY1 and ARRAY2 are one-dimensional arrays. The procedure =? is used to compare sequence tokens for equality. =? defaults to `eqv?'. `diff:edit-length' returns the length of the shortest sequence of edits transformaing ARRAY1 to ARRAY2. (diff:longest-common-subsequence '#(f g h i e j c k l m) '#(f g e h i j k p q r l m)) => #(f g h i j k l m) (diff:edit-length '#(f g h i e j c k l m) '#(f g e h i j k p q r l m)) => 6 (pretty-print (diff:edits '#(f g h i e j c k l m) '#(f g e h i j k p q r l m))) -| ((3 insert) ; e (4 delete) ; c (6 delete) ; h (7 insert) ; p (8 insert) ; q (9 insert)) ; r automatically generated by info2www version 1.2.2.9 |