GNU Info

Info Node: (gmp.info)Other Multiplication

(gmp.info)Other Multiplication


Prev: FFT Multiplication Up: Multiplication Algorithms
Enter node , (file) or (file)node

Other Multiplication
--------------------

   The 3-way Toom-Cook algorithm described above (Note: Toom-Cook 3-Way
Multiplication) generalizes to split into an arbitrary number of
pieces, as per Knuth section 4.3.3 algorithm C.  This is not currently
used, though it's possible a Toom-4 might fit in between Toom-3 and the
FFTs.  The notes here are merely for interest.

   In general a split into r+1 pieces is made, and evaluations and
pointwise multiplications done at 2*r+1 points.  A 4-way split does 7
pointwise multiplies, 5-way does 9, etc.  Asymptotically an (r+1)-way
algorithm is O(N^(log(2*r+1)/log(r+1))).  Only the pointwise
multiplications count towards big-O complexity, but the time spent in
the evaluate and interpolate stages grows with r and has a significant
practical impact, with the asymptotic advantage of each r realized only
at bigger and bigger sizes.  The overheads grow as O(N*r), whereas in
an r=2^k FFT they grow only as O(N*log(r)).

   Knuth algorithm C evaluates at points 0,1,2,...,2*r, but exercise 4
uses -r,...,0,...,r and the latter saves some small multiplies in the
evaluate stage (or rather trades them for additions), and has a further
saving of nearly half the interpolate steps.  The idea is to separate
odd and even final coefficients and then perform algorithm C steps C7
and C8 on them separately.  The divisors at step C7 become j^2 and the
multipliers at C8 become 2*t*j-j^2.

   Splitting odd and even parts through positive and negative points
can be thought of as using -1 as a square root of unity.  If a 4th root
of unity was available then a further split and speedup would be
possible, but no such root exists for plain integers.  Going to complex
integers with i=sqrt(-1) doesn't help, essentially because in cartesian
form it takes three real multiplies to do a complex multiply.  The
existence of 2^k'th roots of unity in a suitable ring or field lets the
fast fourier transform keep splitting and get to O(N*log(r)).


automatically generated by info2www version 1.2.2.9