Copyright (C) 2000-2012 |
GNU Info (gmp.info)Divide and Conquer DivisionDivide and Conquer Division --------------------------- For divisors larger than `DC_THRESHOLD', division is done by dividing. Or to be precise by a recursive divide and conquer algorithm based on work by Moenck and Borodin, Jebelean, and Burnikel and Ziegler (Note: References). The algorithm consists essentially of recognising that a 2NxN division can be done with the basecase division algorithm (Note: Basecase Division), but using N/2 limbs as a base, not just a single limb. This way the multiplications that arise are (N/2)x(N/2) and can take advantage of Karatsuba and higher multiplication algorithms (Note: Multiplication Algorithms). The "digits" of the quotient are formed by recursive Nx(N/2) divisions. If the (N/2)x(N/2) multiplies are done with a basecase multiplication then the work is about the same as a basecase division, but with more function call overheads and with some subtractions separated from the multiplies. These overheads mean that it's only when N/2 is above `KARATSUBA_MUL_THRESHOLD' that divide and conquer is of use. `DC_THRESHOLD' is based on the divisor size N, so it will be somewhere above twice `KARATSUBA_MUL_THRESHOLD', but how much above depends on the CPU. An optimized `mpn_mul_basecase' can lower `DC_THRESHOLD' a little by offering a ready-made advantage over repeated `mpn_submul_1' calls. Divide and conquer is asymptotically O(M(N)*log(N)) where M(N) is the time for an NxN multiplication done with FFTs. The actual time is a sum over multiplications of the recursed sizes, as can be seen near the end of section 2.2 of Burnikel and Ziegler. For example, within the Toom-3 range, divide and conquer is 2.63*M(N). With higher algorithms the M(N) term improves and the multiplier tends to log(N). In practice, at moderate to large sizes, a 2NxN division is about 2 to 4 times slower than an NxN multiplication. Newton's method used for division is asymptotically O(M(N)) and should therefore be superior to divide and conquer, but it's believed this would only be for large to very large N. automatically generated by info2www version 1.2.2.9 |