Copyright (C) 2000-2012 |
Tips for Optimal Threading
--------------------------

Not all transforms are equally well-parallelized by the multi-threaded
FFTW routines. (This is merely a consequence of laziness on the part of
the implementors, and is not inherent to the algorithms employed.)
Mainly, the limitations are in the parallel one-dimensional transforms.
The things to avoid if you want optimal parallelization are as follows:

Parallelization deficiencies in one-dimensional transforms
----------------------------------------------------------

   * Large prime factors can sometimes parallelize poorly. Of course,
     you should avoid these anyway if you want high performance.

   * Single in-place transforms don't parallelize completely. (Multiple
     in-place transforms, i.e. `howmany > 1', are fine.) Again, you
     should avoid these in any case if you want high performance, as
     they require transforming to a scratch array and copying back.

   * Single real-complex (`rfftw') transforms don't parallelize
     completely. This is unfortunate, but parallelizing this correctly
     would have involved a lot of extra code (and a much larger
     library). You still get some benefit from additional processors,
     but if you have a very large number of processors you will
     probably be better off using the parallel complex (`fftw')
     transforms. Note that multi-dimensional real transforms or
     multiple one-dimensional real transforms are fine.