Info Node: (fftw.info)Tips for Optimal Threading

Tips for Optimal Threading
--------------------------

   Not all transforms are parallelized equally well by the
multi-threaded FFTW routines.  (This is merely a consequence of
laziness on the part of the implementors, and is not inherent to the
algorithms employed.)  The limitations are mainly in the parallel
one-dimensional transforms.  For optimal parallelization, avoid the
following:

Parallelization deficiencies in one-dimensional transforms
----------------------------------------------------------

   * Transform sizes with large prime factors can sometimes
     parallelize poorly.  Of course, you should avoid these anyway if
     you want high performance.

   * Single in-place transforms don't parallelize completely.  (Multiple
     in-place transforms, i.e. `howmany > 1', are fine.)  Again, you
     should avoid single in-place transforms in any case if you want
     high performance, as they require transforming to a scratch array
     and copying back.

   * Single real-complex (`rfftw') transforms don't parallelize
     completely.  This is unfortunate, but parallelizing this correctly
     would have involved a lot of extra code (and a much larger
     library).  You still get some benefit from additional processors,
     but if you have a very large number of processors you will
     probably be better off using the parallel complex (`fftw')
     transforms.  Note that multi-dimensional real transforms or
     multiple one-dimensional real transforms are fine.
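
   The first item above suggests checking a transform size for large
prime factors before planning.  The following is a minimal sketch of
such a check, not part of FFTW: the helper names
`largest_prime_factor' and `next_smooth_size' are hypothetical, and
the choice of what counts as a "large" factor (the `max_factor'
argument) is an assumption left to the caller.

```c
#include <assert.h>

/* Hypothetical helper, not part of FFTW.
 * Returns the largest prime factor of n (returns 1 for n <= 1). */
static int largest_prime_factor(int n)
{
    int largest = 1;
    int p;

    /* Trial division: strip each factor p while p*p <= n. */
    for (p = 2; p <= n / p; ++p) {
        while (n % p == 0) {
            largest = p;
            n /= p;
        }
    }
    /* Whatever remains above 1 is a prime larger than any found so far. */
    if (n > 1)
        largest = n;
    return largest;
}

/* Hypothetical helper, not part of FFTW.
 * Returns the smallest size m >= n whose prime factors are all
 * <= max_factor.  Requires max_factor >= 2 so that a power of two
 * always terminates the search; n is assumed to be far below INT_MAX. */
static int next_smooth_size(int n, int max_factor)
{
    int m = (n < 1) ? 1 : n;

    assert(max_factor >= 2);
    while (largest_prime_factor(m) > max_factor)
        ++m;
    return m;
}
```

   For example, `largest_prime_factor(1009)' reports 1009 itself,
since 1009 is prime, and `next_smooth_size(1009, 7)' suggests 1024
instead.  Note that padding the data to a different length computes a
different transform, so whether this substitution is acceptable
depends on the application.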

