GNU Info

Info Node: (fftw.info)Usage of Multi-threaded FFTW



Usage of Multi-threaded FFTW
----------------------------

   This section assumes that you are already familiar with the
uniprocessor FFTW routines, described elsewhere in this manual.  It
describes only what you have to change in order to use the
multi-threaded routines.

   First, instead of including `<fftw.h>' or `<rfftw.h>', you should
include the files `<fftw_threads.h>' or `<rfftw_threads.h>',
respectively.

   Second, before calling any FFTW routines, you should call the
function:

     int fftw_threads_init(void);

   This function, which should be called only once (probably in your
`main()' function), performs any one-time initialization required to
use threads on your system.  It returns zero if successful, and a
non-zero value if there was an error (in which case something is
seriously wrong and you should probably exit the program).

   Third, when you want to actually compute the transform, you should
use one of the following transform routines instead of the ordinary FFTW
functions:

     fftw_threads(nthreads, plan, howmany, in, istride,
                  idist, out, ostride, odist);
     
     fftw_threads_one(nthreads, plan, in, out);
     
     fftwnd_threads(nthreads, plan, howmany, in, istride,
                    idist, out, ostride, odist);
     
     fftwnd_threads_one(nthreads, plan, in, out);
     
     rfftw_threads(nthreads, plan, howmany, in, istride,
                   idist, out, ostride, odist);
     
     rfftw_threads_one(nthreads, plan, in, out);
     
     rfftwnd_threads_real_to_complex(nthreads, plan, howmany, in,
                                     istride, idist, out, ostride, odist);
     
     rfftwnd_threads_one_real_to_complex(nthreads, plan, in, out);
     
     rfftwnd_threads_complex_to_real(nthreads, plan, howmany, in,
                                     istride, idist, out, ostride, odist);
     
     rfftwnd_threads_one_complex_to_real(nthreads, plan, in, out);

   All of these routines take exactly the same arguments and have
exactly the same effects as their uniprocessor counterparts (i.e. the
routines of the same name without `_threads') *except* that they take
one extra parameter, `nthreads' (of type `int'), before the normal
parameters.(1)  The `nthreads' parameter specifies the maximum number
of threads of execution to use when performing the transform.

   For example, to parallelize a single one-dimensional transform of
complex data, instead of calling the uniprocessor `fftw_one(plan, in,
out)', you would call `fftw_threads_one(nthreads, plan, in, out)'.
Passing an `nthreads' of `1' means to use only one thread (the main
thread), and is equivalent to calling the uniprocessor routine.
Passing an `nthreads' of `2' means that the transform is potentially
parallelized over two threads (and two processors, if you have them),
and so on.

   These are the only changes you need to make to your source code.
Calls to all other FFTW routines (plan creation, destruction, wisdom,
et cetera) are not parallelized and remain the same.  (The same plans and
wisdom are used by both uniprocessor and multi-threaded transforms.)
Your arrays are allocated and formatted in the same way, and so on.

   Programs using the parallel complex transforms should be linked with
`-lfftw_threads -lfftw -lm' on Unix.  Programs using the parallel real
transforms should be linked with `-lrfftw_threads -lfftw_threads
-lrfftw -lfftw -lm'.  You will also need to link with whatever library
is responsible for threads on your system (e.g. `-lpthread' on Linux).

   ---------- Footnotes ----------

   (1) There is one exception: when performing one-dimensional in-place
transforms, the `out' parameter is always ignored by the multi-threaded
routines, instead of being used as a workspace if it is non-`NULL' as
in the uniprocessor routines.  The multi-threaded routines always
allocate their own workspace (the size of which depends upon the number
of threads).

