Strides in In-place RFFTWND
---------------------------

The fact that the input and output datatypes are different for
rfftwnd complicates the meaning of the `stride' and `dist' parameters
of in-place transforms--are they in units of `fftw_real' or
`fftw_complex' elements? When reading the input, they are interpreted
in units of the datatype of the input data. When writing the output,
the `istride' and `idist' are translated to the output datatype's
"units" in one of two ways, corresponding to the two most common
situations in which `stride' and `dist' parameters are useful. Below,
we refer to these "translated" parameters as `ostride_t' and `odist_t'.
(Note that these are computed internally by rfftwnd; the actual
`ostride' and `odist' parameters are ignored for in-place transforms.)

First, there is the case where you are transforming a number of
contiguous arrays located one after another in memory. In this
situation, `istride' is `1' and `idist' is the product of the physical
dimensions of the array (that is, the dimensions as stored in memory,
including any padding of the last dimension). `ostride_t' and
`odist_t' are then chosen so that the output arrays are contiguous and
lie on top of the input arrays. `ostride_t' is therefore `1'. Because
one `fftw_complex' occupies the space of two `fftw_real' values,
`odist_t' is `idist/2' for a real-to-complex transform and `idist*2'
for a complex-to-real transform.
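
As a rough illustration of this first case, the following sketch
computes `idist' and the matching `odist_t' for a batch of contiguous
in-place arrays. The helper names are hypothetical and are not part of
FFTW, and the sketch assumes the usual in-place rfftwnd layout in which
the last dimension of the real array is padded to `2*(n/2+1)' elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an FFTW function): number of fftw_real
   elements physically occupied by one in-place real array with logical
   dimensions n[0] x ... x n[rank-1].  Assumes the last dimension is
   padded to 2*(n[rank-1]/2 + 1) reals. */
int physical_size_in_reals(int rank, const int *n)
{
    int i, size = 1;
    assert(rank > 0 && n != NULL);
    for (i = 0; i < rank - 1; ++i)
        size *= n[i];
    return size * 2 * (n[rank - 1] / 2 + 1);
}

/* Hypothetical helper: translated output dist for contiguous arrays in
   a real-to-complex transform.  idist is in fftw_real units, the
   result is in fftw_complex units. */
int contiguous_odist_r2c(int idist)
{
    assert(idist % 2 == 0);  /* padded real arrays have even size */
    return idist / 2;
}

/* Hypothetical helper: the same for a complex-to-real transform.
   idist is in fftw_complex units, the result is in fftw_real units. */
int contiguous_odist_c2r(int idist)
{
    return idist * 2;
}
```

For a 4x6 real array, for instance, the padded last dimension holds 8
reals, so `idist' would be 32 and `odist_t' would be 16.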

The second case is when you have an array in which each element has
`nc' components (e.g. a structure with `nc' numeric fields), and you
want to transform all of the components at once. Here, `istride' is
`nc' and `idist' is `1'. For this case, it is natural to want the
output to also have `nc' consecutive components, now of the output data
type; this is exactly what rfftwnd does. Specifically, it uses an
`ostride_t' equal to `istride', and an `odist_t' of `1'. (Astute
readers will realize that some extra buffer space is required in order
to perform such a transform; this is handled automatically by rfftwnd.)
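
To make the interleaved layout concrete, here is a small sketch of the
offset arithmetic. The function name is hypothetical, and the sketch
relies only on the usual meaning of the parameters: element `j' of
transform number `c' lives at offset `j*stride + c*dist', counted in
units of whichever datatype is being accessed.

```c
#include <assert.h>

/* Hypothetical helper (not an FFTW function): offset of element j
   (row-major index within one array) of transform number c, in units
   of the element type being read or written. */
int strided_offset(int j, int c, int stride, int dist)
{
    assert(j >= 0 && c >= 0);
    return j * stride + c * dist;
}

/* With nc = 3 components, istride = 3 and idist = 1:
     input  sample j of component c is at  in[3*j + c]   (fftw_real)
     output sample k of component c is at  out[3*k + c]  (fftw_complex)
   because ostride_t = istride = 3 and odist_t = 1. */
```

The same offsets are used on both sides of the transform; only the
unit changes from `fftw_real' to `fftw_complex' or the reverse.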

The general rule is as follows. `ostride_t' equals `istride'. If
`idist' is `1' and `istride' is greater than `1', then `odist_t' is
`1'. Otherwise, for a real-to-complex transform `odist_t' is `idist/2'
and for a complex-to-real transform `odist_t' is `idist*2'.
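
The rule can be written down as a few lines of C. This is only a
sketch of the rule as stated here, not the actual rfftwnd source; the
type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical names, used only for this illustration. */
enum sketch_direction { SKETCH_REAL_TO_COMPLEX, SKETCH_COMPLEX_TO_REAL };

struct translated_strides {
    int ostride_t;  /* in units of the output datatype */
    int odist_t;    /* in units of the output datatype */
};

/* Translate the in-place input stride and dist (given in units of the
   input datatype) into output units, following the general rule. */
struct translated_strides translate_strides(int istride, int idist,
                                            enum sketch_direction dir)
{
    struct translated_strides t;
    assert(istride >= 1 && idist >= 1);

    t.ostride_t = istride;
    if (idist == 1 && idist < istride)
        t.odist_t = 1;            /* interleaved components */
    else if (dir == SKETCH_REAL_TO_COMPLEX)
        t.odist_t = idist / 2;    /* two reals per complex */
    else
        t.odist_t = idist * 2;
    return t;
}
```

Both special cases described earlier fall out of this function:
contiguous arrays take one of the last two branches, and interleaved
components take the first.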