The design and implementation of FFTW3
- Proceedings of the IEEE
, 2005
Cited by 255 (4 self)
FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm. Keywords—Adaptive software, cosine transform, fast Fourier transform (FFT), Fourier transform, Hartley transform, I/O tensor.
Numerical Recipes in C: The Art of Scientific Computing. Second Edition
, 1992
Cited by 75 (0 self)
This reprinting is corrected to software version 2.10
The Fractional Fourier Transform and Applications
, 1995
Cited by 33 (2 self)
This paper describes the "fractional Fourier transform", which admits computation by an algorithm that has complexity proportional to the fast Fourier transform algorithm. Whereas the discrete Fourier transform (DFT) is based on integral roots of unity $e^{-2\pi i/n}$, the fractional Fourier transform is based on fractional roots of unity $e^{-2\pi i\alpha}$, where $\alpha$ is arbitrary. The fractional Fourier transform and the corresponding fast algorithm are useful for such applications as computing DFTs of sequences with prime lengths, computing DFTs of sparse sequences, analyzing sequences with non-integer periodicities, performing high-resolution trigonometric interpolation, detecting lines in noisy images and detecting signals with linearly drifting frequencies. In many cases, the resulting algorithms are faster by arbitrarily large factors than conventional techniques. Bailey is with the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Ames Research Center, Moffett Field,...
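The contrast between integral and fractional roots of unity can be illustrated with a small sketch. It assumes the transform is the sum G_k = sum_j x_j e^{-2 pi i j k alpha}, which is one natural reading of the abstract and not a definition quoted from it. The function names `frft_direct` and `dft_direct` are hypothetical, and the O(n^2) direct evaluation shown here is not the fast algorithm the paper describes.

```python
import cmath

def frft_direct(x, alpha):
    """Direct O(n^2) evaluation of the assumed sum
    G_k = sum_j x[j] * exp(-2*pi*i*j*k*alpha), for k = 0..n-1.
    Illustration only; not the paper's fast algorithm."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k * alpha) for j in range(n))
        for k in range(n)
    ]

def dft_direct(x):
    """Ordinary DFT, based on the integral roots of unity exp(-2*pi*i/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

Under this assumed definition, choosing alpha = 1/n reproduces the ordinary DFT, while any other alpha samples the spectrum on a grid that need not line up with the sequence length.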
Fast Discrete Polynomial Transforms with Applications to Data Analysis for Distance Transitive Graphs
, 1997
Cited by 32 (7 self)
Let $P = \{P_0, \ldots, P_{n-1}\}$ denote a set of polynomials with complex coefficients. Let $Z = \{z_0, \ldots, z_{n-1}\} \subset \mathbb{C}$ denote any set of sample points. For any $f = (f_0, \ldots, f_{n-1}) \in \mathbb{C}^n$ the discrete polynomial transform of $f$ (with respect to $P$ and $Z$) is defined as the collection of sums, $\{\hat{f}(P_0), \ldots, \hat{f}(P_{n-1})\}$, where $\hat{f}(P_j) = \langle f, P_j \rangle = \sum_{i=0}^{n-1} f_i P_j(z_i) w(i)$ for some associated weight function $w$. These sorts of transforms find important applications in areas such as medical imaging and signal processing. In this paper we present fast algorithms for computing discrete orthogonal polynomial transforms. For a system of $N$ orthogonal polynomials of degree at most $N - 1$ we give an $O(N \log^2 N)$ algorithm for computing a discrete polynomial transform at an arbitrary set of points instead of the $N^2$ operations required by direct evaluation. Our algorithm depends only on the fact that orthogonal polynomial sets satisfy a thre...
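The defining sum can be written out directly as a sketch of the quadratic-cost baseline that the abstract compares against. The function names, the coefficient-list representation of polynomials and the default weight of 1 are all assumptions made for illustration. This is not the fast algorithm from the paper.

```python
def horner(coeffs, z):
    """Evaluate a polynomial at z; coeffs are in ascending order of degree."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

def discrete_polynomial_transform(f, polys, points, weight=lambda i: 1):
    """Direct evaluation of f_hat(P_j) = sum_i f[i] * P_j(z_i) * w(i).
    Costs one polynomial evaluation per (i, j) pair, so it is the
    quadratic baseline, not the fast algorithm from the paper."""
    if len(f) != len(points):
        raise ValueError("f and points must have the same length")
    return [
        sum(f[i] * horner(p, points[i]) * weight(i) for i in range(len(f)))
        for p in polys
    ]
```

For example, with the monomials 1, x, x^2 given as `[[1], [0, 1], [0, 0, 1]]`, sample points `[-1, 0, 1]`, data `[1, 2, 3]` and unit weights, the three sums are 6, 2 and 4.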
Stereo Inverse Perspective Mapping: Theory and Applications
- Image and Vision Computing Journal
, 1998
Cited by 23 (15 self)
This paper discusses an extension to the Inverse Perspective Mapping geometrical transform to the processing of stereo images and presents the calibration method used on the ARGO autonomous vehicle. The article features also an example of application in the automotive field, in which the stereo Inverse Perspective Mapping helps to speed up the process. 1 Introduction The processing of images is generally performed at different levels, the lowest of which is characterized by the preservation of the data structure after the processing. Different techniques have been introduced for low-level image processing and can be classified in three main categories: Pointwise operations, Cellular Automaton operations, and Global operations [1]. In particular Global operations are transforms between different domains; their application simplifies the detection of image features which, conversely, would require a more complex computation in the original domain. They are not based on a one-to-one map...
Lossless Acceleration Of Fractal Image Compression By Fast Convolution
- Proc. IEEE Int. Conf. on Image Processing
, 1996
Cited by 18 (8 self)
In fractal image compression the encoding step is computationally expensive. We present a new technique for reducing the computational complexity. It is lossless, i.e., it does not sacrifice any image quality for the sake of the speedup. It is based on a codebook coherence characteristic to fractal image compression and leads to a novel application of the fast Fourier transform-based convolution. The method provides a new conceptual view of fractal image compression. This paper focuses on the implementation issues and presents the first empirical experiments analyzing the performance benefits of the convolution approach to fractal image compression depending on image size, range size, and codebook size. The results show acceleration factors for large ranges up to 23 (larger factors possible), outperforming all other currently known lossless acceleration methods for such range sizes. 1. INTRODUCTION In fractal image compression [1, 2] image blocks (ranges) have to be compared against a...
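The building block named in this abstract, FFT-based convolution, can be sketched generically. The code below shows only the convolution theorem for one-dimensional circular convolution, using a plain recursive radix-2 FFT. It is not the paper's encoder, and all function names are hypothetical. Input lengths are assumed to be equal powers of two.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.
    With invert=True the result is the unnormalized inverse transform."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def circular_convolve_fft(x, y):
    """Circular convolution via pointwise product in the frequency domain."""
    n = len(x)
    spectrum = [p * q for p, q in zip(fft(x), fft(y))]
    return [v / n for v in fft(spectrum, invert=True)]

def circular_convolve_direct(x, y):
    """Reference O(n^2) circular convolution for comparison."""
    n = len(x)
    return [sum(x[j] * y[(k - j) % n] for j in range(n)) for k in range(n)]
```

Both functions return the same values up to floating-point rounding. The FFT route needs O(n log n) operations where the direct sum needs O(n^2), which is the general source of the kind of speedup the abstract reports.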
Performing out-of-core FFTs on parallel disk systems
- PARALLEL COMPUTING
, 1998
Cited by 17 (7 self)
The Fast Fourier Transform (FFT) plays a key role in many areas of computational science and engineering. Although most one-dimensional FFT problems can be solved entirely in main memory, some important classes of applications require out-of-core techniques. For these, use of parallel I/O systems can improve performance considerably. This paper shows how to perform one-dimensional FFTs using a parallel disk system with independent disk accesses. We present both analytical and experimental results for performing out-of-core FFTs in two ways: using traditional virtual memory with demand paging, and using a provably asymptotically optimal algorithm for the Parallel Disk Model (PDM) of Vitter and Shriver. When run on a DEC 2100 server with a large memory and eight parallel disks, the optimal algorithm for the PDM runs up to 144.7 times faster than in-core methods under demand paging. Moreover, even including I/O costs, the normalized times for the optimal PDM algorithm are competitive with, or better than, those for in-core methods even when they run entirely in memory.
Automatic Generation of Fast Discrete Signal Transforms
, 2001
Cited by 14 (7 self)
This paper presents an algorithm that derives fast versions for a broad class of discrete signal transforms symbolically. The class includes but is not limited to the discrete Fourier and the discrete trigonometric transforms. This is achieved by finding fast sparse matrix factorizations for the matrix representations of these transforms. Unlike previous methods, the algorithm is entirely automatic and uses the defining matrix as its sole input. The sparse matrix factorization algorithm consists of two steps: First, the "symmetry" of the matrix is computed in the form of a pair of group representations; second, the representations are stepwise decomposed, giving rise to a sparse factorization of the original transform matrix. We have successfully demonstrated the method by computing automatically efficient transforms in several important cases: For the DFT, we obtain the Cooley-Tukey FFT; for a class of transforms including the DCT, type II, the number of arithmetic operations for our fast transforms is the same as for the best-known algorithms. Our approach provides new insights and interpretations for the structure of these signal transforms and the question of why fast algorithms exist. The sparse matrix factorization algorithm is implemented within the software package AREP.

