FFTW: An Adaptive Software Architecture For The FFT
, 1998
Cited by 605 (4 self)
FFT literature has been mostly concerned with minimizing the number of floating-point operations performed by an algorithm. Unfortunately, on present-day microprocessors this measure is far less important than it used to be, and interactions with the processor pipeline and the memory hierarchy have a larger impact on performance. Consequently, one must know the details of a computer architecture in order to design a fast algorithm. In this paper, we propose an adaptive FFT program that tunes the computation automatically for any particular hardware. We compared our program, called FFTW, with over 40 implementations of the FFT on 7 machines. Our tests show that FFTW's self-optimizing approach usually yields significantly better performance than all other publicly available software. FFTW also compares favorably with machine-specific, vendor-optimized libraries. 1. INTRODUCTION The discrete Fourier transform (DFT) is an important tool in many branches of science and engineering [1] and...
A Fast Fourier Transform Compiler
, 1999
Cited by 193 (5 self)
FFTW library for computing the discrete Fourier transform (DFT) has gained a wide acceptance in both academia and industry, because it provides excellent performance on a variety of machines (even competitive with or faster than equivalent libraries supplied by vendors). In FFTW, most of the performance-critical code was generated automatically by a special-purpose compiler, called genfft, that outputs C code. Written in Objective Caml, genfft can produce DFT programs for any input length, and it can specialize the DFT program for the common case where the input data are real instead of complex. Unexpectedly, genfft “discovered” algorithms that were previously unknown, and it was able to reduce the arithmetic complexity of some other existing algorithms. This paper describes the internals of this special-purpose compiler in some detail, and it argues that a specialized compiler is a valuable tool.
FAST FOURIER TRANSFORMS: A TUTORIAL REVIEW AND A STATE OF THE ART
, 1990
Cited by 129 (2 self)
The publication of the Cooley-Tukey fast Fourier transform (FFT) algorithm in 1965 has opened a new area in digital signal processing by reducing the order of complexity of some crucial computational tasks like Fourier transform and convolution from N^2 to N log2 N, where N is the problem size. The development of the major algorithms (Cooley-Tukey and split-radix FFT, prime factor algorithm and Winograd fast Fourier transform) is reviewed. Then, an attempt is made to indicate the state of the art on the subject, showing the standing of research, open problems and implementations.
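The complexity reduction mentioned in this abstract can be illustrated with a minimal sketch. The function names `dft_naive` and `fft_radix2` are hypothetical and do not come from the paper; the second function is a plain recursive radix-2 Cooley-Tukey variant that assumes a power-of-two input length.

```python
import cmath


def dft_naive(x):
    """Direct evaluation of the DFT definition: about N^2 complex multiplications."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT: about N log2 N operations.

    Assumption of this sketch: len(x) is a power of two.
    """
    n = len(x)
    if n <= 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])  # DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])   # DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd part
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Both functions compute the same transform up to rounding error; only the operation count differs.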
SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms
 Journal of High Performance Computing and Applications
, 2004
Cited by 82 (20 self)
SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be reoptimized as the hardware is upgraded or replaced. This paper describes the main components of SPIRAL: the mathematical framework that concisely describes signal transforms and their fast algorithms; the formula generator that captures at the algorithmic level the degrees of freedom in expressing a particular signal processing transform; the formula translator that encapsulates the compilation degrees of freedom when translating a specific algorithm into an actual code implementation; and, finally, an intelligent search engine that finds within the large space of alternative formulas and implementations
The Fastest Fourier Transform in the West
Proceedings of the 1998 International Conference on Acoustics, Speech, and Signal Processing, ICASSP '98
, 1997
Cited by 73 (2 self)
This paper describes FFTW, a portable C package for computing the one- and multidimensional complex discrete Fourier transform (DFT). FFTW is typically faster than all other publicly available DFT software, including the well-known FFTPACK and the code from Numerical Recipes. More interestingly, FFTW is competitive with or better than proprietary, highly-tuned codes such as Sun's Performance Library and IBM's ESSL library. FFTW implements the Cooley-Tukey fast Fourier transform, and is freely available on the Web at http://theory.lcs.mit.edu/fftw. Three main ideas are the keys to FFTW's performance. First, the computation of the transform is performed by an executor consisting of highly-optimized, composable blocks of C code called codelets. Second, at runtime, a planner finds an efficient way (called a `plan') to compose the codelets. Through the planner, FFTW adapts itself to the architecture of the machine it is running on. Third, the codelets are automatically generated by a code...
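The executor and planner idea in this abstract can be sketched roughly as follows. This is not FFTW's API or its actual planning algorithm: `direct_dft`, `execute`, and `make_plan` are hypothetical names, the "codelet" is stood in for by a direct DFT on short inputs, and the "plan" is reduced to a single number, the size at which recursion stops.

```python
import cmath
import time


def direct_dft(x):
    """Stand-in for a codelet (assumption): direct DFT of a short input."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def execute(x, leaf):
    """Executor: split radix-2 until a piece has at most `leaf` samples."""
    n = len(x)
    if n <= leaf or n % 2:
        return direct_dft(x)
    even = execute(x[0::2], leaf)
    odd = execute(x[1::2], leaf)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def make_plan(n, leaf_sizes=(1, 2, 4, 8, 16), repeats=3):
    """Planner: time each candidate leaf size on this machine, keep the fastest."""
    sample = [complex(i % 7, 0) for i in range(n)]
    best_leaf, best_time = None, float("inf")
    for leaf in leaf_sizes:
        for _ in range(repeats):
            start = time.perf_counter()
            execute(sample, leaf)
            elapsed = time.perf_counter() - start
            if elapsed < best_time:
                best_leaf, best_time = leaf, elapsed
    return best_leaf
```

A caller would run `leaf = make_plan(1024)` once and then reuse `execute(data, leaf)` for every transform of that size, so the measurement cost is paid only at planning time. Which leaf size wins depends on the machine, which is the point of measuring rather than predicting.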
Multidigit Multiplication For Mathematicians
, 2001
Cited by 35 (8 self)
This paper surveys techniques for multiplying elements of various commutative rings. It covers Karatsuba multiplication, dual Karatsuba multiplication, Toom multiplication, dual Toom multiplication, the FFT trick, the twisted FFT trick, the split-radix FFT trick, Good's trick, the Schönhage-Strassen trick, Schönhage's trick, Nussbaumer's trick, the cyclic Schönhage-Strassen trick, and the Cantor-Kaltofen theorem. It emphasizes the underlying ring homomorphisms.
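As an illustration of the first technique on this list, here is a minimal sketch of Karatsuba multiplication. It works on non-negative Python integers rather than on general commutative rings as the paper does, and the function name `karatsuba` and the base-case cutoff of 16 are assumptions of this sketch.

```python
def karatsuba(a, b):
    """Multiply non-negative integers using three half-size products instead of four."""
    if a < 16 or b < 16:
        return a * b  # small operands: fall back to the built-in product
    half = max(a.bit_length(), b.bit_length()) // 2
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask  # a = a_hi * 2^half + a_lo
    b_hi, b_lo = b >> half, b & mask  # b = b_hi * 2^half + b_lo
    low = karatsuba(a_lo, b_lo)
    high = karatsuba(a_hi, b_hi)
    # (a_lo + a_hi)(b_lo + b_hi) - low - high equals a_lo*b_hi + a_hi*b_lo
    cross = karatsuba(a_lo + a_hi, b_lo + b_hi) - low - high
    return (high << (2 * half)) + (cross << half) + low
```

The saving comes from recovering the cross term out of one extra product and two subtractions, which is what brings the cost below the schoolbook quadratic bound.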
Optimizing Sorting with Genetic Algorithms
In the International Symposium on Code Generation and Optimization
, 2005
Fast Automatic Generation of DSP Algorithms
, 2001
Cited by 21 (9 self)
SPIRAL is a generator of optimized, platform-adapted libraries for digital signal processing algorithms. SPIRAL's strategy translates the implementation task into a search in an expanded space of alternatives.
Stochastic Search for Signal Processing Algorithm Optimization
, 2001
Cited by 19 (5 self)
This paper presents an evolutionary algorithm for searching for the optimal implementations of signal transforms and compares this approach against other search techniques. A single signal processing algorithm can be represented by a very large number of different but mathematically equivalent formulas. When these formulas are implemented in actual code, unfortunately their running times differ significantly. Signal processing algorithm optimization aims at finding the fastest formula.
Portable High-Performance Programs
, 1999
Cited by 19 (0 self)
right notice and this permission notice are preserved on all copies.