Results 1 - 10 of 50
The design and implementation of FFTW3
 Proceedings of the IEEE
, 2005
Cited by 407 (3 self)
FFTW is an implementation of the discrete Fourier transform (DFT) that adapts to the hardware in order to maximize performance. This paper shows that such an approach can yield an implementation that is competitive with hand-optimized libraries, and describes the software structure that makes our current FFTW3 version flexible and adaptive. We further discuss a new algorithm for real-data DFTs of prime size, a new way of implementing DFTs by means of machine-specific single-instruction, multiple-data (SIMD) instructions, and how a special-purpose compiler can derive optimized implementations of the discrete cosine and sine transforms automatically from a DFT algorithm. Keywords: Adaptive software, cosine transform, fast Fourier transform (FFT), Fourier transform, Hartley transform, I/O tensor.
A Fast Fourier Transform Compiler
, 1999
Cited by 154 (5 self)
The FFTW library for computing the discrete Fourier transform (DFT) has gained wide acceptance in both academia and industry, because it provides excellent performance on a variety of machines (even competitive with or faster than equivalent libraries supplied by vendors). In FFTW, most of the performance-critical code was generated automatically by a special-purpose compiler, called genfft, that outputs C code. Written in Objective Caml, genfft can produce DFT programs for any input length, and it can specialize the DFT program for the common case where the input data are real instead of complex. Unexpectedly, genfft “discovered” algorithms that were previously unknown, and it was able to reduce the arithmetic complexity of some other existing algorithms. This paper describes the internals of this special-purpose compiler in some detail, and it argues that a specialized compiler is a valuable tool.
Superfast solution of real positive definite Toeplitz systems
 SIAM J. Matrix Anal. Appl
, 1988
Cited by 54 (1 self)
Abstract. We describe an implementation of the generalized Schur algorithm for the superfast solution of real positive definite Toeplitz systems of order $n + 1$, where $n = 2^\nu$. Our implementation uses the split-radix fast Fourier transform algorithms for real data of Duhamel. We are able to obtain the $n$th Szegő polynomial using fewer than $8n \log_2^2 n$ real arithmetic operations without explicit use of the bit-reversal permutation. Since Levinson’s algorithm requires slightly more than $2n^2$ operations to obtain this polynomial, we achieve crossover with Levinson’s algorithm at $n = 256$. Key words. Toeplitz matrix, Schur’s algorithm, split-radix Fast Fourier Transform
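The crossover quoted in this abstract can be checked directly from the two operation counts. The sketch below is an illustration only: it treats the stated bounds (8n log2^2 n for the superfast method, 2n^2 for Levinson's algorithm) as exact counts, which the abstract does not claim, and the function names are hypothetical.

```python
import math


def superfast_ops(n):
    """Upper bound from the abstract: 8 * n * (log2 n)^2 real operations."""
    return 8 * n * math.log2(n) ** 2


def levinson_ops(n):
    """Approximate count from the abstract: 2 * n^2 operations."""
    return 2 * n ** 2


if __name__ == "__main__":
    # Compare the two counts at powers of two around the quoted crossover.
    for nu in range(6, 11):
        n = 2 ** nu
        print(n, superfast_ops(n), levinson_ops(n))
```

Setting the two expressions equal gives 4 (log2 n)^2 = n, which holds at n = 256 because 4 * 8^2 = 256. Below that size the Levinson count is smaller; above it the superfast bound is smaller.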
PocketSphinx: A free, real-time continuous speech recognition system for handheld devices
 in Proceedings of ICASSP
, 2006
Cited by 36 (1 self)
The availability of real-time continuous speech recognition on mobile and embedded devices has opened up a wide range of research opportunities in human-computer interactive applications. Unfortunately, most of the work in this area to date has been confined to proprietary software, or has focused on limited domains with constrained grammars. In this paper, we present a preliminary case study on the porting and optimization of CMU SPHINX-II, a popular open source large vocabulary continuous speech recognition (LVCSR) system, to handheld devices. The resulting system operates in an average 0.87 times real-time on a 206 MHz device, 8.03 times faster than the baseline system. To our knowledge, this is the first handheld LVCSR system available under an open-source license.
A modified split-radix FFT with fewer arithmetic operations
 IEEE Trans. Signal Processing
, 2007
Portable High-Performance Programs
, 1999
Cited by 17 (0 self)
right notice and this permission notice are preserved on all copies.
A real-time blind source separation scheme and its application to reverberant and noisy acoustic environments
, 2006
Analytic Confidence Level Calculations Using the Likelihood Ratio and Fourier Transform CERN
, 2000
Cited by 10 (0 self)
The interpretation of new particle search results involves a confidence level calculation on either the discovery hypothesis or the background-only (“null”) hypothesis. A typical approach uses toy Monte Carlo experiments to build an expected experiment estimator distribution against which an observed experiment’s estimator may be compared. In this note, a new approach is presented which calculates analytically the experiment estimator distribution via a Fourier transform, using the likelihood ratio as an ordering estimator. The analytic approach enjoys an enormous speed advantage over the toy Monte Carlo method, making it possible to quickly and precisely calculate confidence level results.
Active contour external force using vector field convolution for image segmentation
 IEEE Transactions on Image Processing
Cited by 8 (1 self)
Abstract: Snakes, or active contours, have been widely used in image processing applications. Typical roadblocks to consistent performance include limited capture range, noise sensitivity, and poor convergence to concavities. This paper proposes a new external force for active contours, called vector field convolution (VFC), to address these problems. VFC is calculated by convolving the edge map generated from the image with the user-defined vector field kernel. We propose two structures for the magnitude function of the vector field kernel, and we provide an analytical method to estimate the parameter of the magnitude function. Mixed VFC is introduced to alleviate the possible leakage problem caused by choosing inappropriate parameters. We also demonstrate that the standard external force and the gradient vector flow (GVF) external force are special cases of VFC in certain scenarios. Examples and comparisons with GVF are presented in this paper to show the advantages of this innovation, including superior noise robustness, reduced computational cost, and the flexibility of tailoring the force field. Index Terms: Active contours, deformable models, external force, gradient vector flow (GVF), snakes, vector field convolution (VFC).
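The convolution step described in this abstract can be sketched in a few lines of pure Python. This is a minimal illustration, not the paper's implementation: the abstract does not give the kernel's form, so the sketch assumes kernel vectors that point toward the kernel origin with magnitude (r + eps)^(-gamma), and the names `vfc_kernel`, `convolve2d`, `gamma`, and `eps` are hypothetical.

```python
import math


def vfc_kernel(radius, gamma=2.0, eps=1e-8):
    """Build x and y components of a vector field kernel (assumed form).

    Each vector points toward the kernel origin; its magnitude decays
    as (r + eps) ** (-gamma), an assumption made for this sketch.
    """
    size = 2 * radius + 1
    kx = [[0.0] * size for _ in range(size)]
    ky = [[0.0] * size for _ in range(size)]
    for j in range(size):
        for i in range(size):
            x, y = i - radius, j - radius
            r = math.hypot(x, y)
            if r == 0:
                continue  # leave a zero vector at the origin
            m = (r + eps) ** (-gamma)
            kx[j][i] = m * (-x / r)
            ky[j][i] = m * (-y / r)
    return kx, ky


def convolve2d(image, kernel):
    """Zero-padded 2-D convolution returning an output the size of image."""
    h, w = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    sy, sx = y - dy, x - dx
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += image[sy][sx] * kernel[dy + radius][dx + radius]
            out[y][x] = acc
    return out


# Toy edge map: a single edge pixel in the middle of a 5x5 grid.
edge_map = [[0.0] * 5 for _ in range(5)]
edge_map[2][2] = 1.0

kx, ky = vfc_kernel(radius=2)
fx = convolve2d(edge_map, kx)  # x component of the external force
fy = convolve2d(edge_map, ky)  # y component of the external force
```

With a single edge pixel, the resulting field at every other pixel points back toward that pixel, which is the behaviour an external force needs in order to pull a contour toward edges from a distance.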
Robust extended multidelay filter and double-talk detector for acoustic echo cancellation
 IEEE TRANS. SPEECH AUDIO PROCESS
, 2006
Cited by 7 (6 self)
We propose an integrated acoustic echo cancellation solution based on a novel class of efficient and robust adaptive algorithms in the frequency domain, the extended multidelay filter (EMDF). The approach is tailored to very long adaptive filters and highly auto-correlated input signals as they arise in wideband full-duplex audio applications. The EMDF algorithm allows an attractive tradeoff between the well-known multidelay filter and the recursive least-squares algorithm. It exhibits fast convergence, superior tracking capabilities of the signal statistics, and very low delay. The low computational complexity of the conventional frequency-domain adaptive algorithms can be maintained thanks to efficient fast realizations. We also show how this approach can be combined efficiently with a suitable double-talk detector (DTD). We consider a corresponding extension of a recently proposed DTD based on a normalized cross-correlation vector whose performance was shown to be superior compared to other DTDs based on the cross-correlation coefficient. Since the resulting DTD also has an EMDF structure it is easy to implement, and the fast realization also carries over to the DTD scheme. Moreover, as the robustness issue during double talk is particularly crucial for fast-converging algorithms, we apply the concept of robust statistics into our extended frequency-domain approach. Due to the robust generalization of the cost function leading to a so-called M-estimator, the algorithms become inherently less sensitive to outliers, i.e., short bursts that may be caused by inevitable detection failures of a DTD. The proposed structure is also well suited for an efficient generalization to the multichannel case.