## Autoregressive modeling of temporal envelopes (2007)

### Other Repositories/Bibliography

Venue: | IEEE Transactions on Signal Processing, vol. 55, no. 11, November 2007. URL: http://www.ee.columbia.edu/~dpwe/pubs/AthinE07-fdlp.pdf |

Citations: | 22 - 3 self |

### BibTeX

@ARTICLE{Athineos07autoregressivemodeling,
    author = {Marios Athineos and Daniel P. W. Ellis},
    title = {Autoregressive modeling of temporal envelopes},
    journal = {IEEE Transactions on Signal Processing},
    volume = {55},
    number = {11},
    month = nov,
    year = {2007},
    url = {http://www.ee.columbia.edu/~dpwe/pubs/AthinE07-fdlp.pdf}
}

### Abstract

Autoregressive (AR) models are commonly fitted to the linear autocorrelation of a discrete-time signal to obtain an all-pole estimate of the signal’s power spectrum. We are concerned with the dual, frequency-domain problem. We derive the relationship between the discrete-frequency linear autocorrelation of a spectrum and the temporal envelope of a signal. In particular, we focus on the real spectrum obtained by a type-I odd-length discrete cosine transform (DCT-Io), which leads to the all-pole envelope of the corresponding symmetric squared Hilbert temporal envelope. A compact linear algebra notation for the familiar concepts of AR modeling clearly reveals the dual symmetries between modeling in time and frequency domains. By using AR models in both domains in cascade, we can jointly estimate the temporal and spectral envelopes of a signal. We model the temporal envelope of the residual of regular AR modeling to efficiently capture signal structure in the most appropriate domain.

Index Terms: Autoregressive (AR) modeling, frequency-domain linear prediction (FDLP), Hilbert envelope, linear prediction in spectral domain (LPSD), temporal noise shaping (TNS).
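The duality described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the paper works with the DCT-Io, whereas this sketch substitutes a plain DCT-II built as an explicit cosine matrix; the normal equations are solved with a generic linear solver rather than a Levinson-Durbin recursion; and the function names (`ar_fit`, `all_pole_power`, `dct2`, `fdlp_envelope`) are hypothetical, not taken from the paper. The same `ar_fit` routine yields a spectral envelope when applied to the time signal and an approximate temporal envelope when applied to the cosine-transform coefficients.

```python
import numpy as np

def ar_fit(seq, order):
    """Autocorrelation-method AR fit. Returns (a, gain) with a[0] == 1."""
    n = len(seq)
    full = np.correlate(seq, seq, mode="full")   # zero lag sits at index n - 1
    r = full[n - 1 : n + order]                  # lags 0..order
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx]                                   # symmetric Toeplitz matrix
    coeffs = np.linalg.solve(R, -r[1 : order + 1])
    a = np.concatenate(([1.0], coeffs))
    gain = r[0] + np.dot(coeffs, r[1 : order + 1])  # prediction error power
    return a, gain

def all_pole_power(a, gain, omegas):
    """Evaluate gain / |A(e^{j omega})|^2 at the given angles."""
    k = np.arange(len(a))
    A = np.exp(-1j * np.outer(omegas, k)) @ a
    return gain / np.abs(A) ** 2

def dct2(x):
    """Unnormalized DCT-II (assumption: stands in for the paper's DCT-Io)."""
    n = len(x)
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :] + 0.5
    return np.cos(np.pi * k * t / n) @ x

def fdlp_envelope(x, order):
    """AR-model the cosine-transform coefficients; the model's 'spectrum',
    read along the angle axis, approximates the squared temporal envelope."""
    n = len(x)
    a, gain = ar_fit(dct2(x), order)
    omegas = np.pi * (np.arange(n) + 0.5) / n    # angle that maps to time index
    return all_pole_power(a, gain, omegas)

# Demo: a noise burst centered at sample 200 on a low noise floor.
rng = np.random.default_rng(0)
n, center, width = 512, 200, 15.0
t = np.arange(n)
burst = np.exp(-0.5 * ((t - center) / width) ** 2)
x = (0.05 + burst) * rng.standard_normal(n)

env = fdlp_envelope(x, order=12)   # temporal envelope (frequency-domain AR)
a_time, g_time = ar_fit(x, 12)     # regular time-domain AR model
peak = int(np.argmax(env))         # should fall near the burst center
```

The low noise floor in the demo keeps the Toeplitz system well conditioned; a signal that is exactly zero outside the burst would likely need regularization before solving.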

### Citations

940 | Theory of communication - Gabor - 1946 |
Citation Context: ... input vector prior to the transform. Note that affects only the first element of the time and frequency domain signals. D. Discrete-Time “Analytic” Signal The analytic signal was introduced by Gabor [16]. Its fundamental property is that its spectrum vanishes for negative frequencies or, put another way, it is “causal” in the frequency domain. By this definition, a discrete-time signal cannot be anal...

629 | Perceptual linear predictive (PLP) analysis of speech - Hermansky - 1990 |
Citation Context: ... suppress fine detail while preserving the broad structure of formants has led to its widespread use in speech recognition preprocessing, for instance in “perceptual linear prediction” (PLP) features [2]. In this application, the all-pole filter defined by the optimal difference equation coefficients is taken as the description of a smoothed spectral envelope: the magnitude of the z-transform of that f...

465 | Linear prediction: a tutorial review - Makhoul - 1975 |
Citation Context: ...residual from passing the original signal through the FIR filter defined by the AR coefficients is simply the first elements of from (20) i.e., . The average minimum total squared error as defined in [22] is given by (22) and it will be used as a goodness-of-fit measure in Section IV-B. Fig. 2. Block diagrams of AR models. On the l...

329 | Circulant Matrices - Davis - 1979 |
Citation Context: ...in a matrix form. If and are - and -dimensional vectors respectively, we must zero-pad each one to length when convolving them, i.e., to avoid circular aliasing (19) where is a right-circulant matrix [20], [21] with as its first column (i.e., generated by ). Convolution is commutative, so (19) is also equal to where is the right-circulant matrix generated by . AR modeling is equivalent to finding the ...

319 | Time-Frequency Analysis - Cohen - 1995 |

239 | Digital Coding of Waveforms – Principles and Applications - Jayant, Noll - 1984 |
Citation Context: ...iltering of a low-order Fourier approximation. This idea was first applied in audio coding by Herre and Johnston [3] who dubbed it temporal noise shaping (TNS). This frequency-domain version of D*PCM [4] was used to eliminate pre-echo artifacts associated with transients in perceptual audio coders such as MPEG2 AAC by factoring-out the parameterized time envelope prior to quantization, then reintrodu...

102 | Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates - Schroeder, Atal - 1985 |
Citation Context: ...ication we present is modeling the residual of a regular time-domain AR model. A common way to model the pitch pulses typically remaining in the residual is through a second long-term predictor (LTP) [27]. Our cascade and joint AR models can parameterize each pitch pulse in the second, frequency-domain AR model, and thereby flatten the temporal envelope of the overall residual. In Fig. 5, we plot the ...

74 | Time–Frequency Analysis. Englewood Cliffs - Cohen - 1995 |
Citation Context: ...nalytic signal using the DFT. In the time domain, the analytic signal is complex with its real part being the original signal and its imaginary part being the Hilbert transform of the original signal [19]. The squared [running header: 5240, IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 11, NOVEMBER 2007] magnitude of this time-domain signal is the temporal envelope we will approximate through AR modeling, by show...

48 | Symmetric convolution and the discrete sine and cosine transforms - Martucci - 1994 |
Citation Context: ...s rotated by elements to start with . [Equations (1)-(6) not recovered.] B. WSHS Symmetry of Autocorrelation In the terminology of Martucci [14], autocorrelation is left whole-sample symmetric, right half-sample symmetric (WSHS). Formally, an -point sequence is WSHS symmetric if and is always an odd-length sequence. An infinite periodic extens...

41 | Enhancing the performance of perceptual audio coders by using temporal noise shaping - Herre, Johnston - 1996 |
Citation Context: ...a even for low-order models, this approach can be preferable to the implicit low-pass filtering of a low-order Fourier approximation. This idea was first applied in audio coding by Herre and Johnston [3] who dubbed it temporal noise shaping (TNS). This frequency-domain version of D*PCM [4] was used to eliminate pre-echo artifacts associated with transients in perceptual audio coders such as MPEG2 AAC...

33 | Computing the discrete-time “analytic” signal via FFT - Marple - 1999 |
Citation Context: ...ine a discrete-time “analytic” signal is by forcing the spectrum to be “periodically causal” [17] meaning that the second half of each periodic repetition of the spectrum is forced to be zero. Marple [18] used this definition in order to derive a discrete-time analytic signal using the DFT. In the time domain, the analytic signal is complex with its real part being the original signal and its imaginar...

31 | Speech Formant Trajectory Estimation Using Dynamic Programming with Modulated Transition Costs - Talkin - 1987 |
Citation Context: ...tting AR models to short-time windows of the speech signal, factoring to identify the individual poles, then constructing formant trajectories from the succession of center frequencies of these poles [1]. While explicit formant tracks turn out to be a brittle basis for speech recognition, the properties of AR modeling to suppress fine detail while preserving the broad structure of formants has led to...

31 | Advances in parametric coding for high-quality audio - Schuijers, Oomen, et al. |
Citation Context: ...audio textures. Sounds such as rain or footsteps are rich in temporal micro-transients which are well represented by the FDLP model [24]. A similar application in coding was investigated by Schuijers [25]. He used FDLP to model the temporal envelope of a noise-excited segment but substituted the spectral AR-moving average (ARMA) model with a Laguerre filter. Schuijers’ model later became part of the M...

28 | Model-based approach to envelope and positive instantaneous frequency estimation of signals with speech applications - Kumaresan, Rao - 1999 |

19 | Frequency-domain linear prediction for temporal features - Athineos, Ellis - 2003 |
Citation Context: ...med signal , are the coefficients of the filter we are estimating (this time in the DCT-Io domain) and is again a residual to be minimized. We call this model frequency-domain linear prediction or FDLP [23]. The solution of this equation involves the autocorrelation of , i.e., the circulant matrix and the vector . From (17) and (18), we can see that the magnitude of the AR filter defined by evaluated on...

19 | Sound texture modelling with linear prediction in both time and frequency domains - Athineos, Ellis - 2003 |
Citation Context: ... cascade model for audio synthesis, specifically in the modeling of audio textures. Sounds such as rain or footsteps are rich in temporal micro-transients which are well represented by the FDLP model [24]. A similar application in coding was investigated by Schuijers [25]. He used FDLP to model the temporal envelope of a noise-excited segment but substituted the spectral AR-moving average (ARMA) model...

9 | The discrete W transform - Wang, Hunt - 1985 |
Citation Context: ... then the corresponding sampled power spectrum in (6) will also be WSHS. C. Discrete Cosine Transform Type I Odd (DCT-Io) Out of the 16 discrete trigonometric transforms (DTT) first tabulated by Wang [15], we are interested in the DCT-Io which is the only one related to the DFT through the WSHS SEO operator [14], meaning that the WSHS symmetry property of the autocorrelation can be preserved through t...

5 | Temporal noise shaping, quantization and coding methods in perceptual audio coding: a tutorial introduction - Herre - 1999 |
Citation Context: ... “reverberation” or temporal-smearing pre-echo artifacts to signals that were “peaky” in time, and TNS eliminated these artifacts. In their original and subsequent papers [3], [5]–[7] Herre and Johnston motivated TNS by citing the duality between the squared Hilbert envelope and the power spectrum for continuous signals, but no exact derivation for finite-length discrete-time sign...

5 | An inverse signal approach to computing the envelope of a real valued signal - Kumaresan - 1998 |
Citation Context: ...S by citing the duality between the squared Hilbert envelope and the power spectrum for continuous signals, but no exact derivation for finite-length discrete-time signals was given. Kumaresan et al. [8]–[13] have also addressed the problem of AR modeling of the temporal envelope of a signal. Specifically, in [8] Kumaresan formulated the so-called linear prediction in the spectral domain (LPSD) equat...

4 | A continuously signal-adaptive filter bank for high quality perceptual audio coding - Herre, Johnston - 1997 |
Citation Context: ... “reverberation” or temporal-smearing pre-echo artifacts to signals that were “peaky” in time, and TNS eliminated these artifacts. In their original and subsequent papers [3], [5]–[7] Herre and Johnston motivated TNS by citing the duality between the squared Hilbert envelope and the power spectrum for continuous signals, but no exact derivation for finite-length discrete-time ...

4 | Exploiting Both Time and Frequency Structure in a System that Uses an Analysis / Synthesis Filterbank with High Frequency Resolution - Herre, Johnston - 1997 |

2 | On representing signals using only timing information - Kumaresan, Wang - 2002 |
Citation Context: ... citing the duality between the squared Hilbert envelope and the power spectrum for continuous signals, but no exact derivation for finite-length discrete-time signals was given. Kumaresan et al. [8]–[13] have also addressed the problem of AR modeling of the temporal envelope of a signal. Specifically, in [8] Kumaresan formulated the so-called linear prediction in the spectral domain (LPSD) equations....

2 | Coding of audio-visual objects. Part 3: Audio, Amendment 2: Parametric coding of high quality audio, ISO/IEC Int. Std. - ISO/IEC - 2004 |

1 | Unique positive FM-AM decomposition of signals - Kumaresan, Rao - 1998 |

1 | On minimum/maximum/all-pass decompositions in time and frequency domains - Kumaresan - 2000 |

1 | On the relationship between line-spectral frequencies and zero-crossings of signals - Kumaresan, Wang - 2001 |

1 | Convolution theorems for linear transforms - Stone - 1998 |
Citation Context: ...atrix form. If and are - and -dimensional vectors respectively, we must zero-pad each one to length when convolving them, i.e., to avoid circular aliasing (19) where is a right-circulant matrix [20], [21] with as its first column (i.e., generated by ). Convolution is commutative, so (19) is also equal to where is the right-circulant matrix generated by . AR modeling is equivalent to finding the FIR fi...

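The citation contexts for Marple and Cohen describe a discrete-time "analytic" signal obtained by zeroing the second half of each spectral period, whose squared magnitude is the temporal envelope being modeled. A minimal sketch of that construction follows, assuming the standard FFT-based recipe (keep DC and, for even lengths, the Nyquist bin; double the positive-frequency bins; zero the rest). The function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
import numpy as np

def analytic_signal(x):
    """Discrete-time 'analytic' signal via the FFT (one-sided spectrum)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0                        # keep DC unchanged
    if n % 2 == 0:
        h[n // 2] = 1.0               # keep the Nyquist bin unchanged
        h[1 : n // 2] = 2.0           # double the positive frequencies
    else:
        h[1 : (n + 1) // 2] = 2.0     # odd length: no Nyquist bin
    return np.fft.ifft(X * h)         # negative-frequency bins are zeroed

def squared_hilbert_envelope(x):
    """Squared magnitude of the analytic signal."""
    return np.abs(analytic_signal(x)) ** 2

# Demo: a unit-amplitude cosine on an exact DFT bin has a flat envelope.
t = np.arange(64)
x = np.cos(2 * np.pi * 5 * t / 64)
z = analytic_signal(x)
env = squared_hilbert_envelope(x)
```

The real part of the result reproduces the input, and the imaginary part is its discrete Hilbert transform, which matches the description in the citation context for reference [19].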