Sound Texture Modelling with Linear Prediction in Both Time and Frequency Domains
in Proc. ICASSP, 2003
Cited by 12 (3 self)
Sound textures (for instance, a crackling fire, running water, or applause) constitute a large and largely neglected class of audio signals. Whereas tonal sounds have been effectively and flexibly modelled with sinusoids, aperiodic energy is usually modelled as white noise filtered to match the approximate spectrum of the original over 10-30 ms windows, which fails to provide a perceptually satisfying reproduction of many real-world noisy sound textures. We attribute this failure to the loss of short-term temporal structure, and we introduce a second modelling stage in which the time envelope of the residual from conventional linear predictive modelling is itself modelled with linear prediction in the spectral domain. This cascade time- and frequency-domain linear prediction (CTFLP) leads to noise-excited resyntheses that have high perceptual fidelity. We perform a novel quantitative error analysis by measuring the proportional error within time-frequency cells across a range of timescales.
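The two-stage idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the DCT used as the spectral-domain representation, the autocorrelation method, and every function name here (lpc, residual, dct2, ctflp_analyze) are assumptions made for illustration.

```python
import math

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a real sequence."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def lpc(x, order):
    """Levinson-Durbin recursion (autocorrelation method).
    Returns the prediction-error filter [1, a1, ..., ap]."""
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or perfectly predicted input
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def residual(x, a):
    """Filter x with the prediction-error filter a (zero initial state)."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def dct2(x):
    """Unnormalized DCT-II, O(N^2); assumed here as the spectral transform."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]

def ctflp_analyze(x, time_order, freq_order):
    """Stage 1: time-domain LP captures the spectral envelope.
    Stage 2: LP over the DCT of the residual captures its temporal envelope."""
    a_time = lpc(x, time_order)
    e = residual(x, a_time)
    a_freq = lpc(dct2(e), freq_order)
    return a_time, a_freq
```

For example, ctflp_analyze([0.5 ** n for n in range(50)], 1, 2) yields a first-stage filter close to [1, -0.5], since the input is a pure one-pole decay; the second-stage coefficients then describe how the residual energy is distributed in time.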
Integrating Complementary Spectral Models In The Design Of A Musical Synthesizer
Proceedings of the ICMC, 1997
Cited by 11 (4 self)
Spectral-based analysis/synthesis techniques offer powerful tools for the processing of sounds.
Transmitting Audio Content as Sound Objects
In Proceedings of AES22 International Conference on Virtual, Synthetic and Entertainment Audio, Espoo, Finland, 2002
Cited by 11 (2 self)
As audio and music applications tend toward a higher level of abstraction and aim to fill the gap between the signal processing world and the end user, we are more and more interested in processing content and not (only) signal. This change in point of view leads to the redefinition of several “classical” concepts, and a new conceptual framework needs to be set to give support to these new trends. In [2], a model for the transmission of audio content was introduced. The model is now extended to include the idea of Sound Objects. With these thoughts in mind, examples of design decisions that have led to the implementation of the CLAM framework are also given.
Sound Source Separation in Monaural Music Signals
2006
Cited by 11 (3 self)
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods where the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is
Radial Basis Function Networks for Conversion of Sound Spectra
Proc. of the DAFX99 Conf, 1999
Cited by 10 (1 self)
In many high-level signal processing tasks, such as pitch shifting, voice conversion or sound synthesis, accurate spectral processing is required. Here, the use of Radial Basis Function Networks (RBFN) is proposed for modeling the relationships among sets of spectral envelopes. The identification of such conversion functions is based on a procedure which learns the shape of the conversion from a few pairs of original and target spectra (the training set). The generalization properties of RBFNs provide for interpolation with respect to the pitch range. In the construction of the training set, mel-cepstral encoding of the spectrum is used to capture the perceptually most relevant spectral changes. Moreover, singular value decomposition (SVD) is used to reduce the dimension of conversion functions. The RBFN conversion functions introduced are characterized by a perceptually-based fast training procedure, desirable interpolation properties and computational efficiency.
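As a rough illustration of how an RBF network can learn a mapping from a few training pairs and interpolate between them, here is a minimal sketch. It is not the paper's procedure: the Gaussian kernel, the choice of one center per training input, the exact-interpolation training, and all names (gaussian, solve, train_rbfn, predict) are assumptions, and the mel-cepstral encoding and SVD steps are omitted.

```python
import math

def gaussian(u, v, width):
    """Gaussian radial basis function of the distance between u and v."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * width ** 2))

def solve(A, B):
    """Solve A X = B by Gaussian elimination with partial pivoting.
    A is n x n, B is n x m (lists of rows)."""
    n, m = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + m):
                M[r][c] -= f * M[col][c]
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):
        for j in range(m):
            s = M[i][n + j] - sum(M[i][k] * X[k][j] for k in range(i + 1, n))
            X[i][j] = s / M[i][i]
    return X

def train_rbfn(inputs, targets, width):
    """One Gaussian unit per training input; output weights chosen so the
    network reproduces the training targets exactly."""
    phi = [[gaussian(u, v, width) for v in inputs] for u in inputs]
    return solve(phi, targets)

def predict(x, centers, weights, width):
    """Weighted sum of basis activations for a new input x."""
    acts = [gaussian(x, c, width) for c in centers]
    return [sum(a * w[j] for a, w in zip(acts, weights))
            for j in range(len(weights[0]))]
```

With inputs standing in for a pitch-related feature and targets for envelope parameters, predict returns the training targets at the training inputs and smoothly interpolated values in between; the width parameter controls how far each training pair influences its neighbours.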
Alias-free, Multiresolution Sinusoidal Modeling for Polyphonic, Wideband Audio
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 1997
Cited by 10 (1 self)
In this paper, we describe an improved method of generating more accurate sinusoidal parameters {amplitude, frequency, phase} from a wideband polyphonic audio source in a multiresolution, nonaliased fashion. This significantly improves upon previous work on sinusoidal modeling that assumes a single-pitched monophonic source, such as speech or an individual musical instrument. In addition to a more general analysis, we can now perform high-quality transformations such as time-stretching and pitch-shifting on polyphonic audio with ease.
1. Introduction
Sinusoidal modeling has been developed as a flexible, parametric method of representing speech [1] and musical instruments [2]. These methods assume that most speech and audio signals can be well represented by many time-varying sinusoids. The trick is estimating these time-varying sinusoids in an efficient and perceptually meaningful manner. We solve this problem of parameter estimation by first splitting the input signal into octave-s...
Musical instrument classification and duet analysis employing music information retrieval techniques
2004
Cited by 10 (0 self)
The aim of this paper is to present solutions related to identifying musical data. These are discussed mainly on the basis of experiments carried out at the Multimedia Systems Department,
Waveform Preserving Time Stretching and Pitch Shifting for Sinusoidal Models of Sound
In Proceedings of the COST-G6 Digital Audio Effects Workshop, 1998
Cited by 9 (1 self)
A method for performing waveform-invariant time stretching and pitch shifting on a quasi-harmonic and sinusoidally modeled sound is presented. The method is based on the relative phase delay representation of the phase, defined as the difference between the phase delay of the partials and the phase delay of the fundamental. This representation makes the waveform characterization independent from the phase of the first partial. It is therefore possible to compute a smooth trajectory for the phase of the modified fundamental and rebuild the waveform on the synthesis frame boundaries by adding the relative phase delays to the new fundamental phase delay.
1 Introduction
Traditional time stretching techniques which make use of time-frequency models often neglect phase, assuming that the ear is sensitive only to the frequency and amplitude of the partials. When the signal is not perfectly periodic, this assumption results in a waveform dispersion which gives the sound a 'phasy' or reverber...
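The relative phase delay representation defined in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: partials are taken to be exactly harmonic (partial k at k times f0), the fundamental frequency is left unchanged (the time-stretching case), phase delay is written as phase divided by angular frequency (the sign convention is assumed), and the function names are hypothetical.

```python
import math

def relative_phase_delays(phases, f0):
    """phases[k-1] is the phase (radians) of harmonic partial k at a frame
    boundary. Returns each partial's phase delay minus that of the
    fundamental, in seconds."""
    w0 = 2.0 * math.pi * f0
    fund_delay = phases[0] / w0
    return [phases[i] / ((i + 1) * w0) - fund_delay
            for i in range(len(phases))]

def rebuild_phases(rel_delays, new_fund_phase, f0):
    """Add the stored relative delays to the new fundamental phase delay
    and convert back to per-partial phases, wrapped to [0, 2*pi)."""
    w0 = 2.0 * math.pi * f0
    new_fund_delay = new_fund_phase / w0
    return [((i + 1) * w0 * (d + new_fund_delay)) % (2.0 * math.pi)
            for i, d in enumerate(rel_delays)]
```

Under these assumptions, advancing the fundamental phase by some amount delta advances partial k by k times delta, which is a pure time shift of the waveform; its shape at the synthesis frame boundary is therefore preserved.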
Spectral Approach to the Modeling of the Singing Voice
Proceedings of 111th AES Convention, 2001
Cited by 7 (5 self)
This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request

