Results 1 - 8 of 8
Sound Source Separation in Monaural Music Signals
, 2006
Cited by 11 (3 self)
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods where the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning, and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is
Low Bitrate Object Coding of Musical Audio Using Bayesian Harmonic Models
, 2006
Cited by 7 (2 self)
This article deals with the decomposition of music signals into pitched sound objects made of harmonic sinusoidal partials for very low bitrate coding purposes. After a brief review of existing methods, we recast this problem in the Bayesian framework. We propose a family of probabilistic signal models combining learnt object priors and various perceptually motivated distortion measures. We design efficient algorithms to infer object parameters and build a coder based on the interpolation of frequency and amplitude parameters. Listening tests suggest that the loudness-based distortion measure outperforms other distortion measures and that our coder results in a better sound quality than baseline transform and parametric coders at 8 kbit/s and 2 kbit/s. This work constitutes a new step towards a fully object-based coding system, which would represent audio signals as collections of meaningful note-like sound objects.
Joint Detection and Tracking of Time-Varying Harmonic Components: a Flexible Bayesian Approach
- in IEEE Transactions on Speech, Audio and Language Processing
, 2006
Cited by 5 (0 self)
This paper addresses the joint estimation and detection of time-varying harmonic components in audio signals. We follow a flexible viewpoint, where several frequency/amplitude trajectories are tracked in the spectrogram using particle filtering. The core idea is that each harmonic component (composed of a fundamental partial together with several overtone partials) is considered a target. Tracking requires defining a state-space model with state transition and measurement equations. Particle filtering algorithms rely on a so-called sequential importance distribution, and we show that it can be built on previous multipitch estimation algorithms, so as to yield an even more efficient estimation procedure with established convergence properties. Moreover, as our model captures all the harmonic model information, it actually separates the harmonic sources. Simulations on synthetic and real music data show the interest of our approach.
Single-channel mixture decomposition using Bayesian harmonic models
- in Proc. ICA
, 2006
Cited by 3 (0 self)
We consider the source separation problem for single-channel music signals. After a brief review of existing methods, we focus on decomposing a mixture into components made of harmonic sinusoidal partials. We address this problem in the Bayesian framework by building a probabilistic model of the mixture combining generic priors for harmonicity, spectral envelope, note duration and continuity. Experiments suggest that the derived blind decomposition method leads to better separation results than nonnegative matrix factorization for certain mixtures.
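As a rough illustration of what a "component made of harmonic sinusoidal partials" means, the sketch below fits a single harmonic component to a synthetic frame by ordinary least squares and picks the fundamental frequency from a grid of candidates. This is not the Bayesian method of the paper: it uses no priors, handles one component only, and every name in it (`harmonic_basis`, `fit_harmonic`, `estimate_f0`, the sample rate, the candidate grid) is a hypothetical choice made for the example.

```python
import numpy as np

def harmonic_basis(f0, n_partials, t):
    """Cosine and sine columns for partials h * f0, h = 1..n_partials."""
    h = np.arange(1, n_partials + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, h)
    return np.hstack([np.cos(phase), np.sin(phase)])

def fit_harmonic(x, f0, n_partials, t):
    """Least-squares fit of one harmonic component.

    Returns the fitted component and the residual energy.
    """
    B = harmonic_basis(f0, n_partials, t)
    coef, *_ = np.linalg.lstsq(B, x, rcond=None)
    model = B @ coef
    return model, float(np.sum((x - model) ** 2))

def estimate_f0(x, candidates, n_partials, t):
    """Pick the candidate f0 whose harmonic fit leaves the least residual energy."""
    errors = [fit_harmonic(x, f0, n_partials, t)[1] for f0 in candidates]
    return float(candidates[int(np.argmin(errors))])

# Synthetic, noise-free frame: three partials of a 220 Hz note (assumed values).
sr = 8000
t = np.arange(512) / sr
amps = [1.0, 0.5, 0.25]
x = sum(a * np.cos(2.0 * np.pi * (h + 1) * 220.0 * t + 0.3 * h)
        for h, a in enumerate(amps))

# The grid deliberately excludes subharmonics such as 110 Hz.
candidates = np.arange(150.0, 400.0, 1.0)
f0_hat = estimate_f0(x, candidates, n_partials=3, t=t)
component, residual_energy = fit_harmonic(x, f0_hat, 3, t)
```

A plain residual-energy criterion of this kind is prone to octave errors once the candidate grid includes subharmonics, which is one reason to add priors on harmonicity and spectral envelope as the abstract describes.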
Predominant-f0 estimation using Bayesian harmonic waveform models
- In MIREX Audio Melody Extraction Contest Abstracts
, 2005
Cited by 3 (1 self)
This paper describes a predominant pitch extraction system based on a family of Bayesian harmonic models. These models represent the short-term waveform of the observed signal as a sum of harmonic partials and residual noise. The amplitudes of the partials are modelled by a prior learnt on a training set, whereas the residual is modelled by a psychoacoustically motivated prior. Efficient algorithms are provided to estimate the best fundamental frequency according to the MAP criterion. The performance of the method is evaluated in the framework of
Fast Factorization-Based Inference for Bayesian Harmonic Models
Cited by 1 (1 self)
Harmonic sinusoidal models are a fundamental tool for audio signal analysis. Bayesian harmonic models guarantee a good resynthesis quality and allow joint use of learnt parameter priors and auditorily motivated distortion measures. However, inference algorithms based on Monte Carlo sampling are rather slow for realistic data. In this paper, we investigate fast inference algorithms based on approximate factorization of the joint posterior into a product of independent distributions on small subsets of parameters. We discuss the conditions under which these approximations hold true and evaluate their performance experimentally. We suggest how they could be used together with Monte Carlo algorithms for a faster sampling-based inference.
Estimating Parameters from Audio for an EG+LFO Model of Pitch Envelopes
Envelope generator (EG) and Low Frequency Oscillator (LFO) parameters give a compact representation of audio pitch envelopes. By estimating these parameters from audio per-note, they could be used as part of an audio coding scheme. Recordings of various instruments and articulations were examined, and pitch envelopes found. Using an evolutionary algorithm, EG and LFO parameters for the envelopes were estimated. The resulting estimated envelopes are compared to both the original envelope and a fixed-pitch estimate. Envelopes estimated using EG+LFO can closely represent the envelope from the original audio and provide a more accurate estimate than the mean pitch.
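To make the EG+LFO idea concrete, here is a minimal sketch under stated assumptions. The abstract does not give the paper's parametrization, so the envelope below (a constant base pitch, one exponentially decaying glide as the EG, one sinusoid as the LFO) and the simple elitist evolution strategy used to fit it are hypothetical stand-ins, as are all function names, step sizes, and the synthetic target values in Hz.

```python
import numpy as np

def eg_lfo_envelope(t, base, eg_depth, eg_time, lfo_depth, lfo_rate):
    """Pitch envelope: base pitch + exponential EG glide + sinusoidal LFO."""
    eg = eg_depth * np.exp(-t / eg_time)
    lfo = lfo_depth * np.sin(2.0 * np.pi * lfo_rate * t)
    return base + eg + lfo

def rmse(a, b):
    """Root-mean-square error between two envelopes."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def fit_es(t, target, generations=300, offspring=20, seed=0):
    """Elitist evolution strategy: Gaussian mutations, keep only improvements."""
    rng = np.random.default_rng(seed)
    # Parameter order: base, eg_depth, eg_time, lfo_depth, lfo_rate.
    # The start point is the fixed-pitch (mean) estimate with no EG or LFO.
    best = np.array([target.mean(), 0.0, 0.1, 0.0, 5.0])
    step = np.array([1.0, 2.0, 0.01, 0.5, 0.2])  # assumed mutation scales
    best_err = rmse(eg_lfo_envelope(t, *best), target)
    for _gen in range(generations):
        for _child in range(offspring):
            cand = best + step * rng.standard_normal(5)
            cand[2] = max(cand[2], 1e-3)  # keep the EG time constant positive
            err = rmse(eg_lfo_envelope(t, *cand), target)
            if err < best_err:
                best, best_err = cand, err
    return best, best_err

# Synthetic one-second "note": a scoop up to 440 Hz with vibrato (assumed values).
t = np.linspace(0.0, 1.0, 200)
target = eg_lfo_envelope(t, base=440.0, eg_depth=-20.0, eg_time=0.05,
                         lfo_depth=3.0, lfo_rate=5.5)

params, fit_err = fit_es(t, target)
fixed_err = rmse(np.full_like(target, target.mean()), target)
```

Because the search starts from the mean-pitch estimate and only accepts improvements, `fit_err` cannot exceed `fixed_err`; how close the fit gets to the original envelope depends on the mutation scales and the number of generations.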
Harmonic Sinusoidal+Noise Modeling of Audio based on Multiple F0 Estimation
The papers at this Convention have been selected on the basis of a submitted abstract and extended precis that have been peer reviewed by at least two qualified anonymous reviewers. This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration by the Review Board. The AES takes no responsibility for the contents.

