Results 1-10 of 22
Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
IEEE Trans. on Audio, Speech and Lang. Processing, 2007
Cited by 167 (29 self)
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by using a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternately updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method enables a better separation quality than the previous algorithms. In particular, the temporal continuity criterion improved the detection of pitched musical sounds. The sparseness criterion did not produce significant improvements. Index Terms: acoustic signal analysis, audio source separation, blind source separation, music, nonnegative matrix factorization, sparse coding, unsupervised learning.
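The factorization described in this abstract can be illustrated with a minimal sketch. It is not the paper's exact algorithm: it assumes a squared-error reconstruction cost, an L1 penalty on the gains for sparseness, and an unnormalized sum of squared differences between adjacent frames for temporal continuity. The multiplicative gain update is obtained by splitting the gradient of each term into its positive and negative parts. The function name and the weights alpha and beta are hypothetical.

```python
import numpy as np

def nmf_continuity_sparse(V, n_components, alpha=0.01, beta=0.01,
                          n_iter=200, seed=0):
    """Approximate V (freq x time) by B @ G with penalties on the gains G.

    Assumed cost (illustrative, not taken from the paper):
        0.5 * ||V - B G||^2 + alpha * sum_t (G[:, t] - G[:, t-1])^2
        + beta * sum(G)
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    B = rng.random((n_freq, n_components)) + eps    # component spectra
    G = rng.random((n_components, n_frames)) + eps  # time-varying gains

    # Number of temporal neighbours per frame: 1 at the edges, 2 inside.
    n_neighbours = np.full(n_frames, 2.0)
    n_neighbours[0] -= 1.0
    n_neighbours[-1] -= 1.0

    for _ in range(n_iter):
        # Spectra: standard multiplicative update for the squared error.
        B *= (V @ G.T) / (B @ (G @ G.T) + eps)
        # Remove the scale ambiguity: unit-norm spectra, gains rescaled.
        norms = np.linalg.norm(B, axis=0) + eps
        B /= norms
        G *= norms[:, None]

        # Gains: negative gradient parts in the numerator, positive parts
        # in the denominator, so G stays nonnegative.
        neighbours = np.zeros_like(G)
        neighbours[:, 1:] += G[:, :-1]   # previous frame
        neighbours[:, :-1] += G[:, 1:]   # next frame
        numer = B.T @ V + 2.0 * alpha * neighbours
        denom = B.T @ (B @ G) + 2.0 * alpha * n_neighbours * G + beta + eps
        G *= numer / denom
    return B, G
```

Setting alpha and beta to zero reduces the sketch to basic nonnegative matrix factorization, the baseline the abstract compares against.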
Sound Source Separation in Monaural Music Signals
2006
Cited by 32 (4 self)
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods where the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is ...
Nonnegative tensor factorisation for sound source separation
In: Proceedings of Irish Signals and Systems Conference, 2005
Cited by 23 (1 self)
... is introduced which extends current matrix factorisation techniques to deal with tensors. The effectiveness of the algorithm is then demonstrated through tests on synthetic data. The algorithm is then employed as a means of performing sound source separation on two-channel mixtures, and its separation capabilities are demonstrated on a two-channel mixture containing saxophone, strings and bass guitar.
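As a rough illustration of extending matrix factorisation to tensors, the sketch below factorises a three-way nonnegative array (for example channel x frequency x time) into a sum of rank-one terms using multiplicative updates. It assumes a squared-error cost and a PARAFAC-style model; the paper's actual cost function and update rules are not given in this abstract, and the function names are hypothetical.

```python
import numpy as np

def ntf(X, rank, n_iter=300, seed=0):
    """Nonnegative factorisation X[i, j, k] ~ sum_r A[i, r] B[j, r] C[k, r].

    Illustrative only: squared-error multiplicative updates, one factor
    matrix at a time, with the other two held fixed.
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    n_i, n_j, n_k = X.shape
    A = rng.random((n_i, rank)) + eps  # e.g. per-channel gains
    B = rng.random((n_j, rank)) + eps  # e.g. frequency basis functions
    C = rng.random((n_k, rank)) + eps  # e.g. time activations

    for _ in range(n_iter):
        A *= np.einsum('ijk,jr,kr->ir', X, B, C) / (
            A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= np.einsum('ijk,ir,kr->jr', X, A, C) / (
            B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= np.einsum('ijk,ir,jr->kr', X, A, B) / (
            C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C

def ntf_reconstruct(A, B, C):
    """Rebuild the model tensor from the three factor matrices."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

With two-channel spectrograms stacked along the first axis, each column of A would hold the gain of one component in each channel, which is one way such a model can be read.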
Sound Source Separation using Shifted Nonnegative Tensor Factorisation
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2006
Cited by 15 (0 self)
Recently, shifted Nonnegative Matrix Factorisation was developed as a means of separating harmonic instruments from single-channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted Nonnegative Tensor Factorisation algorithm is derived, which extends shifted Nonnegative Matrix Factorisation to the multichannel case. The use of this algorithm for multichannel sound source separation of harmonic instruments is demonstrated. Further, it is shown that the algorithm can be used to perform Nonnegative Tensor Deconvolution, a multichannel version of Nonnegative Matrix Deconvolution, to separate sound sources which have time-evolving spectra from multichannel signals.
Generalised prior subspace analysis for polyphonic pitch transcription
In Proc. Int. Conf. on Digital Audio Effects (DAFx), 2005
Cited by 10 (2 self)
A reformulation of Prior Subspace Analysis (PSA) is presented, which restates the problem as that of fitting an undercomplete signal dictionary to a spectrogram. Further, a generalisation of PSA is derived which allows the transcription of polyphonic pitched instruments. This involves the translation of a single frequency prior subspace of a note to approximate other notes, overcoming the problem of needing a separate basis function for each note played by an instrument. Examples then show the utility of the generalised PSA algorithm for polyphonic pitch transcription.
Monaural Sound Source Separation by Perceptually Weighted Non-Negative Matrix Factorization
Cited by 8 (0 self)
A data-adaptive algorithm for the separation of sound sources from one-channel signals is presented. The algorithm applies weighted nonnegative matrix factorization to the power spectrogram of the input signal. Perceptually motivated weights for each critical band in each frame are used to model the loudness perception of the human auditory system. The method compresses high-energy components, and enables the estimation of perceptually significant low-energy characteristics of sources. The power spectrogram is factorized into a sum of components which have a fixed magnitude spectrum with a time-varying gain. Each source consists of one or more components. The parameters of the components are estimated by minimizing the weighted divergence between the observed power spectrogram and the model, for which a weighted nonnegative matrix factorization algorithm is proposed. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and percussive sounds. The performance of the proposed method was compared with other separation algorithms which are based on the same signal model. These include, for example, independent subspace analysis and sparse coding. According to the simulations, the proposed method enables perceptually better separation quality than the existing algorithms. Demonstration signals are available at
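A weighted factorization of this kind can be sketched as follows, assuming the weighted divergence is the generalized Kullback-Leibler divergence with one nonnegative weight per spectrogram entry. The abstract does not give the paper's update rules or how its perceptual weights are computed, so the weights array here is only a placeholder input and the function names are hypothetical.

```python
import numpy as np

def weighted_kl(V, model, weights, eps=1e-12):
    """Weighted generalized Kullback-Leibler divergence D(V || model)."""
    terms = V * np.log((V + eps) / (model + eps)) - V + model
    return float(np.sum(weights * terms))

def weighted_nmf(V, weights, n_components, n_iter=100, seed=0):
    """Approximate V (freq x time) by B @ G under a weighted divergence.

    weights has the same shape as V; in a perceptual setting each entry
    would follow from the critical band and frame it belongs to (not
    modelled here).
    """
    eps = 1e-12
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    B = rng.random((n_freq, n_components)) + eps    # component spectra
    G = rng.random((n_components, n_frames)) + eps  # time-varying gains

    for _ in range(n_iter):
        # Weighted ratio of observation to model, recomputed per update.
        ratio = weights * V / (B @ G + eps)
        B *= (ratio @ G.T) / (weights @ G.T + eps)
        ratio = weights * V / (B @ G + eps)
        G *= (B.T @ ratio) / (B.T @ weights + eps)
    return B, G
```

With all weights equal to one, the updates reduce to the usual unweighted divergence-based NMF rules.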
Acoustic Modelling of Drum Sounds with Hidden Markov Models for Music Transcription
Cited by 4 (1 self)
This paper describes two methods for applying hidden Markov models (HMMs) to acoustic modelling of drum sound events for polyphonic music transcription. The proposed methods are instrument-wise binary modelling and modelling of instrument combinations. In the first, each target instrument is modelled with a “sound” model and all target instruments share a “silence” model. Each instrument is transcribed independently of the others. In the latter method, different instrument combinations are modelled, and an additional “silence” model is created. The proposed methods are evaluated in simulations on acoustic data, and compared with two reference methods. Simulations show that combination modelling performs better than instrument-wise modelling.
Shifted 2D Nonnegative Tensor Factorisation
Cited by 3 (0 self)
... developed as a means of separating harmonic instruments from single-channel mixtures. This technique uses a model which is convolutive in both time and frequency, and so can capture instruments which have both time-varying spectra and time-varying fundamental frequencies simultaneously. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted 2D Nonnegative Tensor Factorisation algorithm is derived, which extends Nonnegative Matrix Factor 2D Deconvolution to the multichannel case. The use of this algorithm for multichannel sound source separation of pitched instruments is demonstrated.
Some Case Studies in Automatic Descriptor Extraction
Cited by 3 (3 self)
This work aims to evaluate the effectiveness of EDS as a tool to automatically extract descriptors for real-world problems, such as melody extraction, chord recognition, and sound classification, comparing its performance and development time to traditional approaches. Each of these problems constitutes a case study, and along with the comparative results we present some remarks about the descriptor extraction procedure.