Results 1–10 of 30
Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria
IEEE Trans. on Audio, Speech, and Language Processing, 2007
"... Abstract—An unsupervised learning algorithm for the separation of sound sources in onechannel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a timevarying gain ..."
Abstract

Cited by 189 (30 self)
Abstract—An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more components. The parameters of the components are estimated by minimizing the reconstruction error between the input spectrogram and the model, while restricting the component spectrograms to be nonnegative and favoring components whose gains are slowly varying and sparse. Temporal continuity is favored by a cost term which is the sum of squared differences between the gains in adjacent frames, and sparseness is favored by penalizing nonzero gains. The proposed iterative estimation algorithm is initialized with random values, and the gains and the spectra are then alternately updated using multiplicative update rules until the values converge. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and drum sounds. The performance of the proposed method was compared with independent subspace analysis and basic nonnegative matrix factorization, which are based on the same linear model. According to these simulations, the proposed method achieves better separation quality than the previous algorithms. In particular, the temporal continuity criterion improved the detection of pitched musical sounds; the sparseness criterion did not produce significant improvements. Index Terms—Acoustic signal analysis, audio source separation, blind source separation, music, nonnegative matrix factorization, sparse coding, unsupervised learning.
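The factorization this abstract describes (fixed nonnegative spectra, time-varying gains, multiplicative updates, penalties on gain roughness and on nonzero gains) can be sketched as follows. All names are illustrative, and the update rules below are a heuristic approximation, not the paper's exact derivation:

```python
import numpy as np

def nmf_temporal(X, n_components, n_iter=200, alpha=0.1, beta=0.0, seed=0):
    """Sketch of NMF on a magnitude spectrogram X (freqs x frames).

    Minimises squared reconstruction error plus alpha * temporal-continuity
    penalty (squared gain differences between adjacent frames) and beta *
    an L1 sparseness penalty on the gains."""
    rng = np.random.default_rng(seed)
    F, T = X.shape
    B = rng.random((F, n_components)) + 1e-3   # fixed magnitude spectra
    G = rng.random((n_components, T)) + 1e-3   # time-varying gains
    eps = 1e-9
    for _ in range(n_iter):
        # standard multiplicative least-squares update for the spectra
        B *= (X @ G.T) / (B @ G @ G.T + eps)
        # gain update: the continuity gradient splits into a positive part
        # (goes to the denominator) and a negative part (the neighbours)
        Gl = np.hstack([G[:, :1], G[:, :-1]])   # left neighbours
        Gr = np.hstack([G[:, 1:], G[:, -1:]])   # right neighbours
        num = B.T @ X + alpha * (Gl + Gr)
        den = B.T @ (B @ G) + 2 * alpha * G + beta + eps
        G *= num / den
    return B, G
```

With alpha = beta = 0 this reduces to plain least-squares NMF; raising alpha smooths the gain tracks across frames, which is what the abstract credits for the improved detection of pitched sounds.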
Sound Source Separation in Monaural Music Signals
, 2006
"... Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separat ..."
Abstract

Cited by 36 (4 self)
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods where the sources to be separated are not known beforehand. Instead, the separation is enabled by exploiting the common properties of real-world sound sources: their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is ...
Nonnegative tensor factorisation for sound source separation
In: Proceedings of the Irish Signals and Systems Conference, 2005
"... ... is introduced which extends current matrix factorisation techniques to deal with tensors. The effectiveness of the algorithm is then demonstrated through tests on synthetic data. The algorithm is then employed as a means of performing sound source separation on two channel mixtures, and the sepa ..."
Abstract

Cited by 28 (2 self)
... is introduced which extends current matrix factorisation techniques to deal with tensors. The effectiveness of the algorithm is first demonstrated through tests on synthetic data. The algorithm is then employed as a means of performing sound source separation on two-channel mixtures, and its separation capabilities are demonstrated on a two-channel mixture containing saxophone, strings, and bass guitar.
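Extending matrix factorisation to a channels x frequencies x frames tensor is commonly done with a nonnegative PARAFAC/CP-style model. A minimal sketch with multiplicative least-squares updates is given below; the names and the cost function are assumptions, and the paper's own algorithm may differ in its details:

```python
import numpy as np

def ntf_parafac(X, n_components, n_iter=200, seed=0):
    """Nonnegative CP factorisation of X with shape (channels, freqs, frames):
    X[c, f, t] ~= sum_k A[c, k] * B[f, k] * G[t, k]."""
    rng = np.random.default_rng(seed)
    C, F, T = X.shape
    A = rng.random((C, n_components)) + 1e-3   # per-channel gains
    B = rng.random((F, n_components)) + 1e-3   # spectra
    G = rng.random((T, n_components)) + 1e-3   # time-varying gains
    eps = 1e-9
    for _ in range(n_iter):
        # multiplicative update for each factor in turn, holding the others fixed
        A *= np.einsum('cft,fk,tk->ck', X, B, G) / (A @ ((B.T @ B) * (G.T @ G)) + eps)
        B *= np.einsum('cft,ck,tk->fk', X, A, G) / (B @ ((A.T @ A) * (G.T @ G)) + eps)
        G *= np.einsum('cft,ck,fk->tk', X, A, B) / (G @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, G
```

For a two-channel mixture, each component k carries a channel-gain pair A[:, k], which is what lets the tensor model exploit spatial information that a per-channel matrix factorisation would ignore.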
Sound Source Separation using Shifted Nonnegative Tensor Factorisation
In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2006
"... Recently, shifted Nonnegative Matrix Factorisation was developed as a means of separating harmonic instruments from single channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end ..."
Abstract

Cited by 15 (0 self)
Recently, shifted Nonnegative Matrix Factorisation was developed as a means of separating harmonic instruments from single-channel mixtures. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted Nonnegative Tensor Factorisation algorithm is derived, which extends shifted Nonnegative Matrix Factorisation to the multichannel case. The use of this algorithm for multichannel sound source separation of harmonic instruments is demonstrated. Further, it is shown that the algorithm can be used to perform Nonnegative Tensor Deconvolution, a multichannel version of Nonnegative Matrix Deconvolution, to separate sound sources which have time-evolving spectra from multichannel signals.
Monaural Sound Source Separation by Perceptually Weighted Non-Negative Matrix Factorization
"... Abstract — A dataadaptive algorithm for the separation of sound sources from onechannel signals is presented. The algorithm applies weighted nonnegative matrix factorization on the power spectrogram of the input signal. Perceptually motivated weights for each critical band in each frame are used ..."
Abstract

Cited by 10 (0 self)
Abstract—A data-adaptive algorithm for the separation of sound sources from one-channel signals is presented. The algorithm applies weighted nonnegative matrix factorization to the power spectrogram of the input signal. Perceptually motivated weights for each critical band in each frame are used to model the loudness perception of the human auditory system. The method compresses high-energy components and enables the estimation of perceptually significant low-energy characteristics of sources. The power spectrogram is factorized into a sum of components, each of which has a fixed magnitude spectrum with a time-varying gain. Each source consists of one or more components. The parameters of the components are estimated by minimizing the weighted divergence between the observed power spectrogram and the model, for which a weighted nonnegative matrix factorization algorithm is proposed. Simulation experiments were carried out using generated mixtures of pitched musical instrument samples and percussive sounds. The performance of the proposed method was compared with other separation algorithms based on the same signal model, including, for example, independent subspace analysis and sparse coding. According to the simulations, the proposed method yields perceptually better separation quality than the existing algorithms. Demonstration signals are available at
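Minimizing a weighted divergence with multiplicative updates follows the same pattern as ordinary KL-divergence NMF, with the weight matrix entering both numerator and denominator. The sketch below uses a generic per-entry weight matrix; the paper derives specific loudness-based weights per critical band and frame, which are not reproduced here:

```python
import numpy as np

def weighted_nmf_kl(X, W, n_components, n_iter=200, seed=0):
    """Sketch of NMF minimising a weighted KL-style divergence between a
    power spectrogram X (freqs x frames) and the model B @ G, with
    nonnegative per-entry weights W of the same shape as X."""
    rng = np.random.default_rng(seed)
    F, T = X.shape
    B = rng.random((F, n_components)) + 1e-3   # spectra
    G = rng.random((n_components, T)) + 1e-3   # gains
    eps = 1e-9
    for _ in range(n_iter):
        V = B @ G + eps
        # weighted KL multiplicative updates: W appears in both terms
        B *= ((W * X / V) @ G.T) / (W @ G.T + eps)
        V = B @ G + eps
        G *= (B.T @ (W * X / V)) / (B.T @ W + eps)
    return B, G
```

Setting W to all ones recovers standard KL-divergence NMF; larger weights in perceptually important bands force the model to fit those bands more closely, which is how low-energy but audible content survives the factorization.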
Generalised prior subspace analysis for polyphonic pitch transcription
In: Proc. Int. Conf. on Digital Audio Effects (DAFx), 2005
"... A reformulation of Prior Subspace Analysis (PSA) is presented, which restates the problem as that of fitting an undercomplete signal dictionary to a spectrogram. Further, a generalization of PSA is derived which allows the transcription of polyphonic pitched instruments. This involves the translatio ..."
Abstract

Cited by 10 (3 self)
A reformulation of Prior Subspace Analysis (PSA) is presented, which restates the problem as that of fitting an undercomplete signal dictionary to a spectrogram. Further, a generalisation of PSA is derived which allows the transcription of polyphonic pitched instruments. This involves translating a single frequency prior subspace of a note to approximate other notes, overcoming the problem of needing a separate basis function for each note played by an instrument. Examples are then presented which show the utility of the generalised PSA algorithm for the purposes of polyphonic pitch transcription.
Separation of Musical Sources and Structure from Single-Channel Polyphonic Recordings University of
, 2006
"... The thesis deals principally with the separation of pitched sources from singlechannel polyphonic musical recordings. The aim is to extract from a mixture a set of pitched instruments or sources, where each source contains a set of similarly sounding events or notes, and each note is seen as compri ..."
Abstract

Cited by 8 (0 self)
The thesis deals principally with the separation of pitched sources from single-channel polyphonic musical recordings. The aim is to extract from a mixture a set of pitched instruments or sources, where each source contains a set of similarly sounding events or notes, and each note is seen as comprising partial, transient, and noise content. The work also has implications for separating non-pitched or percussive sounds from recordings and, in general, for unsupervised clustering of a list of detected audio events in a recording into a meaningful set of source classes. The alignment of a symbolic score/MIDI representation with the recording constitutes a preprocessing stage. The three main areas of contribution are: firstly, the design of harmonic tracking algorithms and spectral-filtering techniques for removing harmonics from the mixture, where particular attention has been paid to the case of harmonics which overlap in frequency. Secondly, studies are presented for separating transient attacks from recordings, both when they are distinguishable from other transients and when they overlap them in time; this section also includes a method which proposes that the behaviours of the harmonic and noise components of a note are partially correlated, which is used to share the noise component of a mixture of pitched notes between the interfering sources. Thirdly, unsupervised clustering has been applied to the task of grouping a set of separated notes from the recording into sources, where notes belonging to the same source ideally have similar features or attributes. Issues relating to feature computation, feature selection, dimensionality, and dependence on a symbolic music representation are explored. Applications of this work exist in audio spatialisation, audio restoration, music content description, effects processing, and elsewhere.
Shifted 2D Nonnegative Tensor Factorisation
"... ... developed as a means of separating harmonic instruments from single channel mixtures. This technique uses a model which is convolutive in both time and frequency, and so can capture instruments which have both timevarying spectra and timevarying fundamental frequencies simultaneously. However, ..."
Abstract

Cited by 5 (2 self)
... developed as a means of separating harmonic instruments from single-channel mixtures. This technique uses a model which is convolutive in both time and frequency, and so can capture instruments which have both time-varying spectra and time-varying fundamental frequencies simultaneously. However, in many cases two or more channels are available, in which case it would be advantageous to have a multichannel version of the algorithm. To this end, a shifted 2D Nonnegative Tensor Factorisation algorithm is derived, which extends Nonnegative Matrix Factor 2D Deconvolution to the multichannel case. The use of this algorithm for multichannel sound source separation of pitched instruments is demonstrated.
Acoustic Modelling of Drum Sounds with Hidden Markov Models for Music Transcription
"... This paper describes two methods for applying hidden Markov models (HMMs) to acoustic modelling of drum sound events for polyphonic music transcription. The proposed methods are instrumentwise binary modelling and modelling of instrument combinations. In the first, each target instrument is modelled ..."
Abstract

Cited by 5 (1 self)
This paper describes two methods for applying hidden Markov models (HMMs) to acoustic modelling of drum sound events for polyphonic music transcription. The proposed methods are instrument-wise binary modelling and modelling of instrument combinations. In the first, each target instrument is modelled with a “sound” model and all target instruments share a “silence” model; each instrument is then transcribed independently of the others. In the second method, different instrument combinations are modelled, and an additional “silence” model is created. The proposed methods are evaluated in simulations with acoustic data and compared with two reference methods. The simulations show that combination modelling performs better than instrument-wise modelling.
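Instrument-wise binary modelling amounts to decoding, per instrument, a small HMM whose states are roughly “silence” and “sound”. A generic Viterbi decoder for such a model is sketched below; the two-state topology, the transition probabilities, and the frame log-likelihoods in the usage note are illustrative assumptions, not the paper's actual acoustic models:

```python
import numpy as np

def viterbi(log_pi, log_trans, log_obs):
    """Most-likely state path through an HMM.

    log_pi[s]       : log initial probability of state s
    log_trans[i, j] : log probability of moving from state i to state j
    log_obs[t, s]   : log-likelihood of frame t under state s
    """
    T, S = log_obs.shape
    delta = log_pi + log_obs[0]            # best score ending in each state
    back = np.zeros((T, S), dtype=int)     # backpointers
    for t in range(1, T):
        scores = delta[:, None] + log_trans        # (from_state, to_state)
        back[t] = np.argmax(scores, axis=0)
        delta = scores[back[t], np.arange(S)] + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 2, -1, -1):         # trace backpointers
        path[t] = back[t + 1, path[t + 1]]
    return path
```

For each target instrument one would run this decoder with a two-state transition matrix (state 0 = shared “silence” model, state 1 = that instrument's “sound” model) and that instrument's per-frame log-likelihoods; frames decoded as state 1 yield the instrument's onsets, independently of the other instruments.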