## Monaural sound source separation by nonnegative matrix factorization with temporal continuity and sparseness criteria (2007)

Venue: IEEE Trans. on Audio, Speech and Lang. Processing

Citations: 187 (30 self)

### Citations

2308 | Independent Component Analysis
- Hyvärinen, Karhunen, et al.
- 2001
Citation Context ...es an observation vector by finding an unmixing matrix , so that the estimated variables, i.e., the elements of vector are statistically independent from each other. The convolutive extension of ICA (=-=[15]-=-, pp. 361–370) suits well for multichannel sound source separation, where the elements of the observation vector are the signals recorded with different microphones. In ICA the number of sources has t... |

1242 | Algorithms for non-negative matrix factorization
- Lee, Seung
- 2001
Citation Context ...s to be entry-wise nonnegative. Moreover, the components can be restricted to be purely additive, meaning that the gains are restricted to be nonnegative. The NMF algorithms proposed by Lee and Seung =-=[21]-=- do the decomposition by minimizing the reconstruction error between the observation matrix and the model while constraining the matrices to be entry-wise nonnegative. The algorithms have been used in... |
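The decomposition described in this context can be illustrated with a minimal sketch of multiplicative updates for the squared reconstruction error. This is an illustration only, not the code of the cited or citing paper: the names `basis`, `gains`, `nmf`, and `squared_error`, the random initialization, and the iteration count are all assumptions chosen for the example, and plain Python lists are used so that no external library is needed.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in b_cols] for row in a]


def transpose(a):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def squared_error(x, basis, gains):
    """Sum of squared differences between x and the model basis * gains."""
    model = matmul(basis, gains)
    return sum((xv - mv) ** 2 for xr, mr in zip(x, model) for xv, mv in zip(xr, mr))


def scale(m, num, den, eps):
    """Entry-wise multiplicative update: m * num / (den + eps)."""
    return [[mv * nv / (dv + eps) for mv, nv, dv in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, num, den)]


def nmf(x, n_components, n_iter=200, eps=1e-12, seed=0):
    """Factor a nonnegative matrix x into nonnegative basis and gains."""
    rng = random.Random(seed)
    # Strictly positive random initialization (an assumption of this sketch).
    basis = [[rng.random() + 0.1 for _ in range(n_components)] for _ in x]
    gains = [[rng.random() + 0.1 for _ in x[0]] for _ in range(n_components)]
    for _ in range(n_iter):
        # gains <- gains * (B^T X) / (B^T B G)
        bt = transpose(basis)
        gains = scale(gains, matmul(bt, x), matmul(matmul(bt, basis), gains), eps)
        # basis <- basis * (X G^T) / (B G G^T)
        gt = transpose(gains)
        basis = scale(basis, matmul(x, gt), matmul(basis, matmul(gains, gt)), eps)
    return basis, gains
```

Because each update multiplies the current entries by a ratio of nonnegative quantities, both matrices stay entry-wise nonnegative without any explicit projection step.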

1180 | Nonlinear programming
- Bertsekas
- 1999
Citation Context ...y Hoyer [24] combines NMF and sparse coding. He estimated the matrices by combining the multiplicative update rule proposed by Lee and Seung [21] with projected gradient descent [discussed, e.g., in (=-=[25]-=-, pp. 203–224)]. The algorithm was used with a temporal continuity criterion for sound separation by Virtanen [10]. No large-scale evaluation has been carried out to investigate whether the use of a s... |

955 | Sparse coding with an overcomplete basis set: A strategy employed by V1?
- Olshausen, Field
- 1997
Citation Context ...better approximation of human auditory perception. 3) Sparse Coding: An unsupervised learning technique called sparse coding has been successfully used for example to model the early stages of vision =-=[22]-=-. The term sparse refers to a signal model, where the data is represented in terms of a small number of active elements chosen out of a larger set. In the signal model (1), this means that the probabi... |

884 | Fast and robust fixed-point algorithms for independent component analysis
- Hyvärinen
- 1999

496 | Non-negative matrix factorization with sparseness constraints
- Hoyer
Citation Context ...y Abdallah and Plumbley [9], [11], Benaroya, McDonagh, Bimbot, and Gribonval [23], and Blumensath and Davies [12], to mention a few examples. The nonnegative sparse coding algorithm proposed by Hoyer =-=[24]-=- combines NMF and sparse coding. He estimated the matrices by combining the multiplicative update rule proposed by Lee and Seung [21] with projected gradient descent [discussed, e.g., in ([25], pp. 20... |

274 | Performance measurement in blind audio source separation
- Vincent, Gribonval, et al.
- 2006
Citation Context ...t 9.6 for an ISA algorithm, in which the time-domain signals of the sources were trained before mixing [4]. The term Source to Distortion Ratio has also been used to refer to this performance measure =-=[38]-=-. The SNR (in decibels) was averaged over all the sources and mixtures to get the total measure of the separation performance. If no components were assigned to a source, the source was defined to be ... |
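The averaging step mentioned in this context can be sketched as follows. The excerpt does not give the exact definition of the measure, so the formula below (signal energy over error energy, in decibels) is an assumed, generic SNR definition, and the function names `snr_db` and `average_snr_db` are hypothetical.

```python
import math


def snr_db(reference, estimate):
    """Generic SNR in dB between a reference and its estimate (assumed definition)."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal / error)


def average_snr_db(pairs):
    """Average the per-source SNR (in dB) over (reference, estimate) pairs."""
    values = [snr_db(ref, est) for ref, est in pairs]
    return sum(values) / len(values)
```

Note that the average is taken over the decibel values, not over the energy ratios, which matches the wording "the SNR (in decibels) was averaged".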

240 | Non-negative matrix factorization for polyphonic music transcription
- Smaragdis, Brown
- 2003
Citation Context ...n where the sources are statistically independent or nonredundant. Algorithms have been proposed that are based on independent component analysis (ICA) [5]–[7], nonnegative matrix factorization (NMF) =-=[8]-=-, and sparse coding [9]–[11]. This paper proposes an unsupervised sound source separation algorithm which combines NMF with temporal continuity and sparseness objectives. The proposed algorithm is sho... |

201 | Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces
- Hyvärinen, Hoyer
- 2000
Citation Context ...frequency line as a phase-invariant feature calculated in each frame. The factorization of the spectrogram can be seen as separation of phase-independent features into (2) invariant feature subspaces =-=[16]-=-. Letting the magnitude or power spectrum in frame to be the observation, the separation can be done using basic ICA, as explained above. With this procedure the estimated gains of different component... |

172 | Signal estimation from modified short-time Fourier transform
- Griffin, Lim
- 1984
Citation Context ...nt within frames is calculated as . To get complex spectra, the phases of the original spectrogram can be used for the separated components, or the phase generation method proposed by Griffin and Lim =-=[31]-=- with the improvements proposed by Slaney, Naar, and Lyon [32] can be used. In most cases where the separation is successful, the use of the original phases produces good results. It also allows the s... |

148 | One microphone source separation
- Roweis
- 2000
Citation Context ...get signals has to be restricted. The most successful algorithms have been those which try to extract only the most prominent source [1], [2], or which utilize prior information of the source signals =-=[3]-=-, [4]. Recently, unsupervised machine learning algorithms have been successfully used in one-channel source separation. These are typically based on a simple linear model, and instead of using prior k... |

127 | Monaural speech segregation based on pitch tracking and amplitude modulation
- Hu, Wang
- 2004
Citation Context ...ited separation quality, and usually the complexity of the target signals has to be restricted. The most successful algorithms have been those which try to extract only the most prominent source [1], =-=[2]-=-, or which utilize prior information of the source signals [3], [4]. Recently, unsupervised machine learning algorithms have been successfully used in one-channel source separation. These are typicall... |

123 | Separation of mixed audio sources by independent subspace analysis. Mitsubishi Electric Research Laboratory
- Casey
- 2001
Citation Context ...the separation is done by finding a decomposition where the sources are statistically independent or nonredundant. Algorithms have been proposed that are based on independent component analysis (ICA) =-=[5]-=-–[7], nonnegative matrix factorization (NMF) [8], and sparse coding [9]–[11]. This paper proposes an unsupervised sound source separation algorithm which combines NMF with temporal continuity and spar... |

58 | Sound source separation using sparse coding with temporal continuity objective
- Virtanen
Citation Context ... proposed by Lee and Seung [21] with projected gradient descent [discussed, e.g., in ([25], pp. 203–224)]. The algorithm was used with a temporal continuity criterion for sound separation by Virtanen =-=[10]-=-. No large-scale evaluation has been carried out to investigate whether the use of a sparse prior increases the separation quality in the case of audio signals. The spectra which typically constitute ... |

56 | Separation of drums from polyphonic music using nonnegative matrix factorization and support vector machine
- Helén, Virtanen
- 2005
Citation Context ...lustering methods have been proposed [5], [6], but in our simulations their performance was not sufficient. Supervised clustering based on pattern recognition techniques produces better results [27], =-=[28]-=-, but these require that the sources are known and their models trained beforehand. In this paper, we do not consider the clustering problem but circumvent this step by using the original signals as a... |

53 | Polyphonic transcription by nonnegative sparse coding of power spectra
- Abdallah, Plumbley
- 2004
Citation Context ...tistically independent or nonredundant. Algorithms have been proposed that are based on independent component analysis (ICA) [5]–[7], nonnegative matrix factorization (NMF) [8], and sparse coding [9]–=-=[11]-=-. This paper proposes an unsupervised sound source separation algorithm which combines NMF with temporal continuity and sparseness objectives. The proposed algorithm is shown to provide a better separ... |

51 | MPEG-7 Sound Recognition Tools
- Casey
- 2001
Citation Context ...FitzGerald, Coyle, and Lawlor [18], Uhle, Dittmar, and Sporer [19], and Brown and Smaragdis [7]. Also, a sound recognition system based on ISA has been adopted in the MPEG-7 standardization framework =-=[20]-=-. 2) Nonnegative Matrix Factorization: In addition to statistical independence, some other estimation principles have also been found useful in finding the decomposition . Each component is modeled us... |

48 | Extraction of drum tracks from polyphonic music using independent subspace analysis
- Uhle, Dittmar, et al.
- 2003
Citation Context ...independent from each other. ISA has been used in one-channel sound source separation, for example, by Casey and Westner [5], Orife [17], FitzGerald, Coyle, and Lawlor [18], Uhle, Dittmar, and Sporer =-=[19]-=-, and Brown and Smaragdis [7]. Also, a sound recognition system based on ISA has been adopted in the MPEG-7 standardization framework [20]. 2) Nonnegative Matrix Factorization: In addition to statisti... |

47 | A maximum likelihood approach to single-channel source separation
- Jang, Lee
- 2003
Citation Context ...ignals has to be restricted. The most successful algorithms have been those which try to extract only the most prominent source [1], [2], or which utilize prior information of the source signals [3], =-=[4]-=-. Recently, unsupervised machine learning algorithms have been successfully used in one-channel source separation. These are typically based on a simple linear model, and instead of using prior knowle... |

46 | Sparse and shift-invariant representations of music
- Blumensath, Davies
Citation Context ... following sections, we briefly review some commonly used separation principles. The model (1) can be extended to allow time-varying components; some proposals have been made by Blumensath and Davies =-=[12]-=-, Smaragdis [13], and Virtanen [14]. More complex models can potentially enable better separation quality, but in this paper we compare only methods which are based on the linear model (1). 1) Indepen... |

46 | Separation of Sound Sources by Convolutive Sparse Coding
- Virtanen
- 2004
Citation Context ...iew some commonly used separation principles. The model (1) can be extended to allow time-varying components; some proposals have been made by Blumensath and Davies [12], Smaragdis [13], and Virtanen =-=[14]-=-. More complex models can potentially enable better separation quality, but in this paper we compare only methods which are based on the linear model (1). 1) Independent Subspace Analysis (ISA): ICA h... |

45 | Non negative sparse representation for Wiener based source separation with a single sensor
- Benaroya, McDonagh, et al.
- 2003
Citation Context ...onstruction error term and a term which penalizes nonzero gains . Sparse coding has been used for audio signal separation by Abdallah and Plumbley [9], [11], Benaroya, McDonagh, Bimbot, and Gribonval =-=[23]-=-, and Blumensath and Davies [12], to mention a few examples. The nonnegative sparse coding algorithm proposed by Hoyer [24] combines NMF and sparse coding. He estimated the matrices by combining the m... |

37 | Auditory Model Inversion for Sound Separation
- Slaney, Naar, et al.
- 1994
Citation Context ...e phases of the original spectrogram can be used for the separated components, or the phase generation method proposed by Griffin and Lim [31] with the improvements proposed by Slaney, Naar, and Lyon =-=[32]-=- can be used. In most cases where the separation is successful, the use of the original phases produces good results. It also allows the synthesis of sharp attacks with an accuracy which would otherwi... |

36 | Sub-band independent subspace analysis for drum transcription
- FitzGerald, Coyle, et al.
- 2002
Citation Context ...nt components are statistically independent from each other. ISA has been used in one-channel sound source separation, for example, by Casey and Westner [5], Orife [17], FitzGerald, Coyle, and Lawlor =-=[18]-=-, Uhle, Dittmar, and Sporer [19], and Brown and Smaragdis [7]. Also, a sound recognition system based on ISA has been adopted in the MPEG-7 standardization framework [20]. 2) Nonnegative Matrix Factor... |

33 | Discovering Auditory Objects Through Non-Negativity Constraints
- Smaragdis
- 2004
Citation Context ...ons, we briefly review some commonly used separation principles. The model (1) can be extended to allow time-varying components; some proposals have been made by Blumensath and Davies [12], Smaragdis =-=[13]-=-, and Virtanen [14]. More complex models can potentially enable better separation quality, but in this paper we compare only methods which are based on the linear model (1). 1) Independent Subspace An... |

30 | Automatic Drum Transcription and Source Separation
- FitzGerald
- 2004

26 | Separation of vocals from polyphonic audio recordings
- Vembu, Baumann
- 2005
Citation Context ...ised clustering methods have been proposed [5], [6], but in our simulations their performance was not sufficient. Supervised clustering based on pattern recognition techniques produces better results =-=[27]-=-, [28], but these require that the sources are known and their models trained beforehand. In this paper, we do not consider the clustering problem but circumvent this step by using the original signal... |

22 | Music transcription with ISA and HMM
- Vincent, Rodet
- 2004
Citation Context ...the separation can be directed towards the spectrograms A and B. Temporal continuity was addressed in a system proposed by Vincent and Rodet, who modeled the activity of a source by a hidden Markov model =-=[30]-=-. In this paper, we apply a simple temporal continuity criterion which does not require training beforehand. Temporal continuity of the components is measured by assigning a cost to large changes betw... |

22 | Introduction to Mathematical Statistics, 4th ed.
- Hogg, Craig
- 1978
Citation Context ...age SNR. D. Results The average SNRs and detection error rates are shown in Table II. The averages are shown for all sources, and separately for pitched and drum sounds. The 95% confidence intervals (=-=[39]-=-, pp. 212–219) for the average detection error rate and the SNR were smaller than 1% and 1 dB, respectively, for all the algorithms, which means that the differences between the algorithms are statist... |

13 | Independent Component Analysis for Automatic Note Extraction from Musical Trills
- Brown, Smaragdis
- 2004
Citation Context ...separation is done by finding a decomposition where the sources are statistically independent or nonredundant. Algorithms have been proposed that are based on independent component analysis (ICA) [5]–=-=[7]-=-, nonnegative matrix factorization (NMF) [8], and sparse coding [9]–[11]. This paper proposes an unsupervised sound source separation algorithm which combines NMF with temporal continuity and sparsene... |

10 | An independent component analysis approach to automatic music transcription
- Abdallah, Plumbley
- 2003
Citation Context ... statistically independent or nonredundant. Algorithms have been proposed that are based on independent component analysis (ICA) [5]–[7], nonnegative matrix factorization (NMF) [8], and sparse coding =-=[9]-=-–[11]. This paper proposes an unsupervised sound source separation algorithm which combines NMF with temporal continuity and sparseness objectives. The proposed algorithm is shown to provide a better ... |

10 | The Science of Sound, 2nd ed.
- Rossing
- 1990
Citation Context ...terms, respectively. A. Reconstruction Error Term The human auditory system has a wide dynamic range: the difference between the threshold of hearing and the threshold of pain is approximately 100 dB =-=[29]-=-. Unsupervised learning algorithms tend to be more sensitive to high-energy observations, and some methods fail to separate low-energy sources even though these are perceptually and musically meaningf... |

9 | Riddim: A rhythm analysis and decomposition tool based on independent subspace analysis
- Orife
- 2001
Citation Context ...edure the estimated gains of different components are statistically independent from each other. ISA has been used in one-channel sound source separation, for example, by Casey and Westner [5], Orife =-=[17]-=-, FitzGerald, Coyle, and Lawlor [18], Uhle, Dittmar, and Sporer [19], and Brown and Smaragdis [7]. Also, a sound recognition system based on ISA has been adopted in the MPEG-7 standardization framewor... |

8 | Extracting sound objects by independent subspace analysis
- Dubnov
Citation Context ...mponents are used which are then clustered to sound sources. Automatic clustering of the components has turned out to be a difficult task. Some unsupervised clustering methods have been proposed [5], =-=[6]-=-, but in our simulations their performance was not sufficient. Supervised clustering based on pattern recognition techniques produces better results [27], [28], but these require that the sources are ... |

7 | A predominant-F0 estimation method for real-world musical audio signals: MAP estimation for incorporating prior knowledge about F0s and tone models
- Goto
- 2001
Citation Context ...a limited separation quality, and usually the complexity of the target signals has to be restricted. The most successful algorithms have been those which try to extract only the most prominent source =-=[1]-=-, [2], or which utilize prior information of the source signals [3], [4]. Recently, unsupervised machine learning algorithms have been successfully used in one-channel source separation. These are typ... |

1 | Superior, 2003 [Online]. Available: http://www.toontrack.com/superior.shtml
- Toontrack Music
- 2007