A general flexible framework for the handling of prior information in audio source separation
- IEEE Transactions on Audio, Speech, and Language Processing
, 2012
Cited by 45 (17 self)
Abstract—Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general audio source separation framework based on a library of structured source models that enable the incorporation of prior knowledge about each source via user-specifiable constraints. While this framework generalizes several existing audio source separation methods, it also makes it possible to devise and implement new efficient methods that have not yet been reported in the literature. We first introduce the framework by describing the model structure and constraints, explaining its generality, and summarizing its algorithmic implementation using a generalized expectation-maximization algorithm. Finally, we illustrate the above-mentioned capabilities of the framework by applying it in several new and existing configurations to different source separation problems. We have released a software tool named Flexible Audio Source Separation Toolbox (FASST) implementing a baseline version of the framework in Matlab. Index Terms—Audio source separation, local Gaussian model, nonnegative matrix factorization, expectation-maximization
A general modular framework for audio source separation
- in "Proc. 9th Int. Conf. on Latent Variable Analysis and Signal Separation (LVA/ICA
Cited by 8 (4 self)
Abstract. Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general modular audio source separation framework based on a library of flexible source models that enable the incorporation of prior knowledge about the characteristics of each source. First, this framework generalizes several existing audio source separation methods and gives them a common formulation. Second, it makes it possible to devise and implement new efficient methods that have not yet been reported in the literature. We first introduce the framework by describing the flexible model, explaining its generality, and summarizing our modular implementation using a Generalized Expectation-Maximization algorithm. Finally, we illustrate the above-mentioned capabilities of the framework by applying it in several new and existing configurations to different source separation scenarios.
Multichannel extensions of non-negative matrix factorization with complex-valued data
- IEEE Transactions on Audio, Speech and Language Processing
, 2013
Cited by 8 (1 self)
Abstract—This paper presents new formulations and algorithms for multichannel extensions of non-negative matrix factorization (NMF). The formulations employ Hermitian positive semidefinite matrices to represent a multichannel version of non-negative elements. Multichannel Euclidean distance and multichannel Itakura-Saito (IS) divergence are defined based on appropriate statistical models utilizing multivariate complex Gaussian distributions. To minimize this distance/divergence, efficient optimization algorithms in the form of multiplicative updates are derived by using properly designed auxiliary functions. Two methods are proposed for clustering NMF bases according to the estimated spatial property. Convolutive blind source separation (BSS) is performed by the multichannel extensions of NMF with the clustering mechanism. Experimental results show that 1) the derived multiplicative update rules exhibited good convergence behavior, and 2) BSS tasks for several music sources with two microphones and three instrumental parts were evaluated successfully. Index Terms—Blind source separation, clustering, convolutive mixture, multichannel, non-negative matrix factorization.
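As a point of reference for the multiplicative updates mentioned in this abstract, the sketch below shows the classic single-channel NMF updates for the Euclidean distance. It is an illustrative assumption, not the multichannel algorithm of the paper: the matrices here hold plain non-negative scalars rather than Hermitian positive semidefinite matrices, and all function names (`matmul`, `transpose`, `squared_error`, `nmf_euclidean`) are hypothetical helpers written in pure Python.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Euclidean distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf_euclidean(v, rank, n_iter=200, seed=0, eps=1e-12):
    """Single-channel NMF with multiplicative updates (Euclidean cost).

    Returns the factors W, H and the cost recorded after every iteration.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + 0.1 for _ in range(n_cols)] for _ in range(rank)]
    errors = [squared_error(v, w, h)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_cols)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_rows)]
        errors.append(squared_error(v, w, h))
    return w, h, errors


# Toy data: a 4 x 5 non-negative matrix that has an exact rank-2 factorization.
w_true = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
h_true = [[1.0, 2.0, 0.0, 1.0, 3.0], [0.0, 1.0, 2.0, 1.0, 0.0]]
v = matmul(w_true, h_true)

w, h, errors = nmf_euclidean(v, rank=2)
print("initial cost:", errors[0], "final cost:", errors[-1])
```

Because each update only multiplies the current factor by a ratio of non-negative quantities, the factors stay non-negative without any explicit projection step, and the recorded cost does not increase from one iteration to the next (up to floating-point rounding).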
New formulations and efficient algorithms for multichannel NMF
- in Proc. WASPAA ’11
, 2011
Cited by 7 (3 self)
This paper proposes new formulations and algorithms for a multichannel extension of nonnegative matrix factorization (NMF), aimed at convolutive sound source separation with multiple microphones. The proposed formulation employs Hermitian positive semidefinite matrices to represent a multichannel version of non-negative elements. Such matrices are basically estimated for NMF bases, but a source separation task can be performed by introducing variables that relate NMF bases and sources. Efficient optimization algorithms in the form of multiplicative updates are derived by using properly designed auxiliary functions. Experimental results show that two instrumental sounds coming from different directions were successfully separated by the proposed algorithm. Index Terms — nonnegative matrix factorization, multichannel, positive semidefinite, auxiliary function, source separation
Direction of arrival based spatial covariance model for blind sound source separation
- IEEE Transactions on Audio, Speech, and Language Processing
, 2014
Cited by 2 (2 self)
Abstract—This paper addresses the problem of sound source separation from a multichannel microphone array capture via estimation of the source spatial covariance matrix (SCM) of a short-time Fourier transformed mixture signal. In many conventional audio separation algorithms, the source mixing parameter estimation is done separately for each frequency, which makes them prone to errors and leads to suboptimal source estimates. In this paper we propose an SCM model which consists of a weighted sum of direction of arrival (DoA) kernels, and we estimate only the weights dependent on the source directions. In the proposed algorithm, the spatial properties of the sources are jointly optimized over all frequencies, leading to more coherent source estimates and mitigating the effect of spatial aliasing at high frequencies. The proposed SCM model is combined with a linear model for magnitudes, and the parameter estimation is formulated in a complex-valued non-negative matrix factorization (CNMF) framework. Simulations consist of recordings made with an array the size of a hand-held device, with multiple microphones embedded inside the device casing. The separation quality of the proposed algorithm is shown to exceed the performance of existing state-of-the-art separation methods with two sources when evaluated by objective separation quality metrics. Index Terms—multichannel source separation, spatial covariance models, non-negative matrix factorization, direction of arrival estimation, array signal processing
MULTICHANNEL AUDIO SEPARATION BY DIRECTION OF ARRIVAL BASED SPATIAL COVARIANCE MODEL AND NON-NEGATIVE MATRIX FACTORIZATION
Cited by 1 (1 self)
This paper studies multichannel audio separation using non-negative matrix factorization (NMF) combined with a new model for spatial covariance matrices (SCM). The proposed model for SCMs is parameterized by source direction of arrival (DoA) and its parameters can be optimized to yield a spatially coherent solution over frequencies, thus avoiding permutation ambiguity and spatial aliasing. The model constrains the estimation of SCMs to a set of geometrically possible solutions. Additionally, we present a method for using a priori DoA information of the sources extracted blindly from the mixture for the initialization of the parameters of the proposed model. The simulations show that the proposed algorithm exceeds the separation quality of existing spatial separation methods. Index Terms — Spatial sound separation, non-negative matrix factorization, spatial covariance models
Reverberant Audio Source Separation via Sparse and Low-Rank Modeling
Cited by 1 (0 self)
The performance of audio source separation from underdetermined convolutive mixtures assuming known mixing filters can be significantly improved by using an analysis sparse prior optimized by a reweighting ℓ1 scheme and a wideband data-fidelity term, as demonstrated by a recent article. In this letter, we show that the performance can be improved even more significantly by exploiting a low-rank prior on the source spectrograms. We present a new algorithm to estimate the sources based on i) an analysis sparse prior, ii) a reweighting scheme so as to increase the sparsity, iii) a wideband data-fidelity term in a constrained form, and iv) a low-rank constraint on the source spectrograms. Evaluation on reverberant music mixtures shows that the resulting algorithm improves state-of-the-art methods by more than 2 dB of signal-to-distortion ratio.
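To give a feel for what a gain of a few dB in signal-to-distortion ratio means, here is a minimal sketch that treats SDR as a plain energy ratio between a reference signal and the estimation error. This is a simplifying assumption: standard evaluation toolkits decompose the error into several components before forming the ratio, and the function name `sdr_db` and the toy signals are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(reference energy / error energy)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0.0:
        raise ValueError("reference signal has zero energy")
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy reference: one second of a 5 Hz sinusoid sampled at 100 Hz.
reference = [math.sin(2.0 * math.pi * 5.0 * n / 100.0) for n in range(100)]
# A deterministic error pattern, added at full and at half amplitude.
error = [0.1 if n % 2 == 0 else -0.1 for n in range(100)]
estimate_a = [r + e for r, e in zip(reference, error)]
estimate_b = [r + 0.5 * e for r, e in zip(reference, error)]

gain_db = sdr_db(reference, estimate_b) - sdr_db(reference, estimate_a)
print("SDR gain from halving the error amplitude (dB):", gain_db)
```

Under this simplified definition, halving the error amplitude raises the SDR by 20·log10(2) dB, and a 2 dB gain corresponds to dividing the error energy by 10^0.2.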
unknown title
Doubly sparse models for multiple filter estimation in sparse echoic environments
A General Modular Framework for Audio Source Separation
- Author manuscript, published in "9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA'10) (2010)"
, 2011
Abstract. Most audio source separation methods are developed for a particular scenario characterized by the number of sources and channels and the characteristics of the sources and the mixing process. In this paper we introduce a general modular audio source separation framework based on a library of flexible source models that enable the incorporation of prior knowledge about the characteristics of each source. First, this framework generalizes several existing audio source separation methods and gives them a common formulation. Second, it makes it possible to devise and implement new efficient methods that have not yet been reported in the literature. We first introduce the framework by describing the flexible model, explaining its generality, and summarizing our modular implementation using a Generalized Expectation-Maximization algorithm. Finally, we illustrate the above-mentioned capabilities of the framework by applying it in several new and existing configurations to different source separation scenarios.
The 2010 Signal Separation Evaluation Campaign (SiSEC2010): Audio source separation
- Author manuscript, published in "9th International Conference on Latent Variable Analysis and Signal Separation (LVA/ICA'10) (2010)"
, 2011
Abstract. This paper introduces the audio part of the 2010 community-based