Unsupervised single-channel music source separation by average harmonic structure modeling
- IEEE Trans. Audio Speech Language Process
INVESTIGATING SINGLE-CHANNEL AUDIO SOURCE SEPARATION METHODS BASED ON NON-NEGATIVE MATRIX FACTORIZATION
Cited by 9 (1 self)
Our research aims to separate multiple sound sources from a single-channel audio mixture. In this paper, we present a framework based on Non-negative Matrix Factorization (NMF). Within this framework, we propose two approaches, referred to as the Un-directed and Directed NMF models. The Un-directed NMF model decomposes the mixture data in an unsupervised manner but requires human interaction for clustering; we have developed a simple graphical user interface for this task. Provided with isolated training data, the Directed NMF model is guided by pre-trained models and therefore does not need user interaction. Experimental results show that this framework is a feasible way to achieve high-quality separation. Successful separation of individual sound sources could assist with other tasks such as automatic music transcription, object coding, and special sound effects. Keywords: Single-channel separation, Non-negative Matrix Factorization, Blind Source Separation
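As a rough illustration of the factorization this abstract builds on, the sketch below applies the standard multiplicative updates for the Euclidean NMF cost to a toy magnitude "spectrogram". It is not the authors' Un-directed or Directed model; the function name `nmf`, the toy data, and all parameter values are assumptions made for the example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (freq x time) as W @ H.

    Uses multiplicative updates for the Euclidean cost ||V - W H||^2.
    W holds spectral basis vectors, H their activations over time.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (hypothetical): two fixed spectral shapes with different activations.
spectra = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
activations = np.array([[1.0, 0.0, 1.0, 0.5], [0.0, 1.0, 1.0, 0.0]])
V = spectra @ activations

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("relative reconstruction error:", rel_err)
```

In a separation setting, each column of `W` (with its row of `H`) would still have to be assigned to a source, which is the clustering step the abstract says is done by hand in the Un-directed model and by pre-trained models in the Directed one.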
BLIND AUDIOVISUAL SOURCE SEPARATION USING SPARSE REPRESENTATIONS
Cited by 4 (4 self)
In this work we present a method to jointly separate active audio and visual structures in a given mixture. Blind Audiovisual Source Separation is achieved by exploiting the coherence between a video signal and a one-microphone audio track. The efficient representation of audio and video sequences makes it possible to build relationships between correlated structures in both modalities. Video structures that exhibit strong correlations with the audio signal and are spatially close are grouped using a robust clustering algorithm that can count and localize audiovisual sources. Using this information and exploiting audio-video correlation, audio sources are also localized and separated. To the best of our knowledge, this is the first blind audiovisual source separation algorithm conceived to deal with a video sequence and the corresponding mono audio signal. Index Terms: Audiovisual processing, blind source separation, sparse signal representation.
Source extraction from two-channel mixtures by joint cosine packet analysis
- in Proc. European Signal Processing Conf. (EUSIPCO), 2006
Cited by 3 (3 self)
This paper describes novel, computationally efficient approaches to source separation of underdetermined instantaneous two-channel mixtures. A best basis algorithm is applied to trees of local cosine bases to determine a sparse transform. We assume that the mixing parameters are known and focus on demixing sources by binary time-frequency masking. We describe a method for deriving a best local cosine basis from the mixtures by minimising an l1-norm cost function. This basis is adapted to the input of the masking process. Then, we investigate how to increase sparsity by adapting local cosine bases to the expected output of a single source instead of to the input mixtures. The heuristically derived cost function maximises the energy of the transform coefficients associated with a particular direction. Experiments on a mixture of four musical instruments are performed, and results are compared. It is shown that local cosine bases can give better results than fixed-basis representations.
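To make the idea of choosing a basis by an l1-norm cost concrete, the sketch below compares just two candidate representations of a signal (the identity basis and an orthonormal DCT-II basis) and keeps the one whose coefficients have the smaller l1 norm. This is only an illustration of the cost criterion; it is not the paper's best basis search over trees of local cosine bases, and the names `dct_matrix`, `l1_cost`, and `pick_sparser_basis` are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th cosine basis vector."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    C[0, :] /= np.sqrt(2.0)
    return C

def l1_cost(coeffs):
    """Sparsity cost: a smaller l1 norm means a sparser representation."""
    return np.sum(np.abs(coeffs))

def pick_sparser_basis(x):
    """Return the name of the lower-cost basis and both costs."""
    C = dct_matrix(len(x))
    costs = {"identity": l1_cost(x), "dct": l1_cost(C @ x)}
    return min(costs, key=costs.get), costs

n = 16
tone = dct_matrix(n)[3]   # a single cosine: sparse in the DCT basis
click = np.zeros(n)
click[5] = 1.0            # a single spike: sparse in the identity basis

print(pick_sparser_basis(tone)[0])
print(pick_sparser_basis(click)[0])
```

A best basis algorithm applies the same kind of comparison recursively, deciding at each node of a tree whether a segment is cheaper to represent whole or split.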
Musical Audio Analysis Using Sparse Representations
, 2006
Cited by 2 (0 self)
In this paper we will give an overview of work by ourselves and others in this area, to give a flavour of the work being undertaken, and to give some pointers for further information about this interesting and challenging research topic.
SEPARATING SOURCES FROM SINGLE-CHANNEL MUSICAL MATERIAL: A REVIEW AND FUTURE DIRECTIONS
Cited by 1 (0 self)
The problem of separating multiple audio streams from single-channel polyphonic source material is a very challenging one, since it is extremely underdetermined. Additional information is hence essential to help constrain the infinite solution set, depending on the application. This paper reviews the projects in our group that have addressed this problem, along with related research done elsewhere, and illustrates how the possible future approaches for the current doctoral study follow naturally from this foundation.
Author manuscript, published in "17th IASC Symp. in Computational Statistics (COMPSTAT) (2006) 104--117" Musical Audio Analysis using Sparse Representations
, 2010
Summary. Sparse representations are becoming an increasingly useful tool in the analysis of musical audio signals. In this paper we will give an overview of work by ourselves and others in this area, to give a flavour of the work being undertaken, and to give some pointers for further information about this interesting and challenging research topic.
A SIGNAL-ADAPTIVE LOCAL COSINE TRANSFORM FOR SOURCE SEPARATION BY TIME-FREQUENCY MASKING
Time-frequency masking is often used for source separation of underdetermined audio mixtures. It depends on the fact that the sources can be represented disjointly in some transform domain. The focus of this paper is on demixing sources from instantaneous, two-channel mixtures by binary masking. We investigate trees of local cosine bases from which a suitable transform may be generated; the best basis is chosen by a computationally efficient algorithm and is adaptively selected to match the time-varying characteristics of the signal. Our heuristically motivated cost function maximises the energy of the transform coefficients associated with each estimated source. Finally, we evaluate our proposed transform by comparing it against two well-known transforms: the short-time Fourier transform and the modified discrete cosine transform. We assume that the mixing parameters are known. Our results show that in some cases, our method can give better results than these fixed-basis representations. Keywords: Audio source separation, local cosine bases
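The binary masking step described here can be sketched as follows, assuming an instantaneous two-channel model in which source j is mixed with gains (cos θ_j, sin θ_j) and the angles θ_j are known. Each transform coefficient pair is assigned to the single direction onto which its projection is largest. The transform itself is skipped: the toy arrays below stand in for transform-domain coefficients, and the function name `binary_mask_demix` and all values are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def binary_mask_demix(X1, X2, thetas):
    """Demix two-channel coefficients by binary masking.

    X1, X2 : 1-D arrays of transform coefficients for each channel.
    thetas : known mixing angles, one per source.
    Returns an array (n_sources x n_coeffs) of estimated source coefficients.
    """
    thetas = np.asarray(thetas)
    # Projection of every coefficient pair onto every mixing direction.
    proj = (np.cos(thetas)[:, None] * X1[None, :]
            + np.sin(thetas)[:, None] * X2[None, :])
    # Binary mask: each coefficient goes to the best-matching direction only.
    winner = np.argmax(np.abs(proj), axis=0)
    masks = winner[None, :] == np.arange(len(thetas))[:, None]
    return masks * proj

# Three sources, two channels (underdetermined); sources are disjoint here.
thetas = [0.2, 0.8, 1.4]
S = np.array([[2.0, 0.0, 0.0, -1.0, 0.0, 0.0],
              [0.0, 1.5, 0.0, 0.0, 0.0, 3.0],
              [0.0, 0.0, -2.5, 0.0, 1.0, 0.0]])
X1 = np.cos(thetas) @ S
X2 = np.sin(thetas) @ S

S_hat = binary_mask_demix(X1, X2, thetas)
print(np.round(S_hat, 3))
```

Recovery is exact in this toy case only because the sources never overlap in the coefficient domain, which is why a transform that makes the sources more disjoint matters for this approach.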
Unsupervised Single-channel Music Source Separation by Average Harmonic Structure Modeling
Abstract: Source separation of musical signals is an appealing but difficult problem, especially in the single-channel case. In this paper, an unsupervised single-channel music source separation algorithm based on average harmonic structure modeling is proposed. Under the assumption that instruments play in narrow pitch ranges, different harmonic instrumental sources in a piece of music often have different but stable harmonic structures, so sources can be characterized uniquely by harmonic structure models. Given the number of instrumental sources, the proposed algorithm learns these models directly from the mixed signal by clustering the harmonic structures extracted from different frames. The corresponding sources are then extracted from the mixed signal using the models. Experiments on several mixed signals, including synthesized instrumental sources, real instrumental sources, and singing voices, show that this algorithm outperforms the general Non-negative Matrix Factorization (NMF)-based source separation algorithm and yields good subjective listening quality. As a side effect, this algorithm estimates the pitches of the harmonic instrumental sources. The number of concurrent sounds in each frame is also computed, which is a difficult task for general Multi-pitch Estimation (MPE) algorithms.