Results 1 - 10 of 10
Adaptive Harmonic Spectral Decomposition for Multiple Pitch Estimation, 2009
"... Multiple pitch estimation consists of estimating the fundamental frequencies and saliences of pitched sounds over short time frames of an audio signal. This task forms the basis of several applications in the particular context of musical audio. One approach is to decompose the short-term magnitude ..."
Abstract
-
Cited by 50 (9 self)
Multiple pitch estimation consists of estimating the fundamental frequencies and saliences of pitched sounds over short time frames of an audio signal. This task forms the basis of several applications in the particular context of musical audio. One approach is to decompose the short-term magnitude spectrum of the signal into a sum of basis spectra representing individual pitches scaled by time-varying amplitudes, using algorithms such as nonnegative matrix factorization (NMF). Prior training of the basis spectra is often infeasible due to the wide range of possible musical instruments. Appropriate spectra must then be adaptively estimated from the data, which may result in limited performance due to overfitting issues. In this article, we model each basis spectrum as a weighted sum of narrowband spectra representing a few adjacent harmonic partials, thus enforcing harmonicity and spectral smoothness while adapting the spectral envelope to each instrument. We derive an NMF-like algorithm to estimate the model parameters and evaluate it on a database of piano recordings, considering several choices for the narrowband spectra. The proposed algorithm performs similarly to supervised NMF using pre-trained piano spectra but improves pitch estimation performance by 6% to 10% compared to alternative unsupervised NMF algorithms.
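The decomposition this abstract describes maps onto a fairly compact computation. Below is a minimal sketch, assuming a Euclidean-cost NMF with multiplicative updates and a simple Gaussian-bump construction of the narrowband harmonic spectra; the function names, dictionary construction, and cost function are illustrative choices, not the paper's exact algorithm.

```python
import numpy as np

def harmonic_narrowband_dictionary(f0s, n_bins, sr, n_partials=8):
    """Narrowband spectra: one Gaussian bump per harmonic partial of each
    candidate pitch (an illustrative construction, not the paper's choice)."""
    freqs = np.linspace(0.0, sr / 2.0, n_bins)
    width = 2.0 * (sr / 2.0) / n_bins              # bump width of about two bins
    E = np.zeros((len(f0s), n_partials, n_bins))
    for p, f0 in enumerate(f0s):
        for h in range(n_partials):
            E[p, h] = np.exp(-0.5 * ((freqs - (h + 1) * f0) / width) ** 2)
    return E                                        # (pitches, partials, bins)

def harmonic_nmf(X, E, n_iter=200, eps=1e-12):
    """Euclidean-cost NMF in which each basis spectrum is a nonnegative weighted
    sum of the fixed narrowband spectra E, so harmonicity is enforced while the
    partial weights W adapt the spectral envelope; H holds the activations."""
    n_pitches, n_partials, _ = E.shape
    W = np.random.rand(n_pitches, n_partials)       # partial weights per pitch
    H = np.random.rand(n_pitches, X.shape[1])       # time-varying amplitudes
    for _ in range(n_iter):
        B = np.einsum('ph,phk->pk', W, E)            # basis spectra (pitches x bins)
        V = B.T @ H                                  # model spectrogram (bins x frames)
        H *= (B @ X) / (B @ V + eps)                 # multiplicative update of activations
        V = B.T @ H
        num = np.einsum('phk,kt,pt->ph', E, X, H)
        den = np.einsum('phk,kt,pt->ph', E, V, H) + eps
        W *= num / den                               # multiplicative update of partial weights
    return W, H

# Assumed usage: X is a (bins x frames) magnitude spectrogram, f0s a grid of
# candidate fundamentals, e.g. 440.0 * 2 ** (np.arange(-24, 25) / 12.0).
# E = harmonic_narrowband_dictionary(f0s, X.shape[0], sr=44100); W, H = harmonic_nmf(X, E)
```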
Analysis of polyphonic audio using source-filter model and non-negative matrix factorization
- in Advances in Models for Acoustic Processing, Neural Information Processing Systems Workshop, 2006
"... •Framework for (polyphonic) audio — linear signal model for magnitude spectrum xt(k): x̂t(k) = N∑ ..."
Abstract
-
Cited by 20 (3 self)
• Framework for (polyphonic) audio — a linear signal model for the magnitude spectrum x_t(k): x̂_t(k) = Σ_{n=1}^{N} …
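The excerpt above breaks off mid-equation. Written out under the standard convention for this kind of linear model (and consistent with the decomposition described in the first abstract), it would take the form below, with b_n(k) the basis spectrum of component n and a_{n,t} its time-varying gain; this is a generic reconstruction, not necessarily the paper's exact source-filter parameterization:

```latex
\hat{x}_t(k) = \sum_{n=1}^{N} a_{n,t}\, b_n(k)
```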
Monaural music source separation: Nonnegativity, sparseness, and shift-invariance
- in Proceedings of the International Conference on Independent Component Analysis and Blind Signal Separation, 2006
"... Abstract. In this paper we present a method for polyphonic music source separation from their monaural mixture, where the underlying assumption is that the harmonic structure of a musical instrument re-mains roughly the same even if it is played at various pitches and is recorded in various mixing e ..."
Abstract
-
Cited by 16 (4 self)
Abstract. In this paper we present a method for separating polyphonic music sources from their monaural mixture, under the assumption that the harmonic structure of a musical instrument remains roughly the same even if it is played at various pitches and recorded in various mixing environments. We incorporate nonnegativity, shift-invariance, and sparseness constraints to select representative spectral basis vectors, which are then used to restore the music sources from their monaural mixture. Experimental results on a monaural instantaneous mixture of voice/cello and a monaural convolutive mixture of saxophone/viola confirm the validity of the proposed method.
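As a rough illustration of the shift-invariance assumption in this abstract: on a log-frequency axis, playing an instrument at a different pitch approximately translates its harmonic template, so a single spectral template plus per-shift, per-frame activations can describe many notes. The NumPy sketch below shows only that shift-invariant reconstruction step, with assumed names and shapes; it is not the authors' algorithm.

```python
import numpy as np

def shift_template(template, shift, n_bins):
    """Place a 1-D log-frequency harmonic template at a given bin offset."""
    out = np.zeros(n_bins)
    hi = min(n_bins, shift + len(template))
    out[shift:hi] = template[:hi - shift]
    return out

def shift_invariant_model(template, activations, n_bins):
    """Model a log-frequency spectrogram as a sum of pitch-shifted copies of one
    harmonic template, weighted by nonnegative activations[shift, frame]."""
    n_shifts = activations.shape[0]
    shifted = np.stack([shift_template(template, s, n_bins) for s in range(n_shifts)])
    return shifted.T @ activations                   # (bins x frames)
```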
DISCRIMINATIVE NON-NEGATIVE MATRIX FACTORIZATION FOR MULTIPLE PITCH ESTIMATION
"... In this paper, we present a supervised method to improve the multiple pitch estimation accuracy of the non-negative matrix factorization (NMF) algorithm. The idea is to extend the sparse NMF framework by incorporating pitch information present in time-aligned musical scores in order to extract featu ..."
Abstract
-
Cited by 5 (1 self)
In this paper, we present a supervised method to improve the multiple pitch estimation accuracy of the non-negative matrix factorization (NMF) algorithm. The idea is to extend the sparse NMF framework by incorporating pitch information present in time-aligned musical scores in order to extract features that enforce the separability between pitch labels. We introduce two discriminative criteria that maximize inter-class scatter and quantify the predictive potential of a given decomposition using logistic regressors. These criteria are applied to both the latent-variable and the deterministic autoencoder views of NMF, and we devise efficient update rules for each. We evaluate our method on three polyphonic datasets of piano recordings and orchestral instrument mixes. Both models greatly enhance the quality of the basis spectra learned by NMF and the accuracy of multiple pitch estimation.
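One way to picture the combination of reconstruction and discrimination described here is a joint objective: the NMF reconstruction error plus a logistic-regression loss that predicts the score-derived pitch labels from the activations. The NumPy function below is a hedged sketch of such an objective only; the paper's actual criteria (inter-class scatter maximization and the autoencoder view) and its update rules differ, and every name and shape here is an illustrative assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discriminative_nmf_objective(X, W, H, Y, Theta, lam=1.0, eps=1e-12):
    """X: (bins x frames) spectrogram, W: (bins x comps) basis, H: (comps x frames)
    activations, Y: (pitches x frames) binary pitch labels from the aligned score,
    Theta: (pitches x comps) logistic-regression weights, lam: trade-off weight."""
    recon = 0.5 * np.sum((X - W @ H) ** 2)           # generative NMF term
    P = sigmoid(Theta @ H)                           # predicted pitch probabilities
    xent = -np.sum(Y * np.log(P + eps) + (1.0 - Y) * np.log(1.0 - P + eps))
    return recon + lam * xent                        # reconstruction + discrimination
```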
USING TENSOR FACTORISATION MODELS TO SEPARATE DRUMS FROM POLYPHONIC MUSIC
"... This paper describes the use of Non-negative Tensor Factorisation models for the separation of drums from polyphonic audio. Im-proved separation of the drums is achieved through the incorpo-ration of Gamma Chain priors into the Non-negative Tensor Fac-torisation framework. In contrast to many previo ..."
Abstract
-
Cited by 5 (1 self)
This paper describes the use of Non-negative Tensor Factorisation models for the separation of drums from polyphonic audio. Improved separation of the drums is achieved through the incorporation of Gamma Chain priors into the Non-negative Tensor Factorisation framework. In contrast to many previous approaches, the method used in this paper requires little or no pre-training or use of drum templates. The utility of the technique is shown on real-world audio examples.
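For reference, the underlying non-negative tensor factorisation is typically a PARAFAC-style model of a multichannel magnitude spectrogram, X[c, k, t] ≈ Σ_n G[c, n] B[k, n] H[t, n]. The sketch below shows that generic model with Euclidean-cost multiplicative updates; the Gamma Chain priors this paper adds, and its actual objective, are not reproduced here, and the names are illustrative.

```python
import numpy as np

def ntf_reconstruct(G, B, H):
    """PARAFAC-style model: X[c, k, t] ~= sum_n G[c, n] * B[k, n] * H[t, n],
    with channel gains G, basis spectra B, and time activations H."""
    return np.einsum('cn,kn,tn->ckt', G, B, H)

def ntf_multiplicative_updates(X, G, B, H, n_iter=100, eps=1e-12):
    """Plain Euclidean-cost updates (no priors), kept nonnegative by construction."""
    for _ in range(n_iter):
        V = ntf_reconstruct(G, B, H)
        G *= np.einsum('ckt,kn,tn->cn', X, B, H) / (np.einsum('ckt,kn,tn->cn', V, B, H) + eps)
        V = ntf_reconstruct(G, B, H)
        B *= np.einsum('ckt,cn,tn->kn', X, G, H) / (np.einsum('ckt,cn,tn->kn', V, G, H) + eps)
        V = ntf_reconstruct(G, B, H)
        H *= np.einsum('ckt,cn,kn->tn', X, G, B) / (np.einsum('ckt,cn,kn->tn', V, G, B) + eps)
    return G, B, H
```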
Automatic Transcription of Polyphonic Music Exploiting Temporal Evolution, 2012
"... Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous ap-plications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcrib ..."
Abstract
-
Cited by 4 (1 self)
Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains …
CONVOLUTIVE SPARSE CODING OF AUDIO SPECTROGRAMS
"... Representations which reduce redundancy and estimate latent variables behind observed data have turned out to be efficient in machine learning. Most of the representations model each observation vector as as weighted sum of N basis functions ..."
Abstract
- Add to MetaCart
(Show Context)
Representations which reduce redundancy and estimate latent variables behind observed data have turned out to be efficient in machine learning. Most of these representations model each observation vector as a weighted sum of N basis functions …
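In the convolutive variant of this idea, each basis function is a short time-frequency patch rather than a single spectrum, and the spectrogram is modeled as the sum of those patches convolved in time with sparse activations. The NumPy snippet below sketches only that reconstruction under assumed shapes and names; it is not the paper's formulation.

```python
import numpy as np

def convolutive_reconstruct(B, H):
    """B: (components x bins x patch_len) time-extended basis patches,
    H: (components x frames) sparse activations.  Returns the modeled
    spectrogram V[k, t] = sum_n sum_tau B[n, k, tau] * H[n, t - tau]."""
    n_comp, n_bins, patch_len = B.shape
    n_frames = H.shape[1]
    V = np.zeros((n_bins, n_frames))
    for n in range(n_comp):
        for tau in range(min(patch_len, n_frames)):
            V[:, tau:] += np.outer(B[n, :, tau], H[n, :n_frames - tau])
    return V
```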
Interactive Music Archive Access System, 2010
"... Part of the Signal Processing Commons This Conference Paper is brought to you for free and open access by the Audio Research Group at ..."
Abstract
- Add to MetaCart
(Show Context)
Part of the Signal Processing Commons This Conference Paper is brought to you for free and open access by the Audio Research Group at
Blind Source Separation and Automatic Transcription of Music Using Tensor Decompositions, 2007
"... Part of the Signal Processing Commons This Conference Paper is brought to you for free and open access by the Audio Research Group at ..."
Abstract
- Add to MetaCart
(Show Context)
Part of the Signal Processing Commons This Conference Paper is brought to you for free and open access by the Audio Research Group at