Just Relax: Convex Programming Methods for Identifying Sparse Signals in Noise
, 2006
This paper studies a difficult and fundamental problem that arises throughout electrical engineering, applied mathematics, and statistics. Suppose that one forms a short linear combination of elementary signals drawn from a large, fixed collection. Given an observation of the linear combination that has been contaminated with additive noise, the goal is to identify which elementary signals participated and to approximate their coefficients. Although many algorithms have been proposed, there is little theory which guarantees that these algorithms can accurately and efficiently solve the problem. This paper studies a method called convex relaxation, which attempts to recover the ideal sparse signal by solving a convex program. This approach is powerful because the optimization can be completed in polynomial time with standard scientific software. The paper provides general conditions which ensure that convex relaxation succeeds. As evidence of the broad impact of these results, the paper describes how convex relaxation can be used for several concrete signal recovery problems. It also describes applications to channel coding, linear regression, and numerical analysis.
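To make the idea of convex relaxation concrete, the sketch below minimizes an l1-penalized least-squares objective, 0.5*||s - sum_j b_j*phi_j||^2 + gamma*||b||_1, by iterative soft-thresholding. The abstract does not state which convex program or solver the paper uses, so both the objective and the solver are assumptions chosen for illustration. The toy dictionary, the noisy signal, and the values of `gamma` and `step` are also hypothetical.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t (the proximal operator of t*|x|)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def ista(atoms, signal, gamma, step, iterations=500):
    """Iterative soft-thresholding for the l1-penalized least-squares problem.

    atoms: list of elementary signals, each a list with the length of `signal`.
    step:  step size; it should not exceed 1 / (largest eigenvalue of the
           Gram matrix of the atoms) for the iteration to converge.
    """
    coeffs = [0.0] * len(atoms)
    for _ in range(iterations):
        # Residual of the current approximation.
        residual = list(signal)
        for b, atom in zip(coeffs, atoms):
            if b != 0.0:
                residual = [r - b * a for r, a in zip(residual, atom)]
        # Gradient step on the quadratic term, then shrinkage for the l1 term.
        coeffs = [
            soft_threshold(
                b + step * sum(a * r for a, r in zip(atom, residual)),
                step * gamma,
            )
            for b, atom in zip(coeffs, atoms)
        ]
    return coeffs


# Hypothetical toy problem: four unit-norm atoms in R^3 (an overcomplete set).
c = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [c, c, 0.0]]
# Ideal signal 2*atoms[0] - 1*atoms[2], contaminated with small additive noise.
signal = [2.02, -0.01, -0.97]

coeffs = ista(atoms, signal, gamma=0.1, step=0.5)
print(coeffs)
```

In this toy case the minimizer uses only atoms 0 and 2, the two that formed the ideal signal, and their coefficients are shrunk toward zero by `gamma`. The step size 0.5 matches the largest Gram eigenvalue of this particular dictionary, which is 2.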
Just relax: Convex programming methods for subset selection and sparse approximation
, 2004
Abstract. Subset selection and sparse approximation problems request a good approximation of an input signal using a linear combination of elementary signals, yet they stipulate that the approximation may only involve a few of the elementary signals. This class of problems arises throughout electrical engineering, applied mathematics and statistics, but little theoretical progress has been made over the last fifty years. Subset selection and sparse approximation both admit natural convex relaxations, but the literature contains few results on the behavior of these relaxations for general input signals. This report demonstrates that the solution of the convex program frequently coincides with the solution of the original approximation problem. The proofs depend essentially on geometric properties of the ensemble of elementary signals. The results are powerful because sparse approximation problems are combinatorial, while convex programs can be solved in polynomial time with standard software. Comparable new results for a greedy algorithm, Orthogonal Matching Pursuit, are also stated. This report should have a major practical impact because the theory applies immediately to many real-world signal processing problems.
On the exponential convergence of matching pursuit in quasicoherent dictionaries
 IEEE TRANS. INFORMATION TH
, 2006
Bayesian Harmonic Models for Musical Signal Analysis
 in Bayesian Statistics 7
, 2002
This paper is concerned with the Bayesian analysis of musical signals. The ultimate aim is to use Bayesian hierarchical structures in order to infer quantities at the highest level, including such quantities as musical pitch, dynamics, timbre, instrument identity, etc. Analysis of real musical signals is complicated by many things, including the presence of transient sounds, noises and the complex structure of musical pitches in the frequency domain. The problem is truly Bayesian in that there is a wealth of (often subjective) prior knowledge about how musical signals are constructed, which can be exploited in order to achieve more accurate inference about the musical structure. Here we propose developments to an earlier Bayesian model which describes each component 'note' at a given time in terms of a fundamental frequency, partials ('harmonics'), and amplitude. This basic model is modified for greater realism to include non-white residuals, time-varying amplitudes and partials 'detuned' from the natural linear relationship. The unknown parameters of the new model are simulated using a variable dimension MCMC algorithm, leading to a highly sophisticated analysis tool. We discuss how the models and algorithms can be applied for feature extraction, polyphonic music transcription, source separation and restoration of musical sources.
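As a rough illustration of the kind of note model described above, a single note with detuned partials and time-varying amplitudes could be written as follows. The abstract gives no equations, so every symbol here is an assumption rather than the authors' notation: \omega_0 is the fundamental frequency, M the number of partials, \delta_m the detuning of partial m, a_{m,t} and b_{m,t} its time-varying amplitudes, and v_t the (possibly non-white) residual.

```latex
y_t = \sum_{m=1}^{M} \Big[ a_{m,t} \cos\big((m + \delta_m)\,\omega_0 t\big)
      + b_{m,t} \sin\big((m + \delta_m)\,\omega_0 t\big) \Big] + v_t
```

Setting every \delta_m to zero recovers the 'natural linear relationship' in which partial m sits at exactly m times the fundamental.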
MPTK: Matching pursuit made tractable
 in Proc. Int. Conf. on Acoustic Speech and Signal Processing
, 2006
Matching Pursuit (MP) aims at finding sparse decompositions of signals over redundant bases of elementary waveforms. Traditionally, MP has been considered too slow an algorithm to be applied to real-life problems with high-dimensional signals. Indeed, in terms of floating-point operations, its typical numerical implementations have a complexity of O(N^2) and are associated with impractical runtimes. In this paper, we propose a new architecture which exploits the structure shared by many redundant MP dictionaries, and thus decreases its complexity to O(N log N). This architecture is implemented in a new software toolkit, called MPTK (the Matching Pursuit Toolkit), which is able to reach, e.g., […] real time for a typical MP analysis scenario applied to a 1 hour long audio track. This substantial acceleration makes it possible, from now on, to explore and apply MP in the framework of real-life, high-dimensional data processing problems.
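For reference, the basic MP iteration can be sketched in a few lines. This is the plain textbook loop, which correlates the residual with every atom at each step. It is not MPTK's accelerated architecture and does not use MPTK's API. The function name, the toy dictionary, and the test signal are hypothetical.

```python
import math


def matching_pursuit(atoms, signal, n_iterations):
    """Plain Matching Pursuit over a dictionary of unit-norm atoms.

    Returns the list of (atom_index, coefficient) pairs that were selected
    and the final residual.
    """
    residual = list(signal)
    decomposition = []
    for _ in range(n_iterations):
        # Correlate the current residual with every atom (the costly step).
        correlations = [
            sum(a * r for a, r in zip(atom, residual)) for atom in atoms
        ]
        # Pick the atom with the largest correlation in absolute value.
        best = max(range(len(atoms)), key=lambda j: abs(correlations[j]))
        coeff = correlations[best]
        if coeff == 0.0:
            break  # the residual is orthogonal to every atom
        decomposition.append((best, coeff))
        # Remove the selected atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return decomposition, residual


# Hypothetical redundant dictionary: the canonical basis of R^3 plus one
# extra unit-norm atom along (1, 1, 0).
c = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [c, c, 0.0]]
# Signal built from two atoms: 3 * atoms[3] + 0.5 * atoms[2].
signal = [3.0 * c, 3.0 * c, 0.5]

decomposition, residual = matching_pursuit(atoms, signal, n_iterations=2)
print(decomposition)
```

Each pass recomputes all inner products from scratch, which is the cost that structured dictionaries allow an implementation to avoid.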
Sound Source Separation in Monaural Music Signals
, 2006
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or one-channel, music recordings. We concentrate on separation methods in which the sources to be separated are not known beforehand. Instead, the separation is enabled by utilizing the common properties of real-world sound sources, which are their continuity, sparseness, and repetition in time and frequency, and their harmonic spectral structures. One of the separation approaches taken here uses unsupervised learning and the other uses model-based inference based on sinusoidal modeling. Most of the existing unsupervised separation algorithms are based on a linear instantaneous signal model, where each frame of the input mixture signal is
Instrument-specific harmonic atoms for mid-level music representation
 IEEE Trans. on Audio, Speech and Lang. Proc
, 2008
Abstract. Several studies have pointed out the need for accurate mid-level representations of music signals for information retrieval and signal processing purposes. In this paper, we propose a new mid-level representation based on the decomposition of a signal into a small number of sound atoms or molecules bearing explicit musical instrument labels. Each atom is a sum of windowed harmonic sinusoidal partials whose relative amplitudes are specific to one instrument, and each molecule consists of several atoms from the same instrument spanning successive time windows. We design efficient algorithms to extract the most prominent atoms or molecules and investigate several applications of this representation, including polyphonic instrument recognition and music visualization. Index Terms: Mid-level representation, music information retrieval, music visualization, sparse decomposition.
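The atom construction described above can be sketched as follows: a windowed sum of sinusoidal partials at integer multiples of a fundamental frequency, with one fixed amplitude profile per instrument. The Hann window, the zero phases, the unit-energy normalization, and the amplitude values are all assumptions made for illustration. The abstract does not specify them, and the instrument label is hypothetical.

```python
import math


def harmonic_atom(f0, partial_amplitudes, length, sample_rate):
    """Build a unit-energy harmonic atom.

    f0:                 fundamental frequency in Hz
    partial_amplitudes: relative amplitude of each partial (index 0 = f0)
    length:             atom length in samples
    sample_rate:        sampling rate in Hz
    """
    atom = []
    for n in range(length):
        # Hann window: zero at both ends of the atom.
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
        t = n / sample_rate
        # Sum of partials at integer multiples of the fundamental.
        value = sum(
            amp * math.cos(2.0 * math.pi * (h + 1) * f0 * t)
            for h, amp in enumerate(partial_amplitudes)
        )
        atom.append(window * value)
    # Normalize so that atoms from different instruments are comparable.
    norm = math.sqrt(sum(x * x for x in atom))
    return [x / norm for x in atom]


# Hypothetical amplitude profile standing in for one instrument.
example_profile = [1.0, 0.5, 0.25]
atom = harmonic_atom(
    f0=440.0, partial_amplitudes=example_profile, length=512, sample_rate=16000
)
energy = sum(x * x for x in atom)
print(len(atom), energy)
```

A dictionary for decomposition would contain many such atoms, one per instrument profile, fundamental frequency, and time position. A molecule would then group atoms of the same instrument across successive windows.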
Bayesian Analysis of Polyphonic Western Tonal Music
 JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
, 2006
This paper deals with the computational analysis of musical audio from recorded audio waveforms. This general problem includes, as subtasks, music transcription, extraction of musical pitch, dynamics, timbre, instrument identity, and source separation. Analysis of real musical signals is a highly ill-posed task which is made complicated by the presence of transient sounds, background interference or the complex structure of musical pitches in the time-frequency domain. This paper focuses on models and algorithms for computer transcription of multiple musical pitches in audio, elaborated from previous work by two of the authors. The audio data are assumed to be pre-segmented into fixed-pitch regimes such as individual chords. The models presented apply to pitched (tonal) music and are formulated via a Gabor representation of nonstationary signals. A Bayesian probabilistic structure is employed for representation of prior information about the parameters of the notes. This paper introduces a numerical Bayesian inference strategy for estimation of the pitches and other parameters of the waveform. The improved algorithm is much quicker, and makes the approach feasible in realistic situations. Results are
Low Bitrate Object Coding of Musical Audio Using Bayesian Harmonic Models
, 2006
This article deals with the decomposition of music signals into pitched sound objects made of harmonic sinusoidal partials for very low bitrate coding purposes. After a brief review of existing methods, we recast this problem in the Bayesian framework. We propose a family of probabilistic signal models combining learnt object priors and various perceptually motivated distortion measures. We design efficient algorithms to infer object parameters and build a coder based on the interpolation of frequency and amplitude parameters. Listening tests suggest that the loudness-based distortion measure outperforms other distortion measures and that our coder results in a better sound quality than baseline transform and parametric coders at 8 kbit/s and 2 kbit/s. This work constitutes a new step towards a fully object-based coding system, which would represent audio signals as collections of meaningful note-like sound objects.