Results 11 - 20
of
915
Regularization networks and support vector machines
- Advances in Computational Mathematics
, 2000
"... Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization a ..."
Abstract
-
Cited by 215 (28 self)
- Add to MetaCart
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
Stable recovery of sparse overcomplete representations in the presence of noise
- IEEE TRANS. INFORM. THEORY
, 2006
"... Overcomplete representations are attracting interest in signal processing theory, particularly due to their potential to generate sparse representations of signals. However, in general, the problem of finding sparse representations must be unstable in the presence of noise. This paper establishes t ..."
Abstract
-
Cited by 195 (19 self)
- Add to MetaCart
Overcomplete representations are attracting interest in signal processing theory, particularly due to their potential to generate sparse representations of signals. However, in general, the problem of finding sparse representations must be unstable in the presence of noise. This paper establishes the possibility of stable recovery under a combination of sufficient sparsity and favorable structure of the overcomplete system. Considering an ideal underlying signal that has a sufficiently sparse representation, it is assumed that only a noisy version of it can be observed. Assuming further that the overcomplete system is incoherent, it is shown that the optimally sparse approximation to the noisy data differs from the optimally sparse decomposition of the ideal noiseless signal by at most a constant multiple of the noise level. As this optimal-sparsity method requires heavy (combinatorial) computational effort, approximation algorithms are considered. It is shown that similar stability is also available using the basis and the matching pursuit algorithms. Furthermore, it is shown that these methods result in sparse approximation of the noisy data that contains only terms also appearing in the unique sparsest representation of the ideal noiseless sparse signal.
Learning Overcomplete Representations
, 2000
"... In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can ..."
Abstract
-
Cited by 188 (8 self)
- Add to MetaCart
In an overcomplete basis, the number of basis vectors is greater than the dimensionality of the input, and the representation of an input is not a unique combination of basis vectors. Overcomplete representations have been advocated because they have greater robustness in the presence of noise, can be sparser, and can have greater flexibility in matching structure in the data. Overcomplete codes have also been proposed as a model of some of the response properties of neurons in primary visual cortex. Previous work has focused on finding the best representation of a signal using a fixed overcomplete basis (or dictionary). We present an algorithm for learning an overcomplete basis by viewing it as probabilistic model of the observed data. We show that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency. This can be viewed as a generalization of the technique of independent component analysis and provides a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.
Just Relax: Convex Programming Methods for Identifying Sparse Signals in Noise
, 2006
"... This paper studies a difficult and fundamental problem that arises throughout electrical engineering, applied mathematics, and statistics. Suppose that one forms a short linear combination of elementary signals drawn from a large, fixed collection. Given an observation of the linear combination that ..."
Abstract
-
Cited by 185 (1 self)
- Add to MetaCart
This paper studies a difficult and fundamental problem that arises throughout electrical engineering, applied mathematics, and statistics. Suppose that one forms a short linear combination of elementary signals drawn from a large, fixed collection. Given an observation of the linear combination that has been contaminated with additive noise, the goal is to identify which elementary signals participated and to approximate their coefficients. Although many algorithms have been proposed, there is little theory which guarantees that these algorithms can accurately and efficiently solve the problem. This paper studies a method called convex relaxation, which attempts to recover the ideal sparse signal by solving a convex program. This approach is powerful because the optimization can be completed in polynomial time with standard scientific software. The paper provides general conditions which ensure that convex relaxation succeeds. As evidence of the broad impact of these results, the paper describes how convex relaxation can be used for several concrete signal recovery problems. It also describes applications to channel coding, linear regression, and numerical analysis.
Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems
- IEEE Journal of Selected Topics in Signal Processing
, 2007
"... Abstract—Many problems in signal processing and statistical inference involve finding sparse solutions to under-determined, or ill-conditioned, linear systems of equations. A standard approach consists in minimizing an objective function which includes a quadratic (squared ℓ2) error term combined wi ..."
Abstract
-
Cited by 180 (7 self)
- Add to MetaCart
Abstract—Many problems in signal processing and statistical inference involve finding sparse solutions to under-determined, or ill-conditioned, linear systems of equations. A standard approach consists in minimizing an objective function which includes a quadratic (squared ℓ2) error term combined with a sparseness-inducing (ℓ1) regularization term.Basis pursuit, the least absolute shrinkage and selection operator (LASSO), waveletbased deconvolution, and compressed sensing are a few wellknown examples of this approach. This paper proposes gradient projection (GP) algorithms for the bound-constrained quadratic programming (BCQP) formulation of these problems. We test variants of this approach that select the line search parameters in different ways, including techniques based on the Barzilai-Borwein method. Computational experiments show that these GP approaches perform well in a wide range of applications, often being significantly faster (in terms of computation time) than competing methods. Although the performance of GP methods tends to degrade as the regularization term is de-emphasized, we show how they can be embedded in a continuation scheme to recover their efficient practical performance. A. Background I.
Translation-invariant de-noising
, 1995
"... De-Noising with the traditional (orthogonal, maximally-decimated) wavelet transform sometimes exhibits visual artifacts; we attribute some of these – for example, Gibbs phenomena in the neighborhood of discontinuities – to the lack of translation invariance of the wavelet basis. One method to suppre ..."
Abstract
-
Cited by 173 (7 self)
- Add to MetaCart
De-Noising with the traditional (orthogonal, maximally-decimated) wavelet transform sometimes exhibits visual artifacts; we attribute some of these – for example, Gibbs phenomena in the neighborhood of discontinuities – to the lack of translation invariance of the wavelet basis. One method to suppress such artifacts, termed “cycle spinning ” by Coifman, is to “average out ” the translation dependence. For a range of shifts, one shifts the data (right or left as the case may be), De-Noises the shifted data, and then unshifts the de-noised data. Doing this for each of a range of shifts, and averaging the several results so obtained, produces a reconstruction subject to far weaker Gibbs phenomena than thresholding based De-Noising using the traditional orthogonal wavelet transform. Cycle-Spinning over the range of all circulant shifts can be accomplished in order nlog 2(n) time; it is equivalent to de-noising using the undecimated or stationary wavelet transform. Cycle-spinning exhibits benefits outside of wavelet de-noising, for example in cosine packet denoising, where it helps suppress ‘clicks’. It also has a counterpart in frequency domain de-noising, where the goal of translation-invariance is replaced by modulation invariance, and the central shift-De-Noise-unshift operation is replaced by modulate-De-Noise-demodulate. We illustrate these concepts with extensive computational examples; all figures presented here are reproducible using the WaveLab software package. 1
An equivalence between sparse approximation and Support Vector Machines
- A.I. Memo 1606, MIT Arti cial Intelligence Laboratory
, 1997
"... This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: ai-publications/1500-1999/AIM-1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), ..."
Abstract
-
Cited by 167 (6 self)
- Add to MetaCart
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: ai-publications/1500-1999/AIM-1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), and a sparse approximation scheme that resembles the Basis Pursuit De-Noising algorithm (Chen, 1995 � Chen, Donoho and Saunders, 1995). SVM is a technique which can be derived from the Structural Risk Minimization Principle (Vapnik, 1982) and can be used to estimate the parameters of several di erent approximation schemes, including Radial Basis Functions, algebraic/trigonometric polynomials, B-splines, and some forms of Multilayer Perceptrons. Basis Pursuit De-Noising is a sparse approximation technique, in which a function is reconstructed by using a small number of basis functions chosen from a large set (the dictionary). We show that, if the data are noiseless, the modi ed version of Basis Pursuit De-Noising proposed in this paper is equivalent to SVM in the following sense: if applied to the same data set the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix we also present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation and regularization theory.
Blind Source Separation by Sparse Decomposition in a Signal Dictionary
, 2000
"... Introduction In blind source separation an N-channel sensor signal x(t) arises from M unknown scalar source signals s i (t), linearly mixed together by an unknown N M matrix A, and possibly corrupted by additive noise (t) x(t) = As(t) + (t) (1.1) We wish to estimate the mixing matrix A and the M- ..."
Abstract
-
Cited by 149 (28 self)
- Add to MetaCart
Introduction In blind source separation an N-channel sensor signal x(t) arises from M unknown scalar source signals s i (t), linearly mixed together by an unknown N M matrix A, and possibly corrupted by additive noise (t) x(t) = As(t) + (t) (1.1) We wish to estimate the mixing matrix A and the M-dimensional source signal s(t). Many natural signals can be sparsely represented in a proper signal dictionary s i (t) = K X k=1 C ik ' k (t) (1.2) The scalar functions ' k
An EM Algorithm for Wavelet-Based Image Restoration
, 2002
"... This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low-complexity, expressed in terms of the wavelet coecients, taking a ..."
Abstract
-
Cited by 149 (14 self)
- Add to MetaCart
This paper introduces an expectation-maximization (EM) algorithm for image restoration (deconvolution) based on a penalized likelihood formulated in the wavelet domain. Regularization is achieved by promoting a reconstruction with low-complexity, expressed in terms of the wavelet coecients, taking advantage of the well known sparsity of wavelet representations. Previous works have investigated wavelet-based restoration but, except for certain special cases, the resulting criteria are solved approximately or require very demanding optimization methods. The EM algorithm herein proposed combines the efficient image representation oered by the discrete wavelet transform (DWT) with the diagonalization of the convolution operator obtained in the Fourier domain. The algorithm alternates between an E-step based on the fast Fourier transform (FFT) and a DWT-based M-step, resulting in an ecient iterative process requiring O(N log N) operations per iteration. Thus, it is the rst image restoration algorithm that optimizes a wavelet-based penalized likelihood criterion and has computational complexity comparable to that of standard wavelet denoising or frequency domain deconvolution methods. The convergence behavior of the algorithm is investigated, and it is shown that under mild conditions the algorithm converges to a globally optimal restoration. Moreover, our new approach outperforms several of the best existing methods in benchmark tests, and in some cases is also much less computationally demanding.
Compressive sensing
- IEEE Signal Processing Mag
, 2007
"... The Shannon/Nyquist sampling theorem tells us that in order to not lose information when uniformly sampling a signal we must sample at least two times faster than its bandwidth. In many applications, including digital image and video cameras, the Nyquist rate can be so high that we end up with too m ..."
Abstract
-
Cited by 146 (27 self)
- Add to MetaCart
The Shannon/Nyquist sampling theorem tells us that in order to not lose information when uniformly sampling a signal we must sample at least two times faster than its bandwidth. In many applications, including digital image and video cameras, the Nyquist rate can be so high that we end up with too many samples and must compress in order to store or transmit them. In other applications, including imaging systems (medical scanners, radars) and high-speed analog-to-digital converters, increasing the sampling rate or density beyond the current state-of-the-art is very expensive. In this lecture, we will learn about a new technique that tackles these issues using compressive sensing [1, 2]. We will replace the conventional sampling and reconstruction operations with a more general linear measurement scheme coupled with an optimization in order to acquire certain kinds of signals at a rate significantly below Nyquist. 2

