Results 1  10
of
212
Independent Component Analysis
 Neural Computing Surveys
, 2001
"... A common problem encountered in such disciplines as statistics, data analysis, signal processing, and neural network research, is nding a suitable representation of multivariate data. For computational and conceptual simplicity, such a representation is often sought as a linear transformation of the ..."
Abstract

Cited by 1507 (93 self)
 Add to MetaCart
A common problem encountered in such disciplines as statistics, data analysis, signal processing, and neural network research, is nding a suitable representation of multivariate data. For computational and conceptual simplicity, such a representation is often sought as a linear transformation of the original data. Wellknown linear transformation methods include, for example, principal component analysis, factor analysis, and projection pursuit. A recently developed linear transformation method is independent component analysis (ICA), in which the desired representation is the one that minimizes the statistical dependence of the components of the representation. Such a representation seems to capture the essential structure of the data in many applications. In this paper, we survey the existing theory and methods for ICA. 1
From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
, 2007
"... A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinato ..."
Abstract

Cited by 207 (31 self)
 Add to MetaCart
A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution? These questions have been answered positively and constructively in recent years, exposing a wide variety of surprising phenomena; in particular, the existence of easilyverifiable conditions under which optimallysparse solutions can be found by concrete, effective computational methods. Such theoretical results inspire a bold perspective on some important practical problems in signal and image processing. Several wellknown signal and image processing problems can be cast as demanding solutions of undetermined systems of equations. Such problems have previously seemed, to many, intractable. There is considerable evidence that these problems often have sparse solutions. Hence, advances in finding sparse solutions to underdetermined systems energizes research on such signal and image processing problems – to striking effect. In this paper we review the theoretical results on sparse solutions of linear systems, empirical
Efficient sparse coding algorithms
 In NIPS
, 2007
"... Sparse coding provides a class of algorithms for finding succinct representations of stimuli; given only unlabeled input data, it discovers basis functions that capture higherlevel features in the data. However, finding sparse codes remains a very difficult computational problem. In this paper, we ..."
Abstract

Cited by 205 (12 self)
 Add to MetaCart
Sparse coding provides a class of algorithms for finding succinct representations of stimuli; given only unlabeled input data, it discovers basis functions that capture higherlevel features in the data. However, finding sparse codes remains a very difficult computational problem. In this paper, we present efficient sparse coding algorithms that are based on iteratively solving two convex optimization problems: an L1regularized least squares problem and an L2constrained least squares problem. We propose novel algorithms to solve both of these optimization problems. Our algorithms result in a significant speedup for sparse coding, allowing us to learn larger sparse codes than possible with previously described algorithms. We apply these algorithms to natural images and demonstrate that the inferred sparse codes exhibit endstopping and nonclassical receptive field surround suppression and, therefore, may provide a partial explanation for these two phenomena in V1 neurons. 1
Blind Source Separation by Sparse Decomposition in a Signal Dictionary
, 2000
"... Introduction In blind source separation an Nchannel sensor signal x(t) arises from M unknown scalar source signals s i (t), linearly mixed together by an unknown N M matrix A, and possibly corrupted by additive noise (t) x(t) = As(t) + (t) (1.1) We wish to estimate the mixing matrix A and the M ..."
Abstract

Cited by 194 (31 self)
 Add to MetaCart
Introduction In blind source separation an Nchannel sensor signal x(t) arises from M unknown scalar source signals s i (t), linearly mixed together by an unknown N M matrix A, and possibly corrupted by additive noise (t) x(t) = As(t) + (t) (1.1) We wish to estimate the mixing matrix A and the Mdimensional source signal s(t). Many natural signals can be sparsely represented in a proper signal dictionary s i (t) = K X k=1 C ik ' k (t) (1.2) The scalar functions ' k
Face recognition by independent component analysis
 IEEE Transactions on Neural Networks
, 2002
"... Abstract—A number of current face recognition algorithms use face representations found by unsupervised statistical methods. Typically these methods find a set of basis images and represent faces as a linear combination of those images. Principal component analysis (PCA) is a popular example of such ..."
Abstract

Cited by 192 (4 self)
 Add to MetaCart
Abstract—A number of current face recognition algorithms use face representations found by unsupervised statistical methods. Typically these methods find a set of basis images and represent faces as a linear combination of those images. Principal component analysis (PCA) is a popular example of such methods. The basis images found by PCA depend only on pairwise relationships between pixels in the image database. In a task such as face recognition, in which important information may be contained in the highorder relationships among pixels, it seems reasonable to expect that better basis images may be found by methods sensitive to these highorder statistics. Independent component analysis (ICA), a generalization of PCA, is one such method. We used a version of ICA derived from the principle of optimal information transfer through sigmoidal neurons. ICA was performed on face images in the FERET database under two different architectures, one which treated the images as random variables and the pixels as outcomes, and a second which treated the pixels as random variables and the images as outcomes. The first architecture found spatially local basis images for the faces. The second architecture produced a factorial face code. Both ICA representations were superior to representations based on PCA for recognizing faces across days and changes in expression. A classifier that combined the two ICA representations gave the best performance. Index Terms—Eigenfaces, face recognition, independent component analysis (ICA), principal component analysis (PCA), unsupervised learning. I.
Sparse multinomial logistic regression: fast algorithms and generalization bounds
 IEEE Trans. on Pattern Analysis and Machine Intelligence
"... Abstract—Recently developed methods for learning sparse classifiers are among the stateoftheart in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsitypromoting priors encouraging the weight estimates to be either significantly larg ..."
Abstract

Cited by 114 (1 self)
 Add to MetaCart
Abstract—Recently developed methods for learning sparse classifiers are among the stateoftheart in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsitypromoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learningtheoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a componentwise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in highdimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsitypromoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.
Blind source separation of more sources than mixtures using overcomplete representations
 IEEE Sig. Proc. Lett
, 1999
"... Abstract—Empirical results were obtained for the blind source separation of more sources than mixtures using a recently proposed framework for learning overcomplete representations. This technique assumes a linear mixing model with additive noise and involves two steps: 1) learning an overcomplete r ..."
Abstract

Cited by 100 (2 self)
 Add to MetaCart
Abstract—Empirical results were obtained for the blind source separation of more sources than mixtures using a recently proposed framework for learning overcomplete representations. This technique assumes a linear mixing model with additive noise and involves two steps: 1) learning an overcomplete representation for the observed data and 2) inferring sources given a sparse prior on the coefficients. We demonstrate that three speech signals can be separated with good fidelity given only two mixtures of the three signals. Similar results were obtained with mixtures of two speech signals and one music signal. Index Terms—Blind source separation, independent component analysis, overcomplete dictionary, overcomplete representation, speech signal separation. (a) (b)
Online learning for matrix factorization and sparse coding
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set, adapting it t ..."
Abstract

Cited by 100 (19 self)
 Add to MetaCart
Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set, adapting it to specific data. Variations of this problem include dictionary learning in signal processing, nonnegative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations, which scales up gracefully to large datasets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data demonstrating that it leads to stateoftheart performance in terms of speed and optimization for both small and large datasets.
Efficient coding of natural sounds
 Nature Neuroscience
, 2002
"... The auditory system encodes sound by decomposing the amplitude signal arriving at the ear into multiple frequency bands whose center frequencies and bandwidths are approximately logarithmic functions of the distance from the stapes. This particular organization is thought to result from the adaptati ..."
Abstract

Cited by 95 (3 self)
 Add to MetaCart
The auditory system encodes sound by decomposing the amplitude signal arriving at the ear into multiple frequency bands whose center frequencies and bandwidths are approximately logarithmic functions of the distance from the stapes. This particular organization is thought to result from the adaptation of cochlear mechanisms to the statistics of an animal’s auditory environment. Here we report that several basic auditory nerve fiber tuning properties can be accounted for by adapting a population of filter shapes to optimally encode natural sounds. The form of the code is dependent on the class of sounds, resembling a Fourier transformation when optimized for animal vocalizations and a wavelet transformation when optimized for nonbiological environmental sounds. Only for a combined set of vocalizations and environmental sounds does the optimal code follow scaling characteristics that are consistent with physiological data. These results suggest that the population of auditory nerve fibers encode a broad set of natural sounds in a manner that is consistent with information theoretic principles. Correspondence:
Just relax: Convex programming methods for subset selection and sparse approximation
, 2004
"... Abstract. Subset selection and sparse approximation problems request a good approximation of an input signal using a linear combination of elementary signals, yet they stipulate that the approximation may only involve a few of the elementary signals. This class of problems arises throughout electric ..."
Abstract

Cited by 90 (4 self)
 Add to MetaCart
Abstract. Subset selection and sparse approximation problems request a good approximation of an input signal using a linear combination of elementary signals, yet they stipulate that the approximation may only involve a few of the elementary signals. This class of problems arises throughout electrical engineering, applied mathematics and statistics, but small theoretical progress has been made over the last fifty years. Subset selection and sparse approximation both admit natural convex relaxations, but the literature contains few results on the behavior of these relaxations for general input signals. This report demonstrates that the solution of the convex program frequently coincides with the solution of the original approximation problem. The proofs depend essentially on geometric properties of the ensemble of elementary signals. The results are powerful because sparse approximation problems are combinatorial, while convex programs can be solved in polynomial time with standard software. Comparable new results for a greedy algorithm, Orthogonal Matching Pursuit, are also stated. This report should have a major practical impact because the theory applies immediately to many realworld signal processing problems. 1.