Results 1-10 of 64
Sparse Representation For Computer Vision and Pattern Recognition
, 2009
Cited by 48 (1 self)
Techniques from sparse signal representation are beginning to see significant impact in computer vision, often on nontraditional applications where the goal is not just to obtain a compact high-fidelity representation of the observed signal, but also to extract semantic information. The choice of dictionary plays a key role in bridging this gap: unconventional dictionaries consisting of, or learned from, the training samples themselves provide the key to obtaining state-of-the-art results and to attaching semantic meaning to sparse signal representations. Understanding the good performance of such unconventional dictionaries in turn demands new algorithmic and analytical techniques. This review paper highlights a few representative examples of how the interaction between sparse signal representation and computer vision can enrich both fields, and raises a number of open questions for further study.
Compressed sensing and Bayesian experimental design
 In ICML 25
, 2008
Cited by 31 (11 self)
We relate compressed sensing (CS) with Bayesian experimental design and provide a novel efficient approximate method for the latter, based on expectation propagation. In a large comparative study about linearly measuring natural images, we show that the simple standard heuristic of measuring wavelet coefficients top-down systematically outperforms CS methods using random measurements; the sequential projection optimisation approach of (Ji & Carin, 2007) performs even worse. We also show that our own approximate Bayesian method is able to learn measurement filters on full images efficiently which outperform the wavelet heuristic. To our knowledge, ours is the first successful attempt at “learning compressed sensing” for images of realistic size. In contrast to common CS methods, our framework is not restricted to sparse signals, but can readily be applied to other notions of signal complexity or noise models. We give concrete ideas how our method can be scaled up to large signal representations.
Task-Driven Dictionary Learning
Cited by 26 (1 self)
Modeling data with linear combinations of a few elements from a learned dictionary has been the focus of much recent research in machine learning, neuroscience, and signal processing. For signals such as natural images that admit such sparse representations, it is now well established that these models are well suited to restoration tasks. In this context, learning the dictionary amounts to solving a large-scale matrix factorization problem, which can be done efficiently with classical optimization tools. The same approach has also been used for learning features from data for other purposes, e.g., image classification, but tuning the dictionary in a supervised way for these tasks has proven to be more difficult. In this paper, we present a general formulation for supervised dictionary learning adapted to a wide variety of tasks, and present an efficient algorithm for solving the corresponding optimization problem. Experiments on handwritten digit classification, digital art identification, nonlinear inverse image problems, and compressed sensing demonstrate that our approach is effective in large-scale settings, and is well suited to supervised and semi-supervised classification, as well as regression tasks for data that admit sparse representations. Index Terms: basis pursuit, Lasso, dictionary learning, matrix factorization, semi-supervised learning, compressed sensing.
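To make the "linear combinations of a few elements from a dictionary" idea concrete, the sketch below computes a Lasso sparse code for one signal with a fixed dictionary, using plain iterative soft-thresholding (ISTA). This is a generic textbook solver chosen for illustration, not the algorithm of the paper above; the dictionary `D`, the signal `x`, the penalty `lam`, and the function names are all hypothetical.

```python
import math


def soft_threshold(v, t):
    """Shrink v towards zero by t (proximal operator of t * |v|)."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def ista_sparse_code(D, x, lam, n_iter=2000):
    """Approximately minimise 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.

    D is a list of rows (n x k); its columns are the dictionary atoms.
    """
    n, k = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of D^T D,
    # so 1 / lipschitz is a safe (if conservative) step size.
    lipschitz = sum(d * d for row in D for d in row)
    a = [0.0] * k
    for _ in range(n_iter):
        residual = [sum(D[i][j] * a[j] for j in range(k)) - x[i] for i in range(n)]
        grad = [sum(D[i][j] * residual[i] for i in range(n)) for j in range(k)]
        a = [soft_threshold(a[j] - grad[j] / lipschitz, lam / lipschitz)
             for j in range(k)]
    return a


# Hypothetical toy data: three atoms in R^2; x is a multiple of the third atom.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
x = [1.2, 1.6]
alpha = ista_sparse_code(D, x, lam=0.1)
print(alpha)  # the code should concentrate on the third atom
```

Although `x` could also be written with the first two atoms, the L1 penalty favours the representation that uses a single atom, which is the behaviour sparse coding relies on.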
Sequential optimal design of neurophysiology experiments
, 2008
Cited by 19 (6 self)
Adaptively optimizing experiments has the potential to significantly reduce the number of trials needed to build parametric statistical models of neural systems. However, application of adaptive methods to neurophysiology has been limited by severe computational challenges. Since most neurons are high-dimensional systems, optimizing neurophysiology experiments requires computing high-dimensional integrations and optimizations in real time. Here we present a fast algorithm for choosing the most informative stimulus by maximizing the mutual information between the data and the unknown parameters of a generalized linear model (GLM) which we want to fit to the neuron’s activity. We rely on important log-concavity and asymptotic normality properties of the posterior to facilitate the required computations. Our algorithm requires only low-rank matrix manipulations and a 2-dimensional search to choose the optimal stimulus. The average running time of these operations scales quadratically with the dimensionality of the GLM, making real-time adaptive experimental design feasible even for high-dimensional stimulus and parameter spaces. For example, we
Expectation propagation for exponential families
, 2005
Cited by 18 (4 self)
This is a tutorial describing the Expectation Propagation (EP) algorithm for a general exponential family. Our focus is on simplicity of exposition. Although the overhead of translating a specific model into its exponential family representation can be considerable, many apparent complications of EP can simply be sidestepped by working in this canonical representation. Note: This material is extracted from the Appendix of my PhD thesis (see www.kyb.tuebingen.mpg.de/bs/people/seeger/papers/thesis.html).
1 Exponential Families
Definition 1 (Exponential Family) A set $\mathcal{F}$ of distributions with densities $P(x \mid \theta) = \exp\left(\theta^T \phi(x) - \Phi(\theta)\right)$, $\theta \in \Theta$, $\Phi(\theta) = \log \int \exp\left(\theta^T \phi(x)\right) d\mu(x)$, w.r.t. a base measure $\mu$ is called an exponential family. Here, $\theta$ are called natural parameters, $\Theta$ the natural parameter space, $\phi(x)$ the sufficient statistics, and $\Phi(\theta)$ is the log partition function. Furthermore, $\eta = E_\theta[\phi(x)]$ are called moment parameters, where $E_\theta[\cdot]$
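As a small worked instance of Definition 1, the sketch below treats the Bernoulli distribution as an exponential family. The choices here are assumptions made for illustration and are not taken from the tutorial: the sufficient statistic is φ(x) = x, the base measure is the counting measure on {0, 1}, and the function names are hypothetical. It checks numerically that the densities sum to one and that the moment parameter η equals the derivative of the log partition function Φ, a standard property of exponential families.

```python
import math

SUPPORT = (0, 1)  # counting base measure on {0, 1}


def log_partition(theta):
    """Phi(theta) = log sum_x exp(theta * phi(x)), with phi(x) = x."""
    return math.log(sum(math.exp(theta * x) for x in SUPPORT))


def density(x, theta):
    """P(x | theta) = exp(theta * phi(x) - Phi(theta))."""
    return math.exp(theta * x - log_partition(theta))


def moment_parameter(theta):
    """eta = E_theta[phi(x)], computed by summing over the support."""
    return sum(x * density(x, theta) for x in SUPPORT)


theta = 0.7  # arbitrary natural parameter
total = sum(density(x, theta) for x in SUPPORT)
eta = moment_parameter(theta)

# Central finite difference of Phi, which should agree with eta.
h = 1e-6
dphi = (log_partition(theta + h) - log_partition(theta - h)) / (2 * h)

print(total, eta, dphi)
```

For this family η is the success probability, so it coincides with the logistic function of θ.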
Sparse Regression Learning by Aggregation and Langevin Monte-Carlo
, 2009
Cited by 14 (3 self)
We consider the problem of regression learning for deterministic design and independent random errors. We start by proving a sharp PAC-Bayesian type bound for the exponentially weighted aggregate (EWA) under the expected squared empirical loss. For a broad class of noise distributions the presented bound is valid whenever the temperature parameter β of the EWA is larger than or equal to $4\sigma^2$, where $\sigma^2$ is the noise variance. A remarkable feature of this result is that it is valid even for unbounded regression functions and the choice of the temperature parameter depends exclusively on the noise level. Next, we apply this general bound to the problem of aggregating the elements of a finite-dimensional linear space spanned by a dictionary of functions $\phi_1, \ldots, \phi_M$. We allow M to be much larger than the sample size n but we assume that the true regression function can be well approximated by a sparse linear combination of functions $\phi_j$. Under this sparsity scenario, we propose an EWA with a heavy-tailed prior and we show that it satisfies a sparsity oracle inequality with leading constant one. Finally, we propose several Langevin Monte-Carlo algorithms to approximately compute such an EWA when the number M of aggregated functions can be large. We discuss in some detail the convergence of these algorithms and present numerical experiments that confirm our theoretical findings.
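The exponentially weighted aggregate can be illustrated in a much simpler setting than the one the abstract treats. The sketch below assumes a finite set of candidate predictors with a uniform prior (rather than a heavy-tailed prior over a continuous parameter, and with no Langevin sampling), and assumes each candidate's weight is proportional to its prior mass times exp(-RSS/β), where RSS is its residual sum of squares. The temperature is set to β = 4σ², the smallest value the abstract allows. All data and names are hypothetical.

```python
import math


def ewa_weights(y, candidate_preds, beta, prior=None):
    """Weights proportional to prior * exp(-RSS / beta), normalised to sum to 1."""
    m = len(candidate_preds)
    if prior is None:
        prior = [1.0 / m] * m  # uniform prior over the candidates
    rss = [sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
           for preds in candidate_preds]
    log_w = [math.log(p) - r / beta for p, r in zip(prior, rss)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def ewa_predict(weights, candidate_preds):
    """Aggregate: weighted average of the candidates' predictions."""
    n = len(candidate_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, candidate_preds))
            for i in range(n)]


# Hypothetical toy data: three candidate predictors evaluated at three points.
y = [1.0, 2.0, 3.0]
candidates = [
    [1.0, 2.0, 3.0],  # fits the observations exactly
    [0.0, 0.0, 0.0],  # poor fit
    [3.0, 2.0, 1.0],  # reversed trend
]
sigma2 = 0.25          # assumed noise variance
beta = 4.0 * sigma2    # temperature at the boundary beta = 4 * sigma^2
weights = ewa_weights(y, candidates, beta)
aggregate = ewa_predict(weights, candidates)
print(weights, aggregate)
```

Larger values of β flatten the weights towards the prior, while smaller values concentrate them on the best-fitting candidate.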
Bayesian Experimental Design of Magnetic Resonance Imaging Sequences
Cited by 12 (9 self)
We show how improved sequences for magnetic resonance imaging can be found through optimization of Bayesian design scores. Combining approximate Bayesian inference and natural image statistics with high-performance numerical computation, we propose the first Bayesian experimental design framework for this problem of high relevance to clinical and brain research. Our solution requires large-scale approximate inference for dense, non-Gaussian models. We propose a novel scalable variational inference algorithm, and show how powerful methods of numerical mathematics can be modified to compute primitives in our framework. Our approach is evaluated on raw data from a 3T MR scanner.
Efficient spectral feature selection with minimum redundancy
 In Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI), 2010. Ji Zhu, Saharon
Cited by 11 (4 self)
Spectral feature selection identifies relevant features by measuring their capability of preserving sample similarity. It provides a powerful framework for both supervised and unsupervised feature selection, and has been proven to be effective in many real-world applications. One common drawback associated with most existing spectral feature selection algorithms is that they evaluate features individually and cannot identify redundant features. Since redundant features can have significant adverse effect on learning performance, it is necessary to address this limitation for spectral feature selection. To this end, we propose a novel spectral feature selection algorithm to handle feature redundancy, adopting an embedded model. The algorithm is derived from a formulation based on a sparse multi-output regression with an $L_{2,1}$-norm constraint. We conduct theoretical analysis on the properties of its optimal solutions, paving the way for designing an efficient path-following solver. Extensive experiments show that the proposed algorithm can do well in both selecting relevant features and removing redundancy.
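The $L_{2,1}$ norm mentioned above is the sum of the Euclidean norms of a matrix's rows. The sketch below assumes the common convention that each row of a weight matrix `W` holds one feature's coefficients across all outputs, so a row that is entirely zero corresponds to a discarded feature. The matrix and the function names are hypothetical, and this only evaluates the norm; it does not implement the paper's solver.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]


def l21_norm(W):
    """L2,1 norm: the sum of the row-wise Euclidean norms."""
    return sum(row_norms(W))


def selected_features(W, tol=1e-12):
    """Indices of features whose coefficient row is not (numerically) zero."""
    return [i for i, r in enumerate(row_norms(W)) if r > tol]


# Hypothetical weight matrix: 3 features, 2 regression outputs.
W = [
    [3.0, 4.0],  # feature 0 is used by both outputs
    [0.0, 0.0],  # feature 1 is dropped entirely
    [0.0, 1.0],  # feature 2 is used by one output
]
norm_value = l21_norm(W)
kept = selected_features(W)
print(norm_value, kept)
```

Because the penalty acts on whole rows, constraining it tends to zero out a feature for all outputs at once, which is why it is used for joint feature selection.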
Large Scale Variational Inference and Experimental Design for Sparse Generalized Linear Models
, 2008
Cited by 9 (6 self)
Sparsity is a fundamental concept of modern statistics, and often the only general principle available at the moment to address novel learning applications with many more variables than observations. While much progress has been made recently in the theoretical understanding and algorithmics of sparse point estimation, higher-order problems such as covariance estimation or optimal data acquisition are seldom addressed for sparsity-favouring models, and there are virtually no algorithms for large-scale applications of these. We provide novel approximate Bayesian inference algorithms for sparse generalized linear models that can be used with hundreds of thousands of variables, and run orders of magnitude faster than previous algorithms in domains where either applies. By analyzing our methods and establishing some novel convexity results, we settle a long-standing open question about variational Bayesian inference for continuous variable models: the Gaussian lower bound relaxation, which has been used previously for a range of models, is proved to be a convex optimization problem if and only if the posterior mode is found by convex programming. Our algorithms reduce to the same computational primitives as commonly used sparse estimation methods, but require Gaussian marginal variance estimation as well. We show how the Lanczos algorithm from numerical mathematics can be employed to compute the latter. We are interested in Bayesian experimental design here (which is mainly driven by efficient approximate inference), a powerful framework for optimizing measurement architectures of complex signals, such as natural images. Designs