Results 1 - 10
of
16
Discriminative random fields: A discriminative framework for contextual interaction in classification
- In ICCV
, 2003
"... ..."
Sparse multinomial logistic regression: Fast algorithms and generalization bounds
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2005
"... Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exac ..."
Abstract
-
Cited by 67 (1 self)
- Add to MetaCart
Recently developed methods for learning sparse classifiers are among the state-of-the-art in supervised learning. These methods learn classifiers that incorporate weighted sums of basis functions with sparsity-promoting priors encouraging the weight estimates to be either significantly large or exactly zero. From a learning-theoretic perspective, these methods control the capacity of the learned classifier by minimizing the number of basis functions used, resulting in better generalization. This paper presents three contributions related to learning sparse classifiers. First, we introduce a true multiclass formulation based on multinomial logistic regression. Second, by combining a bound optimization approach with a component-wise update procedure, we derive fast exact algorithms for learning sparse multiclass classifiers that scale favorably in both the number of training samples and the feature dimensionality, making them applicable even to large data sets in high-dimensional feature spaces. To the best of our knowledge, these are the first algorithms to perform exact multinomial logistic regression with a sparsity-promoting prior. Third, we show how nontrivial generalization bounds can be derived for our classifier in the binary case. Experimental results on standard benchmark data sets attest to the accuracy, sparsity, and efficiency of the proposed methods.
Adaptive Sparseness for Supervised Learning
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
"... The goal of supervised learning is to infer a functional mapping based on a set of training examples. To achieve good generalization, it is necessary to control the "complexity" of the learned function. In Bayesian approaches, this is done by adopting a prior for the parameters of the function bei ..."
Abstract
-
Cited by 57 (3 self)
- Add to MetaCart
The goal of supervised learning is to infer a functional mapping based on a set of training examples. To achieve good generalization, it is necessary to control the "complexity" of the learned function. In Bayesian approaches, this is done by adopting a prior for the parameters of the function being learned. We propose a Bayesian approach to supervised learning, which leads to sparse solutions; that is, in which irrelevant parameters are automatically set exactly to zero. Other ways to obtain sparse classifiers (such as Laplacian priors, support vector machines) involve (hyper)parameters which control the degree of sparseness of the resulting classifiers; these parameters have to be somehow adjusted/estimated from the training data. In contrast, our approach does not involve any (hyper)parameters to be adjusted or estimated. This is achieved by a hierarchical-Bayes interpretation of the Laplacian prior, which is then modified by the adoption of a Jeffreys' noninformative hyperprior. Implementation is carried out by an expectationmaximization (EM) algorithm. Experiments with several benchmark data sets show that the proposed approach yields state-of-the-art performance. In particular, our method outperforms SVMs and performs competitively with the best alternative techniques, although it involves no tuning or adjustment of sparseness-controlling hyperparameters.
Man-Made Structure Detection in Natural Images using a Causal Multiscale Random Field
- In Proc. IEEE Int. Conf. on Comp. Vision and Pattern Recog
"... This paper presents a generative model based approach to man-made structure detection in 2D natural images. The proposed approach uses a causal multiscale random field suggested in [3] as a prior model on the class labels on the image sites. However, instead of assuming the conditional independence ..."
Abstract
-
Cited by 44 (3 self)
- Add to MetaCart
This paper presents a generative model based approach to man-made structure detection in 2D natural images. The proposed approach uses a causal multiscale random field suggested in [3] as a prior model on the class labels on the image sites. However, instead of assuming the conditional independence of the observed data, we propose to capture the local dependencies in the data using a multiscale feature vector. The distribution of the multiscale feature vectors is modeled as mixture of Gaussians. A set of robust multiscale features is presented that captures the general statistical properties of man-made structures at multiple scales without relying on explicit edge detection. The proposed approach was validated on real-world images from the Corel data set, and a performance comparison with other techniques is presented.
Algorithms for sparse linear classifiers in the massive data setting, 2006. Manuscript. Available fromwww.stat.rutgers.edu/˜madigan/papers
, 2005
"... Classifiers favoring sparse solutions, such as support vector machines, relevance vector machines, LASSO-regression based classifiers, etc., provide competitive methods for classification problems in high dimensions. However, current algorithms for training sparse classifiers typically scale quite u ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
Classifiers favoring sparse solutions, such as support vector machines, relevance vector machines, LASSO-regression based classifiers, etc., provide competitive methods for classification problems in high dimensions. However, current algorithms for training sparse classifiers typically scale quite unfavorably with respect to the number of training examples. This paper proposes online and multi-pass algorithms for training sparse linear classifiers for high dimensional data. These algorithms have computational complexity and memory requirements that make learning on massive datasets feasible. The central idea that makes this possible is a straightforward quadratic approximation to the likelihood function.
Sparse Representations of Polyphonic Music
- SIGNAL PROCESSING
, 2005
"... We consider two approaches for sparse decomposition of polyphonic music: a timedomain approach based on shift-invariant waveforms, and a frequency-domain approach based on phase-invariant power spectra. When trained on an example of a MIDI-controlled acoustic piano recording, both methods produce di ..."
Abstract
-
Cited by 10 (6 self)
- Add to MetaCart
We consider two approaches for sparse decomposition of polyphonic music: a timedomain approach based on shift-invariant waveforms, and a frequency-domain approach based on phase-invariant power spectra. When trained on an example of a MIDI-controlled acoustic piano recording, both methods produce dictionary vectors or sets of vectors which represent underlying notes, and produce component activations related to the original MIDI score. The time-domain method is more computationally expensive, but produces sample-accurate spike-like activations and can be used for a direct time-domain reconstruction. The spectral domain method discards phase information, but is faster than the time-domain method and retains more higher-frequency harmonics. These results suggest that these two methods would provide a powerful yet complementary approach to automatic music transcription or object-based coding of musical audio.
Joint Classifier and Feature Optimization for Cancer Diagnosis Using Gene Expression Data
, 2003
"... Recent research has demonstrated quite convincingly that accurate cancer diagnosis can be achieved by constructing classifiers that are designed to compare the gene expression profile of a tissue of unknown cancer status to a database of stored expression profiles from tissues of known cancer status ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
Recent research has demonstrated quite convincingly that accurate cancer diagnosis can be achieved by constructing classifiers that are designed to compare the gene expression profile of a tissue of unknown cancer status to a database of stored expression profiles from tissues of known cancer status. This paper introduces the JCFO, a novel algorithm that uses a sparse Bayesian approach to jointly identify both the optimal nonlinear classifier for diagnosis and the optimal set of genes on which to base that diagnosis. We show that the diagnostic classification accuracy of the proposed algorithm is superior to a number of current state-of-the-art methods in a full leave-one-out cross-validation study of two widely used benchmark datasets. In addition to its superior classification accuracy, the algorithm is designed to automatically identify a small subset of genes (typically around twenty in our experiments) that are capable of providing complete discriminatory information for diagnosis. Focusing attention on a small subset of genes is not only useful because it produces a classifier with good generalization capacity, but also because this set of genes may provide insights into the mechanisms responsible for the disease itself. A number of the genes identified by the JCFO in our experiments are already in use as clinical markers for cancer diagnosis; some of the remaining genes may be excellent candidates for further clinical investigation. If it is possible to identify a small set of genes that is indeed capable of providing complete discrimination, inexpensive diagnostic assays might be widely deployable in clinical settings.
Joint Classifier and Feature Optimization for Comprehensive Cancer Diagnosis Using Gene Expression Data
- J. Comput. Biol
, 2004
"... achieved by constructing classifiers that are designed to compare the gene expression profile of a tissue of unknown cancer status to a database of stored expression profiles from tissues of known cancer status. This paper introduces the JCFO, a novel algorithm that uses a sparse Bayesian approach t ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
achieved by constructing classifiers that are designed to compare the gene expression profile of a tissue of unknown cancer status to a database of stored expression profiles from tissues of known cancer status. This paper introduces the JCFO, a novel algorithm that uses a sparse Bayesian approach to jointly identify both the optimal nonlinear classifier for diagnosis and the optimal set of genes on which to base that diagnosis. We show that the diagnostic classification accuracy of the proposed algorithm is superior to a number of current state-of-the-art methods in a full leave-one-out cross-validation study of five widely used benchmark datasets. In addition to its superior classification accuracy, the algorithm is designed to automatically identify a small subset of genes (typically around twenty in our experiments) that are capable of providing complete discriminatory information for diagnosis. Focusing attention on a small subset of genes is not only useful because it produces a classifier with good generalization capacity, but also because this set of genes may provide insights into the mechanisms responsible for the disease itself. A number To whom correspondence should be addressed.
T.: Adaptive feature selection in image segmentation
- In: Pattern Recognition–DAGM’04
, 2004
"... Abstract. Most image segmentation algorithms optimize some mathematical similarity criterion derived from several low-level image features. One possible way of combining different types of features, e.g. color- and texture features on different scales and/or different orientations, is to simply stac ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
Abstract. Most image segmentation algorithms optimize some mathematical similarity criterion derived from several low-level image features. One possible way of combining different types of features, e.g. color- and texture features on different scales and/or different orientations, is to simply stack all the individual measurements into one high-dimensional feature vector. Due to the nature of such stacked vectors, however, only very few components (e.g. those which are defined on a suitable scale) will carry information that is relevant for the actual segmentation task. We present an approach to combining segmentation and adaptive feature selection that overcomes this relevance determination problem. All free model parameters of this method are selected by a resampling-based stability analysis. Experiments demonstrate that the built-in feature selection mechanism leads to stable and meaningful partitions of the images. 1
On Shift-Invariant Sparse Coding
- PRIETO (EDS.), INDEPENDENT COMPONENT ANALYSIS AND BLIND SIGNAL SEPARATION: PROC. FIFTH INTL. CONF., ICA 2004
, 2004
"... The goals of this paper are: 1) the introduction of a shiftinvariant sparse coding model together with learning rules for this model; 2) the comparison of this model to the traditional sparse coding model; and 3) the analysis of some limitations of the newly proposed approach. To evaluate ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
The goals of this paper are: 1) the introduction of a shiftinvariant sparse coding model together with learning rules for this model; 2) the comparison of this model to the traditional sparse coding model; and 3) the analysis of some limitations of the newly proposed approach. To evaluate

