New results for learning noisy parities and halfspaces
- In Proceedings of the 47th Annual Symposium on Foundations of Computer Science (FOCS), 2006
Cited by 35 (9 self)
We address well-studied problems concerning the learnability of parities and halfspaces in the presence of classification noise. Learning of parities under the uniform distribution with random classification noise, also called the noisy parity problem, is a famous open problem in computational learning. We reduce a number of basic problems regarding learning under the uniform distribution to learning of noisy parities. We show that under the uniform distribution, learning parities with adversarial classification noise reduces to learning parities with random classification noise. Together with the parity learning algorithm of Blum et al. [5], this gives the first nontrivial algorithm for learning parities with adversarial noise. We show that learning of DNF expressions reduces to learning noisy parities of just a logarithmic number of variables. We show that learning of k-juntas reduces to learning noisy parities of k variables. These reductions work even in the presence of random classification noise in the original DNF or junta. We then consider the problem of learning halfspaces over Q^n with adversarial noise or finding a halfspace that maximizes the agreement rate with a given set of examples. We prove an essentially optimal hardness factor of 2 − ε, improving the factor of 85/84 − ε due to Bshouty and Burroughs [8]. Finally, we show that majorities of halfspaces are hard to PAC-learn using any representation, based on the cryptographic assumption underlying the Ajtai-Dwork cryptosystem.
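To make the noisy parity setting in the abstract above concrete, here is a minimal sketch of how one labeled example could be generated: a uniformly random bit vector, labeled by the parity of a hidden subset of its bits, with the label flipped with some fixed probability (random classification noise). The function name `noisy_parity_sample` and its parameters are hypothetical and are not taken from the paper; the sketch only illustrates the data model, not any learning algorithm.

```python
import random

def noisy_parity_sample(secret, noise_rate, rng):
    """Draw one example (x, label) from a noisy parity.

    secret     -- list of 0/1 flags marking which variables are in the parity
                  (hypothetical representation, chosen for illustration)
    noise_rate -- probability that the label is flipped
    rng        -- a random.Random instance
    """
    n = len(secret)
    # x is drawn from the uniform distribution over {0, 1}^n.
    x = [rng.randrange(2) for _ in range(n)]
    # The clean label is the XOR of the selected bits.
    label = sum(xi & si for xi, si in zip(x, secret)) % 2
    # Random classification noise: flip the label with probability noise_rate.
    if rng.random() < noise_rate:
        label ^= 1
    return x, label

if __name__ == "__main__":
    rng = random.Random(0)
    secret = [1, 0, 1, 0]  # parity of variables 0 and 2
    for _ in range(5):
        print(noisy_parity_sample(secret, 0.1, rng))
```

With `noise_rate` set to 0 the labels are an exact parity; as it approaches 1/2 the labels carry less and less information about the hidden subset.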
Interactive policy learning through confidence-based autonomy
- J. Artificial Intelligence Research, 2009
Cited by 35 (10 self)
We present Confidence-Based Autonomy (CBA), an interactive algorithm for policy learning from demonstration. The CBA algorithm consists of two components which take advantage of the complementary abilities of humans and computer agents. The first component, Confident Execution, enables the agent to identify states in which demonstration is required, to request a demonstration from the human teacher and to learn a policy based on the acquired data. The algorithm selects demonstrations based on a measure of action selection confidence, and our results show that using Confident Execution the agent requires fewer demonstrations to learn the policy than when demonstrations are selected by a human teacher. The second algorithmic component, Corrective Demonstration, enables the teacher to correct any mistakes made by the agent through additional demonstrations in order to improve the policy and future task performance. CBA and its individual components are compared and evaluated in a complex simulated driving domain. The complete CBA algorithm results in the best overall learning performance, successfully reproducing the behavior of the teacher while balancing the tradeoff between number of demonstrations and number of incorrect actions during learning.
Feature Selection as a Preprocessing Step for Hierarchical Clustering
- 1999
Cited by 34 (5 self)
Although feature selection is a central problem in inductive learning, as suggested by the growing amount of research in this area, most of the work has been carried out under the supervised learning paradigm, paying little attention to unsupervised learning tasks and, particularly, clustering tasks. In this paper, we analyze the particular benefits that feature selection may provide in hierarchical clustering tasks and explore the power of feature selection methods applied as a preprocessing step under the proposed dimensions. Instead of only predicting class labels, the focus is on more general inference tasks over all the features. Empirical results suggest that feature selection as preprocessing only provides limited improvements in the performance task. In addition, they raise the problem of the notion of irrelevance in unsupervised settings. 1 INTRODUCTION Inductive learning systems are a powerful approach for automatically extracting useful information from data or for assisti...
Instrument recognition in polyphonic music based on automatic taxonomies
- IEEE Transactions on Speech and Audio Processing, 2006
Cited by 32 (3 self)
We propose a new approach to instrument recognition in the context of real music orchestrations ranging from solos to quartets. The strength of our approach is that it does not require prior musical source separation. Thanks to a hierarchical clustering algorithm exploiting robust probabilistic distances, we obtain a taxonomy of musical ensembles which is used to efficiently classify possible combinations of instruments played simultaneously. Moreover, a wide set of acoustic features is studied including some new proposals. In particular, Signal to Mask Ratios are found to be useful features for audio classification. This study focuses on a single music genre (i.e. jazz) but combines a variety of instruments among which are percussion and singing voice. Using a varied database of sound excerpts from commercial recordings, we show that the segmentation of music with respect to the instruments played can be achieved with an average accuracy of 53%.
Feature Reduction and Hierarchy of Classifiers for Fast Object Detection in Video Images
- 2001
Cited by 28 (4 self)
We present a two-step method to speed up object detection systems in computer vision that use Support Vector Machines (SVMs) as classifiers. In a first step we perform feature reduction by choosing relevant image features according to a measure derived from statistical learning theory. In a second step we build a hierarchy of classifiers. On the bottom level, a simple and fast classifier analyzes the whole image and rejects large parts of the background. On the top level, a slower but more accurate classifier performs the final detection. Experiments with a face detection system show that combining feature reduction with hierarchical classification leads to a speed-up by a factor of 170 with similar classification performance.
Hierarchical classification and feature reduction for fast face detection with support vector machines
- Pattern Recognition, 2003
Cited by 28 (3 self)
We present a two-step method to speed up object detection systems in computer vision that use Support Vector Machines (SVMs) as classifiers. In the first step we build a hierarchy of classifiers. On the bottom level, a simple and fast linear classifier analyzes the whole image and rejects large parts of the background. On the top level, a slower but more accurate classifier performs the final detection. We propose a new method for automatically building and training a hierarchy of classifiers. In the second step we apply feature reduction to the top-level classifier by choosing relevant image features according to a measure derived from statistical learning theory. Experiments with a face detection system show that combining feature reduction with hierarchical classification leads to a speed-up by a factor of 335 with similar classification performance.
Feature Weighting in k-Means Clustering
- Machine Learning, 2002
Cited by 26 (0 self)
Data sets with multiple, heterogeneous feature spaces occur frequently. We present an abstract framework for integrating multiple feature spaces in the k-means clustering algorithm. Our main ideas are (i) to represent each data object as a tuple of multiple feature vectors, (ii) to assign a suitable (and possibly different) distortion measure to each feature space, (iii) to combine distortions on different feature spaces, in a convex fashion, by assigning (possibly) different relative weights to each, (iv) for a fixed weighting, to cluster using the proposed convex k-means algorithm, and (v) to determine the optimal feature weighting to be the one that yields the clustering that simultaneously minimizes the average within-cluster dispersion and maximizes the average between-cluster dispersion along all the feature spaces. Using precision/recall evaluations and known ground truth classifications, we empirically demonstrate the effectiveness of feature weighting in clustering on several different application domains.
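Ideas (i) through (iii) of the abstract above can be illustrated with a short sketch: each object is a tuple of feature vectors, each feature space has its own distortion measure, and the per-space distortions are combined with non-negative weights that sum to one. This is only an illustration under stated assumptions, not the authors' convex k-means implementation: the function names are hypothetical, and squared Euclidean distance is assumed for every feature space purely to keep the example small.

```python
def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def combined_distortion(obj, centroid, distortions, weights):
    """Convex combination of per-feature-space distortions.

    obj, centroid -- tuples of feature vectors, one vector per feature space
    distortions   -- one distortion function per feature space
    weights       -- non-negative weights summing to 1
    """
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(
        w * d(x, c)
        for w, d, x, c in zip(weights, distortions, obj, centroid)
    )

def assign(obj, centroids, distortions, weights):
    """Index of the centroid with the smallest combined distortion."""
    return min(
        range(len(centroids)),
        key=lambda j: combined_distortion(obj, centroids[j], distortions, weights),
    )

if __name__ == "__main__":
    dists = (squared_euclidean, squared_euclidean)
    obj = ([0.0, 0.0], [1.0])
    centroids = [([0.0, 0.0], [5.0]), ([3.0, 0.0], [1.0])]
    # Changing the relative weights changes which centroid is closest.
    print(assign(obj, centroids, dists, (0.5, 0.5)))
    print(assign(obj, centroids, dists, (0.9, 0.1)))
```

For a fixed weighting, an assignment step like this would be alternated with a centroid update as in ordinary k-means; searching over weightings, as in idea (v), is left out of the sketch.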
Support vector machines for segmental minimum bayes risk decoding of continuous speech
- In IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2003
Cited by 25 (4 self)
Segmental Minimum Bayes Risk (SMBR) Decoding involves the refinement of the search space into sequences of small sets of confusable words. We describe the application of Support Vector Machines (SVMs) as discriminative models for the refined search spaces. We show that SVMs, which in their basic formulation are binary classifiers of fixed dimensional observations, can be used for continuous speech recognition. We also study the use of GiniSVMs, a variant of the basic SVM. On a small vocabulary task, we show this two-pass scheme outperforms MMI-trained HMMs. Using system combination we also obtain further improvements over discriminatively trained HMMs.
Redundancy based feature selection for microarray data
- In Proc. of SIGKDD, 2004
Cited by 25 (1 self)
In gene expression microarray data analysis, selecting a small number of discriminative genes from thousands of genes is an important problem for accurate classification of diseases or phenotypes. The problem becomes particularly challenging due to the large number of features (genes) and small sample size. Traditional gene selection methods often select the top-ranked genes according to their individual discriminative power without handling the high degree of redundancy among the genes. Recent research shows that removing redundant genes among selected ones can achieve a better representation of the characteristics of the targeted phenotypes and lead to improved classification accuracy. Hence, we study in this paper the relationship between feature relevance and redundancy and propose an efficient method that can effectively remove redundant genes. The efficiency and effectiveness of our method in comparison with representative methods have been demonstrated through an empirical study using public microarray data sets.
Result analysis of the NIPS 2003 feature selection challenge
- Advances in Neural Information Processing Systems 17, 2004
Cited by 24 (7 self)
The NIPS 2003 workshops included a feature selection competition organized by the authors. We provided participants with five datasets from different application domains and called for classification results using a minimal number of features. The competition took place over a period of 13 weeks and attracted 78 research groups. Participants were asked to make on-line submissions on the validation and test sets, with performance on the validation set being presented immediately to the participant and performance on the test set presented to the participants at the workshop. In total 1863 entries were made on the validation sets during the development period and 135 entries on all test sets for the final competition. The winners used a combination of Bayesian neural networks with ARD priors and Dirichlet diffusion trees. Other top entries used a variety of methods for feature selection, which combined filters and/or wrapper or embedded methods using Random Forests, kernel methods, or neural networks as a classification engine. The results of the benchmark (including the predictions made by the participants and the features they selected) and the scoring software are publicly available. The benchmark is available at www.nipsfsc.ecs.soton.ac.uk for post-challenge submissions to stimulate further research.

