Results 1–10 of 59
CBSA: Content-Based Soft Annotation for Multimodal Image Retrieval Using Bayes Point Machines
 IEEE Transactions on Circuits and Systems for Video Technology
, 2003
Abstract
Cited by 93 (7 self)
We propose a content-based soft annotation (CBSA) procedure for providing images with semantic labels. The annotation procedure starts with labeling a small set of training images, each with a single semantic label (e.g., forest, animal, or sky). An ensemble of binary classifiers is then trained for predicting label membership for images. The trained ensemble is applied to each individual image to give the image multiple soft labels, each associated with a label membership factor. To select a base binary classifier for CBSA, we experiment with two learning methods, Support Vector Machines (SVMs) and Bayes Point Machines (BPMs), and compare their class-prediction accuracy. Our empirical study on a 116-category, 25K-image set shows that the BPM-based ensemble provides better annotation quality than the SVM-based ensemble for supporting multimodal image retrieval. Keywords: Bayes Point Machines, Support Vector Machines, image annotation, multimodal image retrieval.
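The CBSA pipeline described above (one binary classifier per label, soft memberships at annotation time) can be sketched as follows. This is a minimal illustration only: plain logistic regression stands in for the paper's SVM/BPM base learners, two toy labels replace the 116 categories, and the data is synthetic.

```python
import numpy as np

def train_binary(X, y, lr=0.1, steps=500):
    """Fit w for sigmoid(X @ w) ~ y with simple gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def soft_annotate(X_train, hard_labels, X_query, label_names):
    """One binary classifier per label; returns per-label membership factors."""
    scores = {}
    for k, name in enumerate(label_names):
        y = (hard_labels == k).astype(float)
        w = train_binary(X_train, y)
        # soft label membership in [0, 1] for each query image
        scores[name] = 1.0 / (1.0 + np.exp(-X_query @ w))
    return scores

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))            # stand-in image features
labels = (X[:, 0] > 0).astype(int)      # toy hard labels: "sky" vs "forest"
memberships = soft_annotate(X, labels, X[:5], ["sky", "forest"])
```

Each query image ends up with a membership factor per label rather than a single hard class, which is the point of soft annotation.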
Online Choice of Active Learning Algorithms
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2004
Abstract
Cited by 88 (2 self)
This work is concerned with the question of how to combine an ensemble of active learners online so as to expedite the learning progress in pool-based active learning. We develop an active-learning master algorithm, based on a known competitive algorithm for the multi-armed bandit problem. A major challenge in successfully choosing top-performing active learners online is to reliably estimate their progress during the learning session. To this end we propose a simple maximum-entropy criterion that provides effective estimates in realistic settings. We study the performance of the proposed master algorithm using an ensemble containing two of the best-known active-learning algorithms as well as a new algorithm. The resulting ...
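The bandit-based master idea can be sketched along these lines: treat each candidate active learner as an arm and update preferences with an EXP3-style rule. This is a hedged illustration; the reward signal below is a made-up placeholder, whereas the paper derives its reward from a maximum-entropy progress estimate.

```python
import math
import random

def exp3_choose(weights, gamma):
    """Sample an arm from the EXP3 mixture of weights and uniform exploration."""
    total = sum(weights)
    probs = [(1 - gamma) * w / total + gamma / len(weights) for w in weights]
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i, probs
    return len(weights) - 1, probs

def exp3_update(weights, probs, arm, reward, gamma):
    """Importance-weighted exponential update for the chosen arm only."""
    weights[arm] *= math.exp(gamma * (reward / probs[arm]) / len(weights))

random.seed(0)
weights = [1.0, 1.0, 1.0]  # three candidate active learners (arms)
for t in range(50):
    arm, probs = exp3_choose(weights, gamma=0.1)
    # placeholder reward: pretend learner 2 makes the most progress
    reward = 0.9 if arm == 2 else 0.1
    exp3_update(weights, probs, arm, reward, gamma=0.1)
```

Over rounds, the master shifts selection probability toward whichever learner's estimated progress is highest.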
Incorporating Diversity in Active Learning with Support Vector Machines
 In ICML
, 2003
Abstract
Cited by 69 (0 self)
In many real-world applications, active selection of training examples can significantly reduce the number of labelled training examples needed to learn a classification function. Different strategies in the field of support vector machines have been proposed that iteratively select a single new example from a set of unlabelled examples, query the corresponding class label, and then retrain the current classifier. However, to reduce training time, it might be necessary to select batches of new training examples instead of single examples. Strategies for single examples can be extended straightforwardly to select batches by choosing the h > 1 examples with the highest values of the individual selection criterion. We present a new approach that is specifically designed to construct batches and incorporates a diversity measure. It has low computational requirements, making it feasible for large-scale problems with several thousand examples. Experimental results indicate that this approach reaches a given level of generalization accuracy with fewer labelled examples.
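A minimal sketch of the batch-construction idea: greedily pick h unlabelled points that are close to the current decision boundary while being angularly diverse from points already in the batch. The linear scorer, cosine-similarity penalty, and trade-off weight `lam` are illustrative choices, not the paper's exact criterion.

```python
import numpy as np

def select_batch(X_pool, w, h, lam=0.5):
    """Greedy batch selection trading off margin closeness against diversity."""
    margins = np.abs(X_pool @ w)                       # smaller = closer to boundary
    norms = np.linalg.norm(X_pool, axis=1) + 1e-12
    batch = []
    for _ in range(h):
        best, best_score = None, np.inf
        for i in range(len(X_pool)):
            if i in batch:
                continue
            # max cosine similarity to anything already selected (0 if batch empty)
            div = max((abs(X_pool[i] @ X_pool[j]) / (norms[i] * norms[j])
                       for j in batch), default=0.0)
            score = lam * margins[i] + (1 - lam) * div  # lower is better
            if score < best_score:
                best, best_score = i, score
        batch.append(best)
    return batch

rng = np.random.default_rng(1)
X_pool = rng.normal(size=(40, 4))   # stand-in unlabelled pool
w = np.ones(4)                      # stand-in current SVM weight vector
batch = select_batch(X_pool, w, h=5)
```

The diversity term discourages querying h near-duplicate points that would all carry the same label information.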
Confidence-weighted linear classification
 In ICML ’08: Proceedings of the 25th International Conference on Machine Learning
, 2008
Abstract
Cited by 58 (11 self)
We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both the classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks shows that our algorithm improves over other state-of-the-art online and batch methods, learns faster in the online setting, and lends itself to better classifier combination after parallel training.
Automatic prediction of frustration
, 2007
Abstract
Cited by 46 (6 self)
Predicting when a person might be frustrated can provide an intelligent system with important information about when to initiate interaction. For example, an automated Learning Companion or Intelligent Tutoring System might use this information to intervene, providing support to the learner who is likely to otherwise quit, while leaving engaged learners free to discover things without interruption. This paper presents the first automated method that assesses, using multiple channels of affect-related information, whether a learner is about to click on a button saying "I'm frustrated." The new method was tested on data gathered from 24 participants using an automated Learning Companion. Their indication of frustration was automatically predicted from the collected data with 79% accuracy (chance 58%). The new assessment method is based on Gaussian process classification and Bayesian inference. Its performance suggests that nonverbal channels carrying affective cues can help provide important information to a system for formulating a more intelligent response.
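The probabilistic flavour of such a classifier can be illustrated as below. Note this is a lightweight stand-in: it uses class-conditional Gaussians with Bayes' rule rather than the paper's Gaussian process classifier, and the two synthetic features are placeholders for affect-related channels (e.g. posture or mouse pressure), not the study's actual data.

```python
import numpy as np

def fit_gaussian(X):
    """Per-feature mean and variance for one class."""
    return X.mean(axis=0), X.var(axis=0) + 1e-6

def posterior_frustrated(x, params_f, params_n, prior_f=0.5):
    """P(frustrated | x) via Bayes' rule with diagonal Gaussian likelihoods."""
    def loglik(x, mu, var):
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    lf = loglik(x, *params_f) + np.log(prior_f)
    ln = loglik(x, *params_n) + np.log(1 - prior_f)
    m = max(lf, ln)  # log-sum-exp stabilization
    return np.exp(lf - m) / (np.exp(lf - m) + np.exp(ln - m))

rng = np.random.default_rng(4)
X_frust = rng.normal(loc=1.0, size=(50, 2))   # synthetic "frustrated" features
X_calm = rng.normal(loc=-1.0, size=(50, 2))   # synthetic "engaged" features
pf = posterior_frustrated(np.array([1.2, 0.8]),
                          fit_gaussian(X_frust), fit_gaussian(X_calm))
```

The output is a calibrated-looking probability rather than a hard label, which is what lets a tutoring system trade off intervention against interruption.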
Adaptive Regularization of Weight Vectors
 Advances in Neural Information Processing Systems 22
, 2009
Abstract
Cited by 33 (10 self)
We present AROW, a new online learning algorithm that combines several useful properties: large-margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform especially well in the presence of label noise. We derive a mistake bound, similar in form to the second-order perceptron bound, that does not assume separability. We also relate our algorithm to recent confidence-weighted online learning techniques and show empirically that AROW achieves state-of-the-art performance and notable robustness in the case of non-separable data.
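Algorithms in this family maintain a mean `mu` and covariance `Sigma` over weight vectors and adapt both per example. A minimal sketch of an AROW-style squared-hinge update on synthetic separable data follows; the regularization value `r=1.0` and the data are illustrative, and this is my reading of the update rather than the paper's reference implementation.

```python
import numpy as np

def arow_update(mu, Sigma, x, y, r=1.0):
    """One AROW-style step: correct low-margin examples, shrink uncertainty."""
    margin = y * (mu @ x)
    if margin >= 1.0:
        return mu, Sigma            # confident and correct: no change
    Sx = Sigma @ x
    beta = 1.0 / (x @ Sx + r)       # step-size from current uncertainty
    alpha = (1.0 - margin) * beta
    mu = mu + alpha * y * Sx        # move mean toward correcting the mistake
    Sigma = Sigma - beta * np.outer(Sx, Sx)  # reduce variance along x
    return mu, Sigma

rng = np.random.default_rng(2)
d = 5
mu, Sigma = np.zeros(d), np.eye(d)
w_true = rng.normal(size=d)         # hidden linear concept
for _ in range(200):
    x = rng.normal(size=d)
    y = 1.0 if x @ w_true > 0 else -1.0
    mu, Sigma = arow_update(mu, Sigma, x, y)

X_test = rng.normal(size=(200, d))
agree = np.mean(np.sign(X_test @ mu) == np.sign(X_test @ w_true))
```

Because `Sigma` only ever shrinks along observed directions, later updates in well-explored directions become smaller, which is the adaptive-regularization behaviour the abstract describes.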
A PAC-Bayesian Margin Bound for Linear Classifiers
, 2002
Abstract
Cited by 30 (3 self)
We present a bound on the generalisation error of linear classifiers in terms of a refined margin quantity on the training sample. The result is obtained in a PAC-Bayesian framework and is based on geometrical arguments in the space of linear classifiers. The new bound constitutes an exponential improvement over the previously tightest margin bound, which was developed in the luckiness framework, and scales logarithmically in the inverse margin. Even with fewer training examples than input dimensions, sufficiently large margins lead to non-trivial bound values and, for maximum margins, to a vanishing complexity term. In contrast to previous results, however, the new bound does depend on the dimensionality of feature space. The analysis shows that the classical margin is too coarse a measure for the essential quantity that controls the generalisation error: the fraction of hypothesis space consistent with the training sample. The practical relevance of the result lies in the fact that the well-known support vector machine is optimal with respect to the new bound only if the feature vectors in the training sample are all of the same length. As a consequence, we recommend using SVMs on normalised feature vectors only. Numerical simulations support this recommendation and demonstrate that the new error bound can be used for the purpose of model selection.
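The abstract's practical recommendation, running SVMs on normalised feature vectors only, amounts to a one-line preprocessing step: scale every feature vector to unit Euclidean length before training. A minimal sketch with placeholder data:

```python
import numpy as np

def normalise_rows(X, eps=1e-12):
    """Scale each feature vector (row) to unit Euclidean length."""
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)

X = np.array([[3.0, 4.0],
              [0.5, 0.5],
              [10.0, 0.0]])
Xn = normalise_rows(X)   # every row now has (approximately) unit norm
```

After this step all training vectors have the same length, which is exactly the condition under which the abstract says the SVM solution is optimal with respect to the new bound.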
Collaborative prediction using ensembles of maximum margin matrix factorizations
 In ICML
, 2006
Abstract
Cited by 25 (0 self)
Fast gradient-based methods for Maximum Margin Matrix Factorization (MMMF) were recently shown to have great promise (Rennie & Srebro, 2005), including significantly outperforming the previous state-of-the-art methods on some standard collaborative prediction benchmarks (including MovieLens). In this paper, we investigate ways to further improve the performance of MMMF by casting it within an ensemble approach. We explore and evaluate a variety of alternative ways to define such ensembles. We show that the resulting ensembles can perform significantly better than a single MMMF model along multiple evaluation metrics. In fact, we find that ensembles of partially trained MMMF models can sometimes give better predictions in total training time comparable to that of a single MMMF model.
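The ensemble idea can be sketched as follows: train several factorizations from different random seeds and average their predicted matrices. Plain squared-loss matrix factorization by gradient descent stands in here for the paper's maximum-margin variant, and the toy low-rank "ratings" matrix is synthetic.

```python
import numpy as np

def fit_mf(R, mask, k=2, lr=0.05, steps=300, seed=0):
    """Gradient-descent MF on observed entries; returns the full prediction."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.normal(size=(n, k))
    V = 0.1 * rng.normal(size=(m, k))
    for _ in range(steps):
        E = mask * (U @ V.T - R)            # error on observed entries only
        U, V = U - lr * E @ V, V - lr * E.T @ U
    return U @ V.T

rng = np.random.default_rng(3)
U0, V0 = rng.normal(size=(6, 2)), rng.normal(size=(5, 2))
R = U0 @ V0.T                       # toy low-rank "ratings" matrix
mask = rng.random(R.shape) < 0.8    # which entries are observed

# ensemble: average predictions from three differently seeded models
preds = [fit_mf(R * mask, mask, seed=s) for s in range(3)]
ensemble = np.mean(preds, axis=0)
```

Averaging several (possibly partially trained) models smooths out seed-dependent error, which is the effect the abstract reports for MMMF ensembles.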
Algorithmic luckiness
 Journal of Machine Learning Research
, 2002
Abstract
Cited by 25 (4 self)
Classical statistical learning theory studies the generalisation performance of machine learning algorithms rather indirectly. One of the main detours is that algorithms are studied in terms of the hypothesis class from which they draw their hypotheses. In this paper, motivated by the luckiness framework of Shawe-Taylor et al. (1998), we study learning algorithms more directly and in a way that allows us to exploit the serendipity of the training sample. The main difference from previous approaches lies in the complexity measure; rather than covering all hypotheses in a given hypothesis space, it is only necessary to cover the functions which could have been learned using the fixed learning algorithm. We show how the resulting framework relates to the VC, luckiness and compression frameworks. Finally, we present an application of this framework to the maximum margin algorithm for linear classifiers, which results in a bound that exploits the margin, the sparsity of the resultant weight vector, and the degree of clustering of the training data in feature space.
Exact convex confidence-weighted learning
 In Advances in Neural Information Processing Systems 22
, 2008
Abstract
Cited by 25 (4 self)
Confidence-weighted (CW) learning [6], an online learning method for linear classifiers, maintains a Gaussian distribution over weight vectors, with a covariance matrix that represents uncertainty about weights and correlations. Confidence constraints ensure that a weight vector drawn from the hypothesis distribution correctly classifies examples with a specified probability. Within this framework, we derive a new convex form of the constraint and analyze it in the mistake-bound model. Empirical evaluation with both synthetic and text data shows that our version of CW learning achieves lower cumulative and out-of-sample errors than commonly used first-order and second-order online methods.