Results 1 - 10 of 179,830
Sparse Bayesian Learning and the Relevance Vector Machine
, 2001
"... This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vec ..."
Cited by 958 (5 self)
vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art `support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer
Ensemble Methods in Machine Learning
 MULTIPLE CLASSIFIER SYSTEMS, LBCS-1857
, 2000
"... Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boostin ..."
Cited by 607 (3 self)
Bayesian Ying-Yang machine, clustering and number of clusters
 Pattern Recognition Letters
, 1997
"... It is shown that a particular case of the Bayesian Ying-Yang learning system and theory reduces to the maximum likelihood learning of a finite mixture, from which we have obtained not only the EM algorithm for its parameter estimation and its various approximate but fast algorithms for clustering ..."
Cited by 29 (11 self)
to be more robust in learning. Finally, experimental results are provided. © 1997 Elsevier Science B.V. Keywords: Bayesian Ying-Yang machine; Number of clusters; Finite mixture; Cluster analysis 1.
Machine Learning in Automated Text Categorization
 ACM COMPUTING SURVEYS
, 2002
"... The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this p ..."
Cited by 1658 (22 self)
to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual
Support vector machine active learning for image retrieval
, 2001
"... Relevance feedback is often a critical component when designing image databases. With these databases it is difficult to specify queries directly and explicitly. Relevance feedback interactively determines a user’s desired output or query concept by asking the user whether certain proposed images ..."
Cited by 448 (29 self)
are relevant or not. For a relevance feedback algorithm to be effective, it must grasp a user’s query concept accurately and quickly, while also only asking the user to label a small number of images. We propose the use of a support vector machine active learning algorithm for conducting effective relevance
Gaussian processes for machine learning
 in: Adaptive Computation and Machine Learning
, 2006
"... Abstract. We give a basic introduction to Gaussian Process regression models. We focus on understanding the role of the stochastic process and how it is used to define a distribution over functions. We present the simple equations for incorporating training data and examine how to learn the hyperpar ..."
Cited by 631 (2 self)
of statistics and machine learning, either for analysis of data sets, or as a subgoal of a more complex problem. Traditionally parametric models have been used for this purpose. These have a possible advantage in ease of interpretability, but for complex data sets, simple parametric models may lack expressive
A learning algorithm for Boltzmann machines
 Cognitive Science
, 1985
"... The computational power of massively parallel networks of simple processing elements resides in the communication bandwidth provided by the hardware connections between elements. These connections can allow a significant fraction of the knowledge of the system to be applied to an instance of a probl ..."
Cited by 586 (13 self)
. Second, there must be some way of choosing internal representations which allow the preexisting hardware connections to be used efficiently for encoding the constraints in the domain being searched. We describe a general parallel search method, based on statistical mechanics, and we show how it leads
Bayesian Network Classifiers
, 1997
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Cited by 788 (23 self)
restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
"... When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality ..."
Cited by 489 (2 self)
distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models. In Proceedings of the Eleventh Conference on Uncertainty
Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy
 IEEE Trans. Pattern Analysis and Machine Intelligence
"... Abstract—Feature selection is an important problem for pattern classification systems. We study how to select good features according to the maximal statistical dependency criterion based on mutual information. Because of the difficulty in directly implementing the maximal dependency condition, we f ..."
Cited by 533 (7 self)
to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits