Results 1–10 of 14,147
The Learnability of Naive Bayes
In: Proceedings of the Canadian Artificial Intelligence Conference, 2005
"... Naive Bayes is an efficient and effective learning algorithm, but previous results show that its representation ability is severely limited since it can only represent certain linearly separable functions in the binary domain. We give necessary and sufficient conditions on linearly separable functio ..."
Abstract

Cited by 156 (0 self)
 Add to MetaCart
functions in the binary domain to be learnable by Naive Bayes under uniform representation. We then show that the learnability (and error rates) of Naive Bayes can be affected dramatically by sampling distributions. Our results help us to gain a much deeper understanding of this seemingly simple, yet
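The linear-separability claim in this abstract can be made concrete in a few lines. A minimal sketch (my illustration, not the paper's construction): converting a naive Bayes model's parameters into explicit hyperplane weights, since the log-odds log P(y=1|x) / P(y=0|x) is linear in x when each feature x_i is in {0, 1}.

import numpy as np

def naive_bayes_as_linear(p1, q1, prior1):
    """p1[i] = P(x_i=1 | y=1), q1[i] = P(x_i=1 | y=0), prior1 = P(y=1).
    Returns (w, b) with sign(w @ x + b) equal to the naive Bayes decision."""
    p1, q1 = np.asarray(p1), np.asarray(q1)
    # log P(x_i|y=1)/P(x_i|y=0) = x_i*log(p1/q1) + (1-x_i)*log((1-p1)/(1-q1))
    w = np.log(p1 / q1) - np.log((1 - p1) / (1 - q1))
    b = np.log(prior1 / (1 - prior1)) + np.log((1 - p1) / (1 - q1)).sum()
    return w, b

# Example: two binary features; the induced separator is w @ x + b = 0.
w, b = naive_bayes_as_linear([0.9, 0.8], [0.2, 0.3], prior1=0.5)
x = np.array([1, 0])
print("class 1" if w @ x + b > 0 else "class 0")

Because every naive Bayes model reduces to such a (w, b), the only binary functions it can represent are linearly separable ones, which is the starting point of the paper's analysis.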
Bayesian Network Classifiers
1997
"... Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with stateoftheart classifiers such as C4.5. This fact raises the question of whether a classifier with less restr ..."
Abstract

Cited by 788 (23 self)
 Add to MetaCart
represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally
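The TAN construction named in the excerpt is essentially a Chow-Liu tree over the features, built from conditional mutual information given the class. A rough sketch under those assumptions (small discrete features as integer arrays; the helper names are mine, not the paper's code):

import numpy as np
from itertools import combinations

def cond_mutual_info(xi, xj, c):
    """Empirical I(Xi; Xj | C) in nats from three aligned integer arrays."""
    mi = 0.0
    for cv in np.unique(c):
        mask = c == cv
        pc = mask.mean()
        for iv in np.unique(xi):
            for jv in np.unique(xj):
                pij = ((xi == iv) & (xj == jv) & mask).mean()  # P(xi, xj, c)
                pi = ((xi == iv) & mask).mean()                # P(xi, c)
                pj = ((xj == jv) & mask).mean()                # P(xj, c)
                if pij > 0:
                    mi += pij * np.log(pc * pij / (pi * pj))
    return mi

def tan_tree(X, y):
    """Maximum spanning tree over features (Prim's algorithm) under CMI."""
    d = X.shape[1]
    w = np.zeros((d, d))
    for i, j in combinations(range(d), 2):
        w[i, j] = w[j, i] = cond_mutual_info(X[:, i], X[:, j], y)
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        i, j = max(((i, j) for i in in_tree
                    for j in range(d) if j not in in_tree),
                   key=lambda e: w[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges  # each non-root feature gets one feature parent

The tree comes from a spanning-tree computation rather than heuristic structure search, which is the computational simplicity the excerpt refers to: each feature gets the class plus at most one feature parent, so parameter estimation stays closed-form.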
Text Classification from Labeled and Unlabeled Documents using EM
Machine Learning, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Abstract

Cited by 1033 (19 self)
 Add to MetaCart
quantities of unlabeled documents are readily available. We introduce an algorithm for learning from labeled and unlabeled documents based on the combination of ExpectationMaximization (EM) and a naive Bayes classifier. The algorithm first trains a classifier using the available labeled documents
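The loop described in this abstract (train on the labeled documents, then iterate EM over the unlabeled pool) can be sketched for the multinomial event model. This is a minimal soft-EM sketch under my own assumptions (bag-of-words count matrices, add-one smoothing, invented function names), not the paper's code:

import numpy as np

def m_step(Xl, Rl, Xu, Ru, alpha=1.0):
    """Re-estimate class priors and word probabilities from soft labels.
    X*: (docs, vocab) count matrices; R*: (docs, classes) responsibilities."""
    R = np.vstack([Rl, Ru])
    X = np.vstack([Xl, Xu])
    priors = (R.sum(0) + alpha) / (R.sum() + alpha * R.shape[1])
    counts = R.T @ X + alpha                    # soft word counts per class
    word_probs = counts / counts.sum(1, keepdims=True)
    return np.log(priors), np.log(word_probs)

def e_step(X, log_priors, log_word_probs):
    """Responsibilities P(class | doc) under multinomial naive Bayes."""
    logp = X @ log_word_probs.T + log_priors
    logp -= logp.max(1, keepdims=True)          # stabilize before exp
    p = np.exp(logp)
    return p / p.sum(1, keepdims=True)

def em_nb(Xl, yl, Xu, n_classes, iters=10):
    Rl = np.eye(n_classes)[yl]                  # labeled docs: fixed one-hot
    Ru = np.full((len(Xu), n_classes), 1.0 / n_classes)
    for _ in range(iters):
        lp, lw = m_step(Xl, Rl, Xu, Ru)
        Ru = e_step(Xu, lp, lw)                 # only unlabeled re-labeled
    return lp, lw

Keeping the labeled documents' responsibilities fixed at their one-hot labels anchors EM, so the unlabeled pool refines the supervised signal rather than overriding it.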
Learnability of Augmented Naive Bayes in Nominal Domains
In: Proceedings of ICML-2001
"... It is wellknown that Naive Bayes can only represent linearly separable functions in binary domains. But the learnability of general Augmented Naive Bayes is open. Little work is done on the learnability of Bayesian networks in nominal domains, a general case of binary domains. This paper explores t ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
It is wellknown that Naive Bayes can only represent linearly separable functions in binary domains. But the learnability of general Augmented Naive Bayes is open. Little work is done on the learnability of Bayesian networks in nominal domains, a general case of binary domains. This paper explores
An Efficient Boosting Algorithm for Combining Preferences
1999
"... The problem of combining preferences arises in several applications, such as combining the results of different search engines. This work describes an efficient algorithm for combining multiple preferences. We first give a formal framework for the problem. We then describe and analyze a new boosting ..."
Abstract

Cited by 707 (18 self)
 Add to MetaCart
search strategies, each of which is a query expansion for a given domain. For this task, we compare the performance of RankBoost to the individual search strategies. The second experiment is a collaborativefiltering task for making movie recommendations. Here, we present results comparing Rank
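Only the experiments appear in the excerpt above; the underlying loop maintains a distribution over preference pairs. A simplified sketch in that spirit (binary decision stumps as weak rankers; this is my reconstruction of the general boosting-for-ranking pattern, not the paper's exact procedure):

import numpy as np

def rankboost(X, pairs, rounds=50):
    """X: (n, d) features; pairs: list of (a, b) index pairs, b preferred."""
    D = np.full(len(pairs), 1.0 / len(pairs))
    lo = [a for a, b in pairs]
    hi = [b for a, b in pairs]
    ensemble = []                               # (alpha, feature, threshold)
    for _ in range(rounds):
        best = None
        for f in range(X.shape[1]):
            for thr in np.unique(X[:, f]):
                h = (X[:, f] > thr).astype(float)   # weak ranker in {0, 1}
                r = np.sum(D * (h[hi] - h[lo]))     # weighted pair agreement
                if best is None or abs(r) > abs(best[0]):
                    best = (r, f, thr)
        r, f, thr = best                        # assumes |r| < 1
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        h = (X[:, f] > thr).astype(float)
        D *= np.exp(-alpha * (h[hi] - h[lo]))   # upweight violated pairs
        D /= D.sum()
        ensemble.append((alpha, f, thr))
    return ensemble

def score(x, ensemble):
    """Final ranking score: weighted vote of the weak rankers."""
    return sum(a * float(x[f] > t) for a, f, t in ensemble)

Ranking items by score(x, ensemble) orders them by the boosted combination of weak rankers, with each round concentrating weight on the preference pairs the current ensemble still gets wrong.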
Hierarchically Classifying Documents Using Very Few Words
1997
"... The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. Existing classification schemes which ignore the hierarchical structure and treat the topics as separate classes are often inadequate in text ..."
Abstract

Cited by 521 (8 self)
 Add to MetaCart
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. Existing classification schemes which ignore the hierarchical structure and treat the topics as separate classes are often inadequate in text classification where the there is a large number of classes and a huge number of relevant features needed to distinguish between them. We propose an approach that utilizes the hierarchical topic structure to decompose the classification task into a set of simpler problems, one at each node in the classification tree. As we show, each of these smaller problems can be solved accurately by focusing only on a very small set of features, those relevant to the task at hand. This set of relevant features varies widely throughout the hierarchy, so that, while the overall relevant feature set may be large, each classifier only examines a small subset. The use of reduced feature sets allows us to util...
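The decomposition this abstract describes (one small classifier per node, each using only a handful of relevant features) can be sketched with off-the-shelf parts. A rough sketch, assuming a dict-based hierarchy and with chi-squared selection standing in for the paper's feature-selection method (the class and function names are mine):

from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB

class NodeClassifier:
    """One small classifier per internal node of the topic hierarchy."""
    def __init__(self, k=50):
        self.selector = SelectKBest(chi2, k=k)
        self.clf = MultinomialNB()

    def fit(self, X, child_labels):
        # Keep only the few features that discriminate among this node's
        # children; trained only on documents that fall under this node.
        Xs = self.selector.fit_transform(X, child_labels)
        self.clf.fit(Xs, child_labels)
        return self

    def predict(self, X):
        return self.clf.predict(self.selector.transform(X))

def classify(x, node, tree, models):
    """Route one document (a 1-by-vocab count matrix) down to a leaf topic.
    `tree` maps each internal node to its children; `models` maps each
    internal node to its fitted NodeClassifier."""
    while node in tree:                  # internal node: pick a child
        node = models[node].predict(x)[0]
    return node

Each node's selected feature set can differ completely from its siblings', which is how the overall feature set stays large while every individual classifier stays small.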
Mining the Network Value of Customers
In: Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining, 2002
"... One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected pro t from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only ..."
Abstract

Cited by 562 (11 self)
 Add to MetaCart
One of the major applications of data mining is in helping companies determine which potential customers to market to. If the expected pro t from a customer is greater than the cost of marketing to her, the marketing action for that customer is executed. So far, work in this area has considered only the intrinsic value of the customer (i.e, the expected pro t from sales to her). We propose to model also the customer's network value: the expected pro t from sales to other customers she may inuence to buy, the customers those may inuence, and so on recursively. Instead of viewing a market as a set of independent entities, we view it as a social network and model it as a Markov random eld. We show the advantages of this approach using a social network mined from a collaborative ltering database. Marketing that exploits the network value of customersalso known as viral marketingcan be extremely eective, but is still a black art. Our work can be viewed as a step towards providing a more solid foundation for it, taking advantage of the availability of large relevant databases. Categories and Subject Descriptors H.2.8 [Database Management]: Database Applications data mining
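The notion of network value can be illustrated with a much simpler propagation model than the paper's Markov random field. A toy sketch (linear influence propagation with invented weights and parameters, purely illustrative) of how marketing to one customer lifts expected sales through her neighbors:

import numpy as np

def purchase_probs(p0, W, beta=0.5, iters=100):
    """p0: intrinsic purchase probabilities; W: row-normalized influence
    weights (W[i, j] = influence of j on i); beta: social mixing weight."""
    b = beta * W.sum(1)        # customers with no neighbors keep p0
    p = p0.copy()
    for _ in range(iters):
        p = (1 - b) * p0 + b * (W @ p)   # relaxation-style update
    return p

def network_value(i, p0, W, marketing_boost=0.2):
    """Lift in total expected purchases from boosting customer i alone."""
    base = purchase_probs(p0, W).sum()
    boosted = p0.copy()
    boosted[i] = min(1.0, boosted[i] + marketing_boost)
    return purchase_probs(boosted, W).sum() - base

# Three customers; customer 0 influences the other two.
W = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
p0 = np.array([0.1, 0.1, 0.1])
print(network_value(0, p0, W))   # exceeds the direct lift of 0.2

In this toy example the influential customer's network value (0.4) is double her direct lift, which is exactly the effect the paper argues intrinsic-value-only marketing leaves on the table.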
Graphical models, exponential families, and variational inference
2008
"... The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building largescale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fiel ..."
Abstract

Cited by 800 (26 self)
 Add to MetaCart
The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building largescale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide varietyof algorithms — among them sumproduct, cluster variational methods, expectationpropagation, mean field methods, maxproduct and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in largescale statistical models.
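The "conjugate duality between the cumulant function and the entropy" that this abstract leans on can be stated in one line; this is my transcription of the standard variational representation:

% For an exponential family p_\theta(x) \propto \exp\langle\theta,\phi(x)\rangle
% with log-partition (cumulant) function A(\theta):
\[
  A(\theta) \;=\; \sup_{\mu \in \mathcal{M}} \bigl\{ \langle \theta, \mu \rangle - A^{*}(\mu) \bigr\},
\]
% where \mathcal{M} = \{ \mathbb{E}_p[\phi(X)] : p \text{ a distribution} \} is
% the set of realizable mean parameters and A^{*}(\mu) is the negative entropy
% of the matching distribution. Sum-product, mean field, and the relaxations
% listed above correspond to exact or approximate choices of \mathcal{M} and
% A^{*} in this optimization.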