Results 1-10 of 29
Combining Naive Bayes and n-Gram Language Models for Text Classification
In 25th European Conference on Information Retrieval Research (ECIR), 2003. Cited by 43 (2 self).
Abstract: We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers.
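As a rough sketch of the idea (not the paper's implementation), one can replace naive Bayes' independent-term model with a per-class language model. The toy below uses character bigrams with Laplace smoothing and uniform class priors; the class and method names are illustrative:

```python
import math
from collections import defaultdict

class NGramNB:
    """Hypothetical sketch: one character-bigram language model per class,
    used in place of naive Bayes' unigram term model."""

    def __init__(self):
        self.bigram = {}   # class -> {(prev, cur): count}
        self.unigram = {}  # class -> {prev: count}
        self.vocab = set()

    def fit(self, texts, labels):
        for text, y in zip(texts, labels):
            bg = self.bigram.setdefault(y, defaultdict(int))
            ug = self.unigram.setdefault(y, defaultdict(int))
            padded = "^" + text  # "^" marks the start of the text
            for prev, cur in zip(padded, padded[1:]):
                bg[(prev, cur)] += 1
                ug[prev] += 1
                self.vocab.update((prev, cur))

    def log_prob(self, text, y):
        # Sum of Laplace-smoothed bigram log-probabilities under class y
        bg, ug, v = self.bigram[y], self.unigram[y], len(self.vocab) + 1
        padded = "^" + text
        return sum(math.log((bg[(p, c)] + 1) / (ug[p] + v))
                   for p, c in zip(padded, padded[1:]))

    def predict(self, text):
        # Uniform class prior assumed: pick the class whose language
        # model gives the text the highest likelihood
        return max(self.bigram, key=lambda y: self.log_prob(text, y))
```

A word- or higher-order n-gram model would follow the same pattern, with the smoothing method being the main design choice.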
Locally Weighted Naive Bayes
In Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2003. Cited by 34 (1 self).
Abstract: Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes' primary weakness (attribute independence) and improve the performance of the algorithm. This paper presents a locally weighted version of naive Bayes that relaxes the independence assumption by learning local models at prediction time. Experimental results show that locally weighted naive Bayes rarely degrades accuracy compared to standard naive Bayes and, in many cases, improves accuracy dramatically. The main advantage of this method compared to other techniques for enhancing naive Bayes is its conceptual and computational simplicity.
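A minimal sketch of the local-model idea, assuming a Gaussian kernel over Euclidean distance and Gaussian per-attribute likelihoods (the paper's actual weighting scheme and kernel may differ):

```python
import math

def locally_weighted_nb_predict(X, y, query, bandwidth=1.0):
    """Illustrative sketch: weight each training instance by a kernel of
    its distance to the query, then fit a weighted Gaussian naive Bayes
    on the fly for that single prediction."""
    # Kernel weight per training instance (closer points count more)
    w = [math.exp(-sum((a - b) ** 2 for a, b in zip(xi, query))
                  / (2 * bandwidth ** 2)) for xi in X]
    scores = {}
    for c in sorted(set(y)):
        wc = [wi for wi, yi in zip(w, y) if yi == c]
        Xc = [xi for xi, yi in zip(X, y) if yi == c]
        total = sum(wc)
        if total == 0:
            scores[c] = float("-inf")
            continue
        log_score = math.log(total / sum(w))  # weighted class prior
        for j in range(len(query)):
            # Weighted mean and variance of attribute j within class c
            mu = sum(wi * xi[j] for wi, xi in zip(wc, Xc)) / total
            var = sum(wi * (xi[j] - mu) ** 2 for wi, xi in zip(wc, Xc)) / total
            var = max(var, 1e-6)  # floor to avoid zero variance
            log_score += (-0.5 * math.log(2 * math.pi * var)
                          - (query[j] - mu) ** 2 / (2 * var))
        scores[c] = log_score
    return max(scores, key=scores.get)
```

Because the model is refit per query, prediction is more expensive than standard naive Bayes, which is the trade-off the local weighting buys accuracy with.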
Mobimine: Monitoring the stock market from a PDA
In ACM SIGKDD Explorations, 2002.
Basic Principles of Learning Bayesian Logic Programs
Institute for Computer Science, University of Freiburg, 2002. Cited by 22 (2 self).
Abstract: Bayesian logic programs tightly integrate definite logic programs with Bayesian networks in order to... In this paper, we present results on combining Inductive Logic Programming with Bayesian networks to learn both the qualitative and the quantitative components of Bayesian logic programs from data. More precisely, we show how the qualitative components can be learned by combining the inductive logic programming setting of learning from interpretations with score-based techniques for learning Bayesian networks. The estimation of the quantitative components is reduced to the corresponding problem of (dynamic) Bayesian networks.
A Probabilistic Approach to Full-Text Document Clustering
1998. Cited by 20 (1 self).
Abstract: In addressing the issue of text document clustering, a suitable function for measuring the distance between documents is needed. In this paper we explore a function for scoring document similarity based on probabilistic considerations: similarity is scored according to the expectation of the same words appearing in two documents. This score enables the investigation of different smoothing methods for estimating the probability of a word appearing in a document for purposes of clustering. Our experimental results show that these different smoothing methods may be more or less effective depending on the degree of separability between the clusters. Furthermore, we show that the cosine coefficient, widely used in information retrieval, can be associated with a particular form of probabilistic smoothing in our model. We also introduce a specific scoring function that outperforms the cosine coefficient and its extensions such as TF-IDF weighting in our experiments with document clustering tasks. This new scoring is based on normalizing (in the probabilistic sense) the cosine similarity score and adding a scaling factor based on the characteristics of the corpus being clustered. Finally, our experiments indicate that our model, which assumes an asymmetry between positive (word appearance) and negative (word non-appearance) information in the document clustering task, outperforms standard mixture models that weight such information equally.
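The expectation-of-shared-words score with smoothing might be sketched like this (the smoothing parameter and functional form are illustrative choices, not the paper's tuned scoring function):

```python
from collections import Counter

def smoothed_prob(doc_counts, word, vocab_size, alpha=0.1):
    """Lidstone-smoothed estimate of P(word | document); alpha is an
    illustrative smoothing constant."""
    total = sum(doc_counts.values())
    return (doc_counts[word] + alpha) / (total + alpha * vocab_size)

def prob_similarity(doc1, doc2, vocab):
    """Score similarity as the probability that the same word is drawn
    from both documents: sum over the vocabulary of P(w|d1) * P(w|d2)."""
    c1, c2 = Counter(doc1), Counter(doc2)
    return sum(smoothed_prob(c1, w, len(vocab)) *
               smoothed_prob(c2, w, len(vocab)) for w in vocab)
```

With no smoothing (alpha = 0) and L2-normalised term vectors, this sum reduces to the cosine coefficient, which is the connection the abstract alludes to.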
Learning recursive Bayesian multinets for data clustering by means of constructive induction
2001. Cited by 19 (7 self).
Abstract: This paper introduces and evaluates a new class of knowledge model, the recursive Bayesian multinet (RBMN), which encodes the joint probability distribution of a given database. RBMNs extend Bayesian networks (BNs) as well as partitional clustering systems. Briefly, an RBMN is a decision tree with component BNs at the leaves. An RBMN is learnt using a greedy, heuristic approach akin to that used by many supervised decision tree learners, but where BNs are learnt at the leaves using constructive induction. A key idea is to treat expected data as real data. This allows us to complete the database and to take advantage of a closed form for the marginal likelihood of the expected complete data that factorizes into separate marginal likelihoods for each family (a node and its parents). Our approach is evaluated on synthetic and real-world databases.
Feature Reduction for Document Clustering and Classification
2000. Cited by 16 (2 self).
Abstract: Often users receive search results which contain a wide range of documents, only some of which are relevant to their information needs. To address this problem, ever more systems not only locate information for users, but also organise that information on their behalf. We look at two main automatic approaches to information organisation: interactive clustering of search results and pre-categorising documents to provide hierarchical browsing structures. To be feasible in real-world applications, both of these approaches require accurate yet efficient algorithms. Yet both suffer from the curse of dimensionality: documents are typically represented by hundreds or thousands of words (features) which must be analysed and processed during clustering or classification. In this paper, we discuss feature reduction techniques and their application to document clustering and classification, showing that feature reduction improves efficiency as well as accuracy. We validate these algorithms using human relevance assignments and categorisation.
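As one simple instance of feature reduction (a generic document-frequency filter, not necessarily one of the techniques the paper evaluates):

```python
from collections import Counter

def reduce_features(docs, min_df=2, max_df_ratio=0.5, top_k=1000):
    """Illustrative document-frequency-based feature reduction: drop
    very rare and very common words, then keep the top_k remaining
    features by document frequency. All thresholds are illustrative."""
    n = len(docs)
    # Document frequency: number of documents each word occurs in
    df = Counter(w for doc in docs for w in set(doc))
    kept = [(w, c) for w, c in df.items()
            if c >= min_df and c / n <= max_df_ratio]
    kept.sort(key=lambda wc: (-wc[1], wc[0]))  # high DF first, ties by name
    return [w for w, _ in kept[:top_k]]
```

Cutting the vocabulary this way shrinks every downstream document vector, which is where the efficiency gain for clustering and classification comes from.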
Naive Bayes for regression
In Machine Learning, 2000. Cited by 15 (0 self).
Abstract: Despite its simplicity, the naive Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naive Bayes methodology to numeric prediction (i.e., regression) tasks by modeling the probability distribution of the target value with kernel density estimators, and compares it to linear regression, locally weighted linear regression, and a method that produces “model trees” (decision trees with linear regression functions at the leaves). Although we exhibit an artificial dataset for which naive Bayes is the method of choice, on real-world datasets it is almost uniformly worse than locally weighted linear regression and model trees. The comparison with linear regression depends on the error measure: for one measure naive Bayes performs similarly, while for another it is worse. We also show that standard naive Bayes applied to regression problems by discretizing the target value performs similarly badly. We then present empirical evidence that isolates naive Bayes' independence assumption as the culprit for its poor performance in the regression setting. These results indicate that the simplistic statistical assumption that naive Bayes makes is indeed more restrictive for regression than for classification.
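The kernel-density formulation might be sketched as follows, assuming Gaussian kernels, a single shared bandwidth h, and the training targets themselves as the candidate grid (all illustrative choices, not the paper's exact estimator):

```python
import math

def gauss(u, h=1.0):
    # Gaussian kernel with bandwidth h
    return math.exp(-(u / h) ** 2 / 2) / (h * math.sqrt(2 * math.pi))

def nb_kde_regress(X, y, query, h=1.0):
    """Rough sketch of naive Bayes regression with kernel density
    estimators: score(t) = p_hat(t) * prod_i p_hat(x_i | t),
    maximised over candidate target values t."""
    def score(t):
        # KDE of the target prior p(t)
        prior = sum(gauss(t - yj, h) for yj in y) / len(y)
        s = math.log(prior)
        for i, xq in enumerate(query):
            # Conditional kernel estimate of p(x_i | t)
            num = sum(gauss(t - yj, h) * gauss(xq - xj[i], h)
                      for xj, yj in zip(X, y))
            den = sum(gauss(t - yj, h) for yj in y)
            s += math.log(num / den + 1e-12)
        return s
    # Use the training targets as the candidate grid
    return max(y, key=score)
```

A finer grid (or returning the score-weighted mean of candidates) would smooth the prediction, but the argmax form keeps the Bayes-rule structure visible.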
Classification using Hierarchical Naïve Bayes models
In Machine Learning 2006, 2002. Cited by 13 (1 self).
Abstract: Classification problems have a long history in the machine learning literature. One of the simplest, and yet most consistently well-performing, sets of classifiers is the Naïve Bayes models. However, an inherent problem with these classifiers is the assumption that all attributes used to describe an instance are conditionally independent given the class of that instance. When this assumption is violated (which is often the case in practice) it can reduce classification accuracy due to "information double-counting" and interaction omission.
Parameter Learning in Object-Oriented Bayesian Networks
2001. Cited by 12 (5 self).
Abstract: This paper describes a method for parameter learning in Object-Oriented Bayesian Networks (OOBNs). We propose a methodology for learning parameters in OOBNs, and prove that maintaining the object orientation imposed by the prior model will increase the learning speed in object-oriented domains. We also propose a method to efficiently estimate the probability parameters in domains that are not strictly object oriented. Finally, we attack type uncertainty, a special case of model uncertainty typical of object-oriented domains.