Results 1-10 of 1,661
A comparison of event models for Naive Bayes text classification
, 1998
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multivariate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey ..."
Cited by 1025 (26 self)
Reversing and Smoothing the Multinomial Naive Bayes Text Classifier
 In Proceedings of the 2nd Int. Workshop on Pattern Recognition in Information Systems (PRIS 2002)
, 2002
"... The naive Bayes text classifier has long been a core technique in information retrieval and, more recently, it has emerged as a focus of research itself in machine learning. This paper is concerned with the naive Bayes text classifier in its multinomial model instantiation. ..."
Cited by 7 (2 self)
On Word Frequency Information and Negative Evidence in Naive Bayes Text Classification
 ESPAÑA FOR NATURAL LANGUAGE PROCESSING, ESTAL
, 2004
"... The Naive Bayes classifier exists in different versions. One version, called multivariate Bernoulli or binary independence model, uses binary word occurrence vectors, while the multinomial model uses word frequency counts. Many publications cite this difference as the main reason for the superi ..."
Cited by 10 (0 self)
A Comparison of Event Models for Naive Bayes Text Classification
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multivariate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larke ..."
A Term Association Translation Model for Naive Bayes Text Classification
"... Abstract. Text classification (TC) has long been an important research topic in information retrieval (IR) related areas. In the literature, the bag-of-words (BoW) model has been widely used to represent a document in text classification and many other applications. However, BoW, which ignores the ..."
the relationships between terms, offers a rather poor document representation. Some previous research has shown that incorporating language models into the naive Bayes classifier (NBC) can improve the performance of text classification. Although the widely used N-gram language models (LM) can exploit
A new feature selection score for multinomial naive Bayes text classification based on KL-divergence
 In The Companion Volume to the Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
, 2004
"... We define a new feature selection score for text classification based on the KL-divergence between the distribution of words in training documents and their classes. The score favors words that have a similar distribution in documents of the same class but different distributions in documents of dif ..."
Cited by 6 (0 self)
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
, 1998
"... The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made abou ..."
Cited by 499 (1 self)
Poisson Naive Bayes for Text Classification with Feature Weighting
 International Workshop on Information Retrieval with Asian Languages
, 2003
"... In this paper, we investigate the use of multivariate Poisson model and feature weighting to learn naive Bayes text classifier. Our new naive Bayes text classification model assumes that a document is generated by a multivariate Poisson model while the previous works consider a document as a vector ..."
Cited by 3 (0 self)
Text Classification from Labeled and Unlabeled Documents using EM
 MACHINE LEARNING
, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Cited by 1033 (15 self)
An evaluation of statistical approaches to text categorization
 Journal of Information Retrieval
, 1999
"... Abstract. This paper focuses on a comparative evaluation of a wide range of text categorization methods, including previously published results on the Reuters corpus and new results of additional experiments. A controlled study using three classifiers, kNN, LLSF and WORD, was conducted to examine th ..."
Cited by 663 (22 self)
were used as baselines, since they were evaluated on all versions of Reuters that exclude the unlabelled documents. As a global observation, kNN, LLSF and a neural network method had the best performance; except for a Naive Bayes approach, the other learning algorithms also performed relatively well.