Results 1 - 10 of 1,661

A comparison of event models for Naive Bayes text classification

by Andrew McCallum, Kamal Nigam, 1998
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey ..."
Cited by 1025 (26 self)

Reversing and Smoothing the Multinomial Naive Bayes Text Classifier

by Alfons Juan, Hermann Ney - In Proceedings of the 2nd Int. Workshop on Pattern Recognition in Information Systems (PRIS 2002), 2002
"... The naive Bayes text classifier has long been a core technique in information retrieval and, more recently, it has emerged as a focus of research itself in machine learning. This paper is concerned with the naive Bayes text classifier in its multinomial model instantiation. ..."
Cited by 7 (2 self)

On Word Frequency Information and Negative Evidence in Naive Bayes Text Classification

by Karl-Michael Schneider - ESPAÑA FOR NATURAL LANGUAGE PROCESSING, ESTAL, 2004
"... The Naive Bayes classifier exists in different versions. One version, called multi-variate Bernoulli or binary independence model, uses binary word occurrence vectors, while the multinomial model uses word frequency counts. Many publications cite this difference as the main reason for the superi ..."
Cited by 10 (0 self)

A Comparison of Event Models for Naive Bayes Text Classification

by Kamal Nigam
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larke ..."

A Term Association Translation Model for Naive Bayes Text Classification

by Meng-sung Wu, Hsin-min Wang
"... Abstract. Text classification (TC) has long been an important research topic in information retrieval (IR) related areas. In the literature, the bag-of-words (BoW) model has been widely used to represent a document in text classification and many other applications. However, BoW, which ignores the ..."
the relationships between terms, offers a rather poor document representation. Some previous research has shown that incorporating language models into the naive Bayes classifier (NBC) can improve the performance of text classification. Although the widely used N-gram language models (LM) can exploit

A New Feature Selection Score for Multinomial Naive Bayes Text Classification Based on KL-Divergence

by Karl-Michael Schneider - In The Companion Volume to the Proceedings of 42nd Annual Meeting of the Association for Computational Linguistics, 2004
"... We define a new feature selection score for text classification based on the KL-divergence between the distribution of words in training documents and their classes. The score favors words that have a similar distribution in documents of the same class but different distributions in documents of dif ..."
Cited by 6 (0 self)

Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval

by David D. Lewis, 1998
"... The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made abou ..."
Cited by 499 (1 self)

Poisson Naive Bayes for Text Classification with Feature Weighting

by Sang-bum Kim, Hee-cheol Seo, Hae-chang Rim - International Workshop on Information Retrieval with Asian Languages, 2003
"... In this paper, we investigate the use of multivariate Poisson model and feature weighting to learn naive Bayes text classifier. Our new naive Bayes text classification model assumes that a document is generated by a multivariate Poisson model while the previous works consider a document as a vector ..."
Cited by 3 (0 self)

Text Classification from Labeled and Unlabeled Documents using EM

by Kamal Nigam, Andrew Kachites McCallum, Sebastian Thrun, Tom Mitchell - MACHINE LEARNING, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Cited by 1033 (15 self)

An evaluation of statistical approaches to text categorization

by Yiming Yang - Journal of Information Retrieval, 1999
"... Abstract. This paper focuses on a comparative evaluation of a wide-range of text categorization methods, including previously published results on the Reuters corpus and new results of additional experiments. A controlled study using three classifiers, kNN, LLSF and WORD, was conducted to examine th ..."
Cited by 663 (22 self)
were used as baselines, since they were evaluated on all versions of Reuters that exclude the unlabelled documents. As a global observation, kNN, LLSF and a neural network method had the best performance; except for a Naive Bayes approach, the other learning algorithms also performed relatively well.

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University