Spam Filtering with Naive Bayes -- Which Naive Bayes? (2006)

by Vangelis Metsis, et al.
Venue: THIRD CONFERENCE ON EMAIL AND ANTI-SPAM (CEAS)
Citations: 48 (3 self)

BibTeX

@INPROCEEDINGS{Metsis06spamfiltering,
    author = {Vangelis Metsis and others},
    title = {Spam Filtering with Naive Bayes -- Which Naive Bayes?},
    booktitle = {Third Conference on Email and Anti-Spam (CEAS)},
    year = {2006}
}


Abstract

Naive Bayes is very popular in commercial and open-source anti-spam e-mail filters. There are, however, several forms of Naive Bayes, something the anti-spam literature does not always acknowledge. We discuss five different versions of Naive Bayes, and compare them on six new, non-encoded datasets that contain ham messages of particular Enron users and fresh spam messages. The new datasets, which we make publicly available, are more realistic than previous comparable benchmarks, because they maintain the temporal order of the messages in the two categories, and they emulate the varying proportion of spam and ham messages that users receive over time. We adopt an experimental procedure that emulates the incremental training of personalized spam filters, and we plot ROC curves that allow us to compare the different versions of NB over the entire tradeoff between true positives and true negatives.
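
The procedure the abstract describes (train incrementally on messages in temporal order, then trace the tradeoff between true positives and true negatives) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' code: it shows only one Naive Bayes form (multinomial over token counts, with Laplace smoothing), the class and function names are hypothetical, the batch size is arbitrary, and the six-message stream is invented toy data rather than any of the Enron-based datasets.

```python
import math
from collections import Counter


class MultinomialNB:
    """One possible Naive Bayes form: multinomial over token counts (illustrative)."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.vocab = set()

    def update(self, tokens, label):
        """Incrementally add one labeled message to the model."""
        self.token_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def spam_score(self, tokens):
        """Return the model's estimate of P(spam | tokens)."""
        total_docs = sum(self.doc_counts.values())
        log_post = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed class prior.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            # Laplace-smoothed token likelihoods; the extra +1 in the
            # denominator reserves probability mass for unseen tokens.
            denom = sum(self.token_counts[label].values()) + len(self.vocab) + 1
            for token in tokens:
                log_p += math.log((self.token_counts[label][token] + 1) / denom)
            log_post[label] = log_p
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_post.values())
        weights = {k: math.exp(v - top) for k, v in log_post.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


def incremental_eval(stream, batch_size):
    """Score each batch with the model trained on all earlier batches, then train on it."""
    model = MultinomialNB()
    scores, labels = [], []
    for start in range(0, len(stream), batch_size):
        batch = stream[start:start + batch_size]
        if start > 0:  # the first batch is used for training only
            for tokens, label in batch:
                scores.append(model.spam_score(tokens))
                labels.append(label)
        for tokens, label in batch:
            model.update(tokens, label)
    return scores, labels


def roc_points(scores, labels):
    """Return (ham_recall, spam_recall) pairs, one per decision threshold."""
    n_spam = sum(1 for y in labels if y == "spam")
    n_ham = len(labels) - n_spam
    points = [(1.0, 0.0)]  # threshold above every score: nothing flagged as spam
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == "spam")
        tn = sum(1 for s, y in zip(scores, labels) if s < thr and y == "ham")
        points.append((tn / n_ham if n_ham else 0.0,
                       tp / n_spam if n_spam else 0.0))
    return points


# Invented toy stream, already in temporal order.
stream = [
    (["meeting", "schedule"], "ham"),
    (["cheap", "pills"], "spam"),
    (["cheap", "pills", "now"], "spam"),
    (["meeting", "notes"], "ham"),
    (["pills", "offer"], "spam"),
    (["schedule", "update"], "ham"),
]
scores, labels = incremental_eval(stream, batch_size=2)
curve = roc_points(scores, labels)
```

Each point in `curve` pairs ham recall (the true negative rate) with spam recall (the true positive rate) at one threshold, so sweeping the threshold covers the whole tradeoff rather than a single operating point. Swapping in a different Naive Bayes form would only require replacing `MultinomialNB` with another class exposing the same `update` and `spam_score` methods.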

Keyphrases

naive bayes    ham message    different version    fresh spam message    entire tradeoff    particular enron user    incremental training    open-source anti-spam e-mail filter    experimental procedure    temporal order    previous comparable benchmark    several form    anti-spam literature    true positive    new datasets    personalized spam filter    roc curve    true negative    non-encoded datasets
