Re: Reliable email
- In Proc. NSDI, 2006
Cited by 53 (3 self)
The explosive growth in unwanted email has prompted the development of techniques for the rejection of email, intended to shield recipients from the onerous task of identifying the legitimate email in their inboxes amid a sea of spam. Unfortunately, widely used content-based filtering systems have converted the spam problem into a false positive one: email has become unreliable. Email acceptance techniques complement rejection ones; they can help prevent false positives by filing email into a user’s inbox before it is considered for rejection. Whitelisting, whereby recipients accept email from some set of authorized senders, is one such acceptance technique. We present Reliable Email (RE:), a new whitelisting system that incurs zero false positives among socially connected users. Unlike previous whitelisting systems, which require that whitelists be populated manually, RE: exploits friend-of-friend relationships among email correspondents to populate whitelists automatically. To do so, RE: permits an email’s recipient to discover whether other email users have whitelisted the email’s sender, while preserving the privacy of users’ email contacts with cryptographic private matching techniques. Using real email traces from two sites, we demonstrate that RE: renders a significant fraction of received email reliable. Our evaluation also shows that RE: can prevent up to 88% of the false positives incurred by a widely deployed email rejection system, at modest computational cost.
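As a rough illustration of the friend-of-friend idea described in this abstract, the sketch below accepts a message when the sender is on the recipient's own whitelist or on the whitelist of someone the recipient has whitelisted. It uses plain Python sets and deliberately omits the cryptographic private matching that the abstract says RE: relies on to keep contacts private. The function name `is_accepted` and the `whitelists` layout are assumptions made for illustration, not the paper's protocol.

```python
def is_accepted(sender, recipient, whitelists):
    """Return True if sender is whitelisted by recipient or by one of recipient's friends.

    whitelists maps each user to the set of senders that user has whitelisted.
    Hypothetical layout: a real system would not expose whitelists in the clear.
    """
    own = whitelists.get(recipient, set())
    if sender in own:
        return True  # direct whitelisting
    # Friend-of-friend: some user on the recipient's whitelist has whitelisted the sender.
    return any(sender in whitelists.get(friend, set()) for friend in own)


# Example data (made up): alice trusts bob, and bob trusts carol.
whitelists = {
    "alice": {"bob"},
    "bob": {"carol"},
}
```

With this data, mail from carol to alice would be accepted through bob, while mail from an unknown sender falls through to whatever rejection system runs afterwards.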
Developing an Immunity to Spam
- In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2003), 2003
Cited by 14 (3 self)
Immune systems protect animals from pathogens, so why not apply a similar model to protect computers? Several researchers have investigated the use of an artificial immune system to protect computers from viruses and others have looked at using such a system to detect unauthorized computer intrusions. This paper describes the use of an artificial immune system for another kind of protection: protection from unsolicited email, or spam.
The CONTINUE server (or, how I administered PADL 2002 and 2003)
- of Lecture, 2003
Cited by 12 (2 self)
Conference paper submission and reviewing is an increasingly electronic activity. Paper authors and program committee members expect to be able to use software, especially with Web interfaces, to simplify and even automate many activities. Building interactive Web sites is a prime target of opportunity for sophisticated declarative programming languages. This paper describes the PLT Scheme application Continue, which automates many conference paper management tasks.
Filtering image spam with near-duplicate detection
- In Proceedings of the Fourth Conference on Email and AntiSpam, CEAS’2007, 2007
Cited by 11 (0 self)
A new trend in email spam is the emergence of image spam. Although current anti-spam technologies are quite successful in filtering text-based spam emails, the new image spams are substantially more difficult to detect, as they employ a variety of image creation and randomization algorithms. Spam image creation algorithms are designed to defeat well-known vision algorithms such as optical character recognition (OCR), whereas randomization techniques ensure the uniqueness of each image. We observe that image spam is often sent in batches that consist of visually similar images that differ only due to the application of randomization algorithms. Based on this observation, we propose an image spam detection system that uses near-duplicate detection to detect spam images. We rely on traditional anti-spam methods to detect a subset of spam images and then use multiple image spam filters to detect all the spam images that “look” like the spam caught by traditional methods. We have implemented a prototype system that achieves a high detection rate with a false positive rate below 0.001%.
Outside the Closed World: On Using Machine Learning For Network Intrusion Detection
- In Proceedings of the IEEE Symposium on Security and Privacy, 2010
Cited by 11 (0 self)
In network intrusion detection research, one popular strategy for finding attacks is monitoring a network’s activity for anomalies: deviations from profiles of normality previously learned from benign traffic, typically identified using tools borrowed from the machine learning community. However, despite extensive academic research, one finds a striking gap in terms of actual deployments of such systems: compared with other intrusion detection approaches, machine learning is rarely employed in operational “real world” settings. We examine the differences between the network intrusion detection problem and other areas where machine learning regularly finds much more success. Our main claim is that the task of finding attacks is fundamentally different from these other applications, making it significantly harder for the intrusion detection community to employ machine learning effectively. We support this claim by identifying challenges particular to network intrusion detection, and provide a set of guidelines meant to strengthen future research on anomaly detection. Keywords: anomaly detection; machine learning; intrusion detection; network security.
Behavioral characteristics of spammers and their network reachability properties
- In Proc. IEEE ICC, 2006
Cited by 10 (3 self)
By analyzing a two-month trace of more than 25 million emails received at a large US university campus network, of which more than 18 million are spam messages, we characterize spammer behavior at both the mail server and the network levels. We also correlate the arrivals of spam with BGP route updates to study the network reachability properties of spammers. Among others, our significant findings are: (a) the majority of spammers (93% of spam-only mail servers and 58% of spam-only networks) send only a small number of spam messages (no more than 10); (b) the vast majority of both spam messages (91.7%) and spam-only mail servers (91%) are from mixed networks that send both spam and non-spam messages; (c) the majority of both spam messages (68%) and spam mail servers (74%) are from a few regions of the IP address space (top 20 “/8” address spaces); (d) a large portion of spammers (81% of spam-only mail servers and 27% of spam-only networks) send spam only within a short period of time (no longer than one day out of the two months); and (e) network prefixes for a non-negligible portion of spam-only networks (6%) are only visible for a short period of time (within 7 days), coinciding with the spam arrivals from these networks. We discuss the implications of the findings for current anti-spam efforts and, more importantly, for the design of future email delivery architectures.
Combining email models for false positive reduction
- In KDD ’05: Proceedings of the 11th ACM SIGKDD international conference on Knowledge discovery in data mining, 2005
Cited by 8 (2 self)
Machine learning and data mining can be effectively used to model, classify and discover interesting information for a wide variety of data including email. The Email Mining Toolkit, EMT, has been designed to provide a wide range of analyses for arbitrary email sources. Depending upon the task, one can usually achieve very high accuracy, but with some amount of false positive tradeoff. Generally, false positives are prohibitively expensive in the real world. In the case of spam detection, for example, even a single misclassified email may be unacceptable if it is a very important one. Much work has been done to improve specific algorithms for the task of detecting unwanted messages, but less work has been reported on leveraging multiple algorithms and correlating models in this particular domain of email analysis. EMT has been updated with new correlation functions allowing the analyst to integrate a number of EMT’s user behavior models available in the core technology. We present results of combining classifier outputs for both improving accuracy and reducing false positives for the problem of spam detection. We apply these methods to a very large email data set and show results of different combination methods on these corpora. We introduce a new method to compare multiple and combined classifiers, and show how it differs from past work. The method analyzes the relative gain and maximum possible accuracy that can be achieved for certain combinations of classifiers to automatically choose the best combination. Categories & Subject Descriptors: H.3.3 [Information Search and Retrieval]: Retrieval models,
Increasing the Accuracy of a Spam-Detecting Artificial Immune System
- In Congress on Evolutionary Computation (CEC 2003), Proceedings, 2003
Cited by 7 (2 self)
... This paper looks at the application of the artificial immune system model to protect email users effectively from spam. In particular, it tests the spam immune system against the publicly available SpamAssassin corpus of spam and non-spam, and extends the original system by looking at several methods of classifying email messages with the detectors produced by the immune system. The resulting system classifies the messages with similar accuracy to other spam filters, but uses fewer detectors to do so, making it an attractive solution for circumstances where processing time is at a premium.
Resisting Spam Delivery by TCP Damping
- In Proceedings of First Conference on Email and Anti-Spam (CEAS), 2004
Cited by 7 (0 self)
Spam has become a major problem that is threatening the efficiency of the current email system. Spam is overwhelming the Internet because 1) emails are pushed from senders to receivers without much control from recipients, and 2) the cost of delivering emails is very low. In this paper, we present an anti-spam framework that slows down spammers by adding delay to email delivery and by consuming more sender resources. Both delay and resource consumption are controlled based on the likelihood of the source of email messages being a spammer, so that our technique impacts only spammers and has negligible impact on normal email senders. The mechanisms are implemented at the TCP level on the recipient side without requiring any modifications on the sender side. Our evaluations show that selectively delaying connections can effectively slow down spammers by a factor of thousands when they use a simple setup or open relays. The mechanism of increasing the sender's resource consumption can significantly slow down spammers even when they are spamming from their own optimized servers.
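The abstract says that delay is controlled by the likelihood that a source is a spammer. A minimal sketch of such a policy, assuming a likelihood score in [0, 1] is already available, is shown below. The function name `damping_delay`, the linear mapping, and the default threshold and maximum delay are all hypothetical; the abstract does not give the paper's actual mapping, and the TCP-level mechanisms that apply the delay are not shown.

```python
def damping_delay(spam_likelihood, max_delay_seconds=30.0, threshold=0.5):
    """Map a spam likelihood in [0, 1] to a delivery delay in seconds.

    Sources at or below the threshold get no delay, so normal senders are
    unaffected; above it, the delay grows linearly up to max_delay_seconds.
    All parameter values are illustrative assumptions.
    """
    if not 0.0 <= spam_likelihood <= 1.0:
        raise ValueError("spam_likelihood must be in [0, 1]")
    if spam_likelihood <= threshold:
        return 0.0
    return max_delay_seconds * (spam_likelihood - threshold) / (1.0 - threshold)
```

Keeping the delay at zero below a threshold reflects the stated goal of negligible impact on normal senders; any monotone mapping would serve the same illustrative purpose.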
Information asymmetry and thwarting spam
- In Proceedings of the 2004 MIT Spam Conference, 2004
Cited by 5 (0 self)
We explore a novel approach to spam based on economic rather than technological or regulatory screening mechanisms. Our first point is that mechanisms designed to promote valuable communication can often outperform those designed merely to block wasteful communication. Our second is to shift focus from the information in the message to the information known to the sender. We can then use principles of information asymmetry to cause people who knowingly misuse communication to incur higher costs than those who do not. In certain cases, though not all, we can show this approach leaves recipients better off than even an idealized or “perfect” filter that costs nothing and makes no mistakes. Our mechanism also accounts for individual differences in opportunity costs, and allows for bi-directional wealth transfers while facilitating both sender signaling and recipient screening.

