Improving Spam Filtering by Detecting Gray Mail

We address the problem of gray mail: messages that could reasonably be considered either spam or good. Email users often disagree on how to label such messages, which presents serious challenges to spam filters in both model training and evaluation. In this paper, we propose four simple methods for detecting gray mail and compare their performance using recall-precision curves. Among them, we found that email campaigns whose messages are labeled differently are the most reliable source for learning a gray mail detector.

Preliminary experiments also show that even when the gray mail detector is imperfect, a traditional statistical spam filter can still be improved consistently in different regions of the ROC curve by incorporating this new information.

YihMcKo07.pdf
PDF file

In: Proceedings of the 4th Conference on Email and Anti-Spam

Publisher: CEAS
Copyright (c) 2007

Details

Type: Inproceedings