On Network-level Clusters for Spam Detection

Zhiyun Qian, Zhuoqing Mao, Yinglian Xie, and Fang Yu


IP-based blacklisting is an effective way to filter spam emails. However, building and maintaining a blacklist of individual IP addresses is difficult, as new malicious hosts continuously appear and their IP addresses may also change over time. To mitigate this problem, researchers have proposed replacing individual IP addresses in the blacklist with IP clusters, e.g., BGP clusters. In this paper, we closely examine the accuracy of IP-cluster-based approaches to understand their effectiveness and fundamental limitations. Based on this understanding, we propose and implement a new clustering approach that considers both network origin and DNS information, and integrate it with SpamAssassin, a popular spam filtering system widely used today. Applying our approach to a 7-month email trace collected at a large university department, we reduce the false negative rate by 50%, without increasing the false positive rate, compared with directly applying various public IP-based blacklists. Furthermore, using honeypot email accounts and real user accounts, we show that our approach can capture 30% to 50% of the spam emails that slip through SpamAssassin today.
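
As a rough illustration of the general idea of cluster-based blacklisting mentioned above (not the clustering method proposed in the paper, which also uses DNS information), the following minimal sketch groups spamming IP addresses by their longest matching prefix and blacklists a whole prefix once enough distinct spam hosts fall inside it. The function names, the `min_spam_hosts` threshold, and the prefix list are all hypothetical, and the addresses come from reserved documentation ranges.

```python
import ipaddress


def longest_match(addr, networks):
    """Return the most specific network containing addr, or None."""
    best = None
    for net in networks:
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


def build_cluster_blacklist(spam_ips, networks, min_spam_hosts=2):
    """Blacklist every prefix that contains at least min_spam_hosts
    distinct spamming addresses (threshold is an assumed parameter)."""
    counts = {}
    for ip in set(spam_ips):
        net = longest_match(ipaddress.ip_address(ip), networks)
        if net is not None:
            counts[net] = counts.get(net, 0) + 1
    return {net for net, n in counts.items() if n >= min_spam_hosts}


def is_blacklisted(ip, blacklist, networks):
    """An address is blocked if its cluster (longest match) is blacklisted."""
    return longest_match(ipaddress.ip_address(ip), networks) in blacklist


# Hypothetical prefix table, standing in for prefixes taken from BGP data.
networks = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]
spam_ips = ["192.0.2.10", "192.0.2.77", "198.51.100.5"]
blacklist = build_cluster_blacklist(spam_ips, networks)

# 192.0.2.200 never sent spam itself but shares a cluster with two spammers.
print(is_blacklisted("192.0.2.200", blacklist, networks))
# 198.51.100.9 is in a cluster with only one observed spammer.
print(is_blacklisted("198.51.100.9", blacklist, networks))
```

The sketch also hints at the trade-off the abstract refers to: blocking a whole prefix catches previously unseen hosts, but any legitimate mail server in the same prefix would be blocked too, so the choice of cluster boundaries and threshold affects the false positive rate.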


Publication type: Inproceedings
Published in: The 17th Annual Network and Distributed System Security Symposium (NDSS) 2010