Zhiyun Qian, Zhuoqing Mao, Yinglian Xie, and Fang Yu
IP-based blacklists are an effective way to filter spam emails. However, building and maintaining a blacklist of individual IP addresses is difficult, as new malicious hosts continuously appear and their IP addresses may change over time. To mitigate this problem, researchers have proposed replacing individual IP addresses in the blacklist with IP clusters, e.g., BGP clusters. In this paper, we closely examine the accuracy of IP-cluster-based approaches to understand their effectiveness and fundamental limitations. Based on this understanding, we propose and implement a new clustering approach that considers both network origin and DNS information, and we integrate it with SpamAssassin, a popular spam filtering system widely used today. Applying our approach to a 7-month email trace collected at a large university department, we reduce the false negative rate by 50%, without increasing the false positive rate, compared with directly applying various public IP-based blacklists. Furthermore, using honeypot email accounts and real user accounts, we show that our approach can capture 30% to 50% of the spam emails that slip through SpamAssassin today.
Published in: The 17th Annual Network and Distributed System Security Symposium (NDSS) 2010