S-GPS: Spammer Global Positioning System

This project focuses on spammer identification rather than spam identification, and we seek to identify zombie-based spammers. We explore host network properties (for example: proxy/NAT servers, dynamically assigned IP addresses), and correlate such fine-grained information with other data and traces for detection. We emphasize that spammer identification at the network level is independent of spam content and is often straightforward to integrate with existing filtering frameworks.

UDmap: Usage-based Dynamic IP-address Map

We developed a novel method, called UDmap, to identify dynamically assigned IP addresses and analyze their dynamics. UDmap is fully automatic and relies only on application-level server logs that are already available today.

We applied UDmap to a month-long Hotmail user-login trace and identified more than 102 million dynamic IP addresses. By correlating the inferred dynamic IP addresses with three consecutive months of Hotmail’s email server logs, we established that 97% of the mail servers set up on dynamic IPs sent out only spam and were likely controlled by zombies. Moreover, these mail servers sent out a large amount of spam, accounting for over 42% of all spam emails sent to Hotmail. These results highlight the importance of accurately identifying dynamic IP addresses for spam filtering, and we suspect similar benefits for phishing-site identification and botnet detection.
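
The correlation step described above can be illustrated with a minimal sketch. It is not the UDmap implementation: the function name `spam_only_stats`, the `(sender_ip, is_spam)` log format, and the toy addresses are all assumptions made for illustration. Given a set of inferred dynamic IPs and a labeled mail log, it computes the fraction of dynamic-IP mail servers that sent only spam and their share of all spam.

```python
from collections import defaultdict

def spam_only_stats(dynamic_ips, mail_log):
    """Hypothetical helper: mail_log is an iterable of (sender_ip, is_spam) pairs."""
    per_ip = defaultdict(lambda: [0, 0])  # ip -> [spam_count, non_spam_count]
    for ip, is_spam in mail_log:
        per_ip[ip][0 if is_spam else 1] += 1

    # Mail servers whose address is in the inferred dynamic-IP set.
    dynamic_senders = [ip for ip in per_ip if ip in dynamic_ips]
    # Of those, the servers that never sent a non-spam message.
    spam_only = [ip for ip in dynamic_senders if per_ip[ip][1] == 0]

    total_spam = sum(counts[0] for counts in per_ip.values())
    dynamic_spam = sum(per_ip[ip][0] for ip in dynamic_senders)

    frac_spam_only = len(spam_only) / len(dynamic_senders) if dynamic_senders else 0.0
    spam_share = dynamic_spam / total_spam if total_spam else 0.0
    return frac_spam_only, spam_share

# Toy data (documentation-range addresses, not real measurements).
dynamic_ips = {"198.51.100.7", "198.51.100.8"}
mail_log = [
    ("198.51.100.7", True), ("198.51.100.7", True),
    ("198.51.100.8", True), ("198.51.100.8", False),
    ("203.0.113.5", True), ("203.0.113.5", False),
]
frac_spam_only, spam_share = spam_only_stats(dynamic_ips, mail_log)
print(frac_spam_only, spam_share)
```

On the toy data, one of the two dynamic-IP servers sent only spam, and the two together account for three of the four spam messages.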

                  

Figure: Top 10 ASes with the most dynamic IP addresses

 

AutoRE: Signature-based Spamming Botnet Detection

We developed AutoRE, a spam signature generation framework that detects and characterizes spamming botnets by leveraging both spam payload and spam server traffic properties. AutoRE does not require pre-classified training data or white lists. Moreover, it outputs high-quality regular expression signatures that can detect botnet spam with a low false positive rate.
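
To make the idea of a regular expression signature concrete, here is a minimal sketch of how such a signature might be evaluated against labeled messages. It does not show how AutoRE generates signatures. The pattern, the choice of URLs as the matched text, the function name `evaluate_signature`, and the sample data are all hypothetical.

```python
import re

def evaluate_signature(pattern, samples):
    """Hypothetical helper: samples is a list of (text, is_botnet_spam) pairs."""
    signature = re.compile(pattern)
    true_pos = false_pos = spam_total = legit_total = 0
    for text, is_spam in samples:
        hit = signature.search(text) is not None
        if is_spam:
            spam_total += 1
            true_pos += hit   # bool counts as 0 or 1
        else:
            legit_total += 1
            false_pos += hit
    detection_rate = true_pos / spam_total if spam_total else 0.0
    false_pos_rate = false_pos / legit_total if legit_total else 0.0
    return detection_rate, false_pos_rate

# Assumed signature: 5-8 lowercase letters as subdomain, then a 4-digit page name.
pattern = r"http://[a-z]{5,8}\.example\.com/p/[0-9]{4}\.html"
samples = [
    ("http://qwert.example.com/p/1234.html", True),
    ("http://zxcvbnm.example.com/p/9876.html", True),
    ("http://asdfgh.example.com/p/5555.html", True),
    ("http://ab.example.com/p/1.html", True),        # spam the signature misses
    ("http://www.example.org/index.html", False),
    ("http://news.example.com/p/2024.html", False),  # 4-letter subdomain, no match
]
detection_rate, false_pos_rate = evaluate_signature(pattern, samples)
print(detection_rate, false_pos_rate)
```

A narrower pattern lowers the false positive rate at the cost of missing variants, which is the trade-off a signature generator has to balance.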

Our in-depth analysis of the identified botnets revealed several interesting findings regarding the degree of email obfuscation, properties of botnet IP addresses, sending patterns, and their correlation with network scanning traffic. We believe these observations are useful for designing botnet detection schemes.

                   

Figure: Distribution of the botnet hosts around the globe

BotGraph: Large-scale Spamming Botnet Detection

Network security applications often require analyzing huge volumes of data to identify abnormal patterns or activities. The emergence of cloud-computing models opens up new opportunities to address this challenge by leveraging the power of parallel computing.

We designed and implemented a novel system, called BotGraph, to detect a new type of botnet spamming attack targeting major Web email providers. BotGraph uncovers the correlations among botnet activities by constructing large user-user graphs and looking for tightly connected subgraph components. This enables us to identify stealthy botnet users that are hard to detect when viewed in isolation. To deal with the huge data volume, we implemented BotGraph as a distributed application on a computer cluster. We believe both our graph-based approach and our implementation are generally applicable to a wide class of security applications that analyze large datasets.
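
The graph-based idea can be sketched on a single machine as follows. This is an illustration, not the BotGraph system: the edge rule (two users are connected when they share at least `min_shared` login IP addresses), the thresholds, and the function name `suspicious_groups` are assumptions, and the all-pairs comparison used here would not scale to the data volumes described above.

```python
from collections import defaultdict
from itertools import combinations

def suspicious_groups(logins, min_shared=2, min_size=3):
    """Hypothetical helper: logins is an iterable of (user, ip) pairs."""
    ips_by_user = defaultdict(set)
    for user, ip in logins:
        ips_by_user[user].add(ip)

    # Union-find over users, with path halving in find().
    parent = {user: user for user in ips_by_user}

    def find(user):
        while parent[user] != user:
            parent[user] = parent[parent[user]]
            user = parent[user]
        return user

    # Assumed edge rule: connect users sharing at least min_shared IPs.
    # Quadratic in the number of users; fine only for a toy example.
    for a, b in combinations(sorted(ips_by_user), 2):
        if len(ips_by_user[a] & ips_by_user[b]) >= min_shared:
            parent[find(a)] = find(b)

    components = defaultdict(set)
    for user in ips_by_user:
        components[find(user)].add(user)
    # Keep only components large enough to look coordinated.
    return sorted(sorted(group) for group in components.values() if len(group) >= min_size)

# Toy login trace (hypothetical users and addresses).
logins = [
    ("u1", "ip1"), ("u1", "ip2"),
    ("u2", "ip1"), ("u2", "ip2"), ("u2", "ip3"),
    ("u3", "ip2"), ("u3", "ip3"),
    ("u4", "ip9"),
    ("u5", "ip1"),
]
groups = suspicious_groups(logins)
print(groups)
```

In the toy trace, u1 and u3 share only one address directly but end up in the same component through u2, which is the kind of link that is invisible when each user is viewed in isolation.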

Publications