Zozzle: Low-overhead Mostly Static JavaScript Malware Detection

JavaScript malware-based attacks now account for a large fraction of successful mass-scale exploitation. From the standpoint of the attacker, the attraction is that these drive-by attacks can be mounted against an unsuspecting user visiting a seemingly innocent web page. While several techniques for addressing these types of exploits have been proposed, in-browser adoption has been slow, in part because of the performance overhead these methods tend to incur.

In this paper, we propose Zozzle, a low-overhead solution for detecting and preventing JavaScript malware that can be deployed in the browser. Our approach uses Bayesian classification of hierarchical features of the JavaScript abstract syntax tree to identify syntax elements that are highly predictive of malware. Our extensive experimental evaluation shows that Zozzle detects JavaScript malware effectively through mostly static code analysis, with very low false positive rates (usually fractions of 1%) and a typical overhead of only 2-5 milliseconds per JavaScript file.
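To make the classification step concrete, the following is a minimal sketch of Bayesian classification over hierarchical syntax features. It is not the Zozzle implementation. It assumes a naive Bayes model with Laplace smoothing, assumes each feature is a (context, text) pair that has already been extracted from the abstract syntax tree, and uses a hypothetical four-sample training set; the names train, log_posterior, classify, and TRAINING are illustrative only.

```python
import math
from collections import Counter

LABELS = ("malicious", "benign")

# Hypothetical training data: each sample is a set of (AST context, text)
# features plus a label. Real features would come from a JavaScript parser.
TRAINING = [
    ({("loop", "unescape"), ("function", "eval"), ("try", "shellcode")}, "malicious"),
    ({("loop", "unescape"), ("loop", "substring")}, "malicious"),
    ({("function", "getElementById"), ("if", "length")}, "benign"),
    ({("function", "addEventListener"), ("if", "length")}, "benign"),
]


def train(samples):
    """Count, per class, how many samples contain each feature."""
    class_counts = Counter()
    feature_counts = {label: Counter() for label in LABELS}
    for features, label in samples:
        class_counts[label] += 1
        for feature in set(features):
            feature_counts[label][feature] += 1
    vocabulary = set().union(*(feature_counts[label] for label in LABELS))
    return class_counts, feature_counts, vocabulary


def log_posterior(features, label, model):
    """Unnormalized log P(label | features) under a Bernoulli naive Bayes model."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    score = math.log(class_counts[label] / total)
    present = set(features)
    for feature in vocabulary:
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        p = (feature_counts[label][feature] + 1) / (class_counts[label] + 2)
        score += math.log(p if feature in present else 1.0 - p)
    return score


def classify(features, model):
    """Return the label with the highest posterior score."""
    return max(LABELS, key=lambda label: log_posterior(features, label, model))


model = train(TRAINING)
print(classify({("loop", "unescape")}, model))
print(classify({("if", "length"), ("function", "getElementById")}, model))
```

Pairing each piece of text with its syntactic context is what makes the features hierarchical in this sketch: the same string can carry different weight inside a loop than inside a function body. Classification here is a single pass over a fixed feature vocabulary, which is in keeping with a low per-file overhead.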

Our experience also suggests that Zozzle may be used either as a lightweight filter in front of a more costly detection technique or as a standalone offline malware detector.