“NOFUS: Automatically Detecting” + String.fromCharCode(32) + “ObFuSCateD “.toLowerCase() + “JavaScript Code”

  • Scott Kaplan ,
  • Ben Livshits ,
  • ,
  • Christian Seifert ,
  • Charlie Curtsinger

MSR-TR-2011-57

Obfuscation is applied to large quantities of both benign and malicious JavaScript throughout the web. In situations where JavaScript source code is submitted for widespread use, such as in a gallery of browser extensions (e.g., for Firefox), it is valuable to require that submitted code is not obfuscated and to check for that property. In this paper, we describe NoFus, a static, automatic classifier that distinguishes obfuscated from non-obfuscated JavaScript with high precision. Using a collection of examples of both obfuscated and non-obfuscated JavaScript, we train NoFus to distinguish between the two and show that the classifier has both a low false positive rate (about 1%) and a low false negative rate (about 5%).
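To make the two error rates quoted above concrete, the following is a minimal sketch of how a false positive rate and a false negative rate could be computed from labeled classifier output. It is not taken from the paper: the function name `error_rates` and the sample data are hypothetical, and the counts are chosen only so that the resulting rates match the approximate figures in the abstract.

```python
from typing import Sequence, Tuple


def error_rates(actual: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    True means "obfuscated", False means "not obfuscated".
    A false positive is a non-obfuscated file flagged as obfuscated;
    a false negative is an obfuscated file that was not flagged.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    false_pos = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)   # truly non-obfuscated files
    positives = len(actual) - negatives           # truly obfuscated files

    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr


# Hypothetical data (an assumption for illustration, not the paper's data set):
# 100 non-obfuscated files with 1 wrongly flagged, 20 obfuscated files with 1 missed.
actual = [False] * 100 + [True] * 20
predicted = [True] + [False] * 99 + [False] + [True] * 19

fpr, fnr = error_rates(actual, predicted)
print(f"false positive rate: {fpr:.0%}, false negative rate: {fnr:.0%}")
```

Note that each rate is normalized by its own class: false positives by the number of non-obfuscated files, and false negatives by the number of obfuscated files.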

Applying NoFus to collections of deployed JavaScript, we show that it correctly identifies obfuscated JavaScript files from the Alexa top 50 websites. While prior work conflates obfuscation with maliciousness (assuming that detecting obfuscation implies maliciousness), we show that the correlation is weak: much malware is hidden using obfuscation, but so is much benign JavaScript. Further, applying NoFus to known JavaScript malware, we find that the classifier labels 15% of the files as unobfuscated, showing that not all malware is obfuscated.