Fast and Precise Sanitizer Analysis with BEK

  • Pieter Hooimeijer ,
  • Ben Livshits ,
  • David Molnar ,
  • Prateek Saxena ,

USENIX Security'11 |

Web applications often use special string-manipulating sanitizers on untrusted user data, but it is difficult to reason manually about the behavior of these functions, leading to errors. For example, the Internet Explorer crosssite scripting filter turned out to transform some web pages without JavaScript into web pages with valid Java-Script, enabling attacks. In other cases, sanitizers may fail to commute, rendering one order of application safe and the other dangerous. BEK is a language and system for writing sanitizers that enables precise analysis of sanitizer behavior, including checking idempotence, commutativity, and equivalence. For example, BEK can determine if a target string, such as an entry on the XSS Cheat Sheet, is a valid output of a sanitizer. If so our analysis synthesizes an input string that yields the target. Our language is expressive enough to capture real web sanitizers used in ASP.NET, the Internet Explorer XSS Filter, and the Google Auto-Escape framework, which we demonstrate by porting these sanitizers to BEK. Our analyses use a novel symbolic finite automata representation to leverage fast satisfiability modulo theories (SMT) solvers and are quick in practice, taking fewer than two seconds to check the commutativity of the entire set of Internet Exporer XSS filters, between 36 and 39 seconds to check implementations of HTMLEncode against target strings from the XSS Cheat Sheet, and less than ten seconds to check equivalence between all pairs of a set of implementations of HTMLEncode. Programs written in BEK can be compiled to traditional languages such as JavaScript and C#, making it possible for web developers to write sanitizers supported by deep analysis, yet deploy the analyzed code directly to real applications.