Margus Veanes, David Molnar, Benjamin Livshits, and Lubomir Litchev
Security sanitizers are notoriously difficult to implement correctly. Moreover, with the rise of the web, developers need string-manipulating functions in both "server" and "client" languages. Hand-writing these functions separately for each language is an open invitation to bugs. At the same time, auto-generated code will not be accepted unless it is significantly faster than the hand-written code it replaces. We address this problem with two complementary approaches centered around bek, a domain-specific language for writing complex string manipulation routines.
We have implemented our code generation pipeline for bek code corresponding to several real string sanitizers. We used an automatic testing approach to compare our generated code to the original C# implementations and found no semantic deviations. Our generated C# code outperforms the previous hand-tuned code by a factor of up to 2.5. For C code with SIMD, we see speedups of 2.5 times compared to native C code for the same sanitizer.
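
The comparison of generated code against an original implementation can be illustrated with a small differential-testing sketch. This is not the testing tool used in this work; it is a minimal Python illustration under stated assumptions. The names `reference_encode`, `generated_encode`, and `find_deviation` are hypothetical, the sanitizer is a simplified HTML encoder chosen only for the example, and random input generation stands in for whatever input generation the actual approach uses.

```python
import random

def reference_encode(s: str) -> str:
    """Hypothetical stand-in for a hand-written sanitizer:
    escapes four HTML-significant characters one at a time."""
    out = []
    for ch in s:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif ch == '"':
            out.append("&quot;")
        else:
            out.append(ch)
    return "".join(out)

# Hypothetical stand-in for generated code: a table-driven encoder
# that is meant to be semantically identical to reference_encode.
_TABLE = str.maketrans({"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"})

def generated_encode(s: str) -> str:
    return s.translate(_TABLE)

def find_deviation(ref, gen, trials=10_000, max_len=32, seed=0):
    """Run both implementations on random inputs and return the first
    input on which their outputs differ, or None if none is found."""
    rng = random.Random(seed)
    # Mix ordinary characters with the ones the sanitizer must escape.
    alphabet = "ab<>&\"' \u00e9\u4e2d"
    for _ in range(trials):
        length = rng.randint(0, max_len)
        s = "".join(rng.choice(alphabet) for _ in range(length))
        if ref(s) != gen(s):
            return s
    return None
```

Calling `find_deviation(reference_encode, generated_encode)` returns `None` when no differing input is found within the trial budget. That is evidence of agreement on the sampled inputs, not a proof of equivalence.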