FAST: A Transducer-Based Language for Tree Manipulation

MSR-TR-2013-121 |

Tree automata and tree transducers are used in a wide range of applications in software engineering, from XML processing to language type-checking. While these formalisms are of immense practical use, they can only model finite alphabets, and since many real-world applications operate over infinite domains such as integers, this is often a limitation. To overcome this problem we augment tree automata and transducers with symbolic alphabets represented as parametric theories. Admitting infinite alphabets makes these models more general and succinct than their classical counterparts. Despite this, we show how the main operations, such as composition and language equivalence, remain computable given a decision procedure for the alphabet theory. We introduce a high-level language called FAST that acts as a front-end for the above formalisms. FAST supports symbolic alphabets through tight integration with state-of-the-art satisfiability modulo theory (SMT) solvers. We demonstrate our techniques on practical case studies, covering a wide range of applications.