Learning Semantic String Transformations from Examples

Rishabh Singh and Sumit Gulwani

Abstract

We address the problem of performing semantic transformations on strings, which may represent a variety of data types (or combinations of them) such as a column in a relational table, time, date, or currency. Unlike syntactic transformations, which are based on regular expressions and which interpret a string as a sequence of characters, semantic transformations additionally require exploiting the semantics of the data type represented by the string, which may be encoded as a database of relational tables. Manually performing such transformations on a large collection of strings is error-prone and cumbersome, while programmatic solutions are beyond the skill set of end-users. We present a programming-by-example technology that allows end-users to automate such repetitive tasks.

We describe an expressive transformation language for semantic manipulation that combines table lookup operations and syntactic manipulations. We then present a synthesis algorithm that can learn all transformations in the language that are consistent with the user-provided set of input-output examples. We have implemented this technology as an add-in for the Microsoft Excel spreadsheet system and have evaluated it successfully over several benchmarks picked from various Excel help forums.
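
To make the idea of learning from examples concrete, the following is a minimal sketch and not the algorithm from the report. It assumes a hypothetical product table (the rows, the column names, and the helper names `lookup`, `consistent_programs`, and `run` are all invented for illustration) and a toy language with only one form of program: look up the input in one column, return the value from another column, and append a constant suffix. It enumerates every such program that agrees with all the input-output examples.

```python
from itertools import permutations

# Hypothetical lookup table; rows and column names are invented for illustration.
TABLE = [
    {"id": "S30", "name": "Stroller", "price": "$145.67"},
    {"id": "B56", "name": "Bib", "price": "$3.56"},
    {"id": "D32", "name": "Diapers", "price": "$21.45"},
]
COLUMNS = ["id", "name", "price"]


def lookup(key_col, out_col, value):
    """Return out_col of the first row whose key_col equals value, else None."""
    for row in TABLE:
        if row[key_col] == value:
            return row[out_col]
    return None


def consistent_programs(examples):
    """Enumerate (key_col, out_col, suffix) programs that match every example.

    A program means: look up the input in key_col, take out_col from that row
    (the semantic step), then append a constant suffix (the syntactic step).
    """
    programs = []
    for key_col, out_col in permutations(COLUMNS, 2):
        suffixes = set()
        matches_all = True
        for inp, out in examples:
            looked_up = lookup(key_col, out_col, inp)
            if looked_up is None or not out.startswith(looked_up):
                matches_all = False
                break
            suffixes.add(out[len(looked_up):])
        # Keep the program only if one constant suffix explains all examples.
        if matches_all and len(suffixes) == 1:
            programs.append((key_col, out_col, suffixes.pop()))
    return programs


def run(program, value):
    """Apply a learned program to a new input string."""
    key_col, out_col, suffix = program
    looked_up = lookup(key_col, out_col, value)
    return None if looked_up is None else looked_up + suffix


examples = [("Stroller", "$145.67 USD"), ("Bib", "$3.56 USD")]
programs = consistent_programs(examples)
print(programs)                      # [('name', 'price', ' USD')]
print(run(programs[0], "Diapers"))   # $21.45 USD
```

Brute-force enumeration is used here only because the toy language is tiny; it is a stand-in for, not a description of, the synthesis algorithm the report presents.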

Details

Publication type: TechReport
Number: MSR-TR-2012-5
Publisher: Microsoft Research