Synthesizing Number Transformations from Input-Output Examples
Abstract
Numbers are one of the most widely used data types in programming
languages. Number transformations like formatting and rounding present
a challenge even for experienced programmers, who find it difficult
to remember the number format strings supported by different
programming languages. These transformations present an even bigger
challenge for end-users of spreadsheet systems like Microsoft Excel,
where providing such custom format strings is beyond their
expertise. In an extensive case study of help forums for many
programming languages and Excel, we found that both programmers and
end-users struggle with these number transformations but can
easily express their intent using input-output examples.
In this paper, we present a framework that can learn such number
transformations from very few input-output examples. We first describe
an expressive number transformation language that can model these
transformations, and then present an inductive synthesis algorithm
that can learn all expressions in this language that are consistent
with a given set of examples. We also present a ranking scheme for
these expressions that enables efficient learning of the desired
transformation from very few examples. By combining our inductive
synthesis algorithm for number transformations with an inductive
synthesis algorithm for syntactic string transformations, we are able
to obtain an inductive synthesis algorithm for manipulating data types
that have numbers as a constituent sub-type, such as date, unit, and
time. We have implemented our algorithms as an Excel add-in and have
evaluated it successfully over several benchmarks obtained from the
help forums and the Excel product team.
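
To make the idea of learning a transformation from examples concrete, the following is a minimal sketch and not the language or algorithm of this paper. It assumes a hypothetical toy language with three parameters (a rounding step, a rounding mode, and a number of decimal places), finds every program consistent with the examples by brute-force enumeration, and ranks the survivors with an assumed preference for round-to-nearest. The names STEPS, MODES, PLACES, run, and synthesize are all illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_FLOOR, ROUND_CEILING
from itertools import product

# Hypothetical toy language (an assumption, not the paper's language):
# round the input to a multiple of `step` using `mode`, then print it
# with exactly `places` digits after the decimal point.
STEPS = [Decimal("1"), Decimal("0.5"), Decimal("0.1"), Decimal("0.01")]
MODES = {"nearest": ROUND_HALF_UP, "down": ROUND_FLOOR, "up": ROUND_CEILING}
PLACES = [0, 1, 2]


def run(program, text):
    """Apply one (step, mode, places) program to a numeric string."""
    step, mode, places = program
    value = Decimal(text)
    # Number of whole steps, rounded according to the chosen mode.
    count = (value / step).quantize(Decimal("1"), rounding=MODES[mode])
    return f"{count * step:.{places}f}"


def synthesize(examples):
    """Return all programs consistent with the examples, best ranked first."""
    consistent = [
        program
        for program in product(STEPS, MODES, PLACES)
        if all(run(program, inp) == out for inp, out in examples)
    ]
    # Assumed ranking heuristic: prefer modes listed earlier in MODES.
    mode_rank = list(MODES).index
    return sorted(consistent, key=lambda program: mode_rank(program[1]))


if __name__ == "__main__":
    one = synthesize([("3.14159", "3.14")])
    print(len(one), one[0][1])
    print(run(one[0], "2.71828"))
    two = synthesize([("3.14159", "3.14"), ("2.71828", "2.72")])
    print(len(two))
```

With the single example 3.14159 to 3.14, both round-to-nearest and round-down remain consistent, so ranking decides which program is applied to new inputs; adding the second example 2.71828 to 2.72 removes the ambiguity.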