Representation Dependence Testing Using Program Inversion

Foundations of Software Engineering (FSE) |

Published by Association for Computing Machinery, Inc.

The definition of a data structure may permit many different concrete representations of the same logical content. A (client) program that accepts such a data structure as input is said to have a representation dependence if its behavior differs for logically equivalent input values. In this paper, we present a methodology and tool for automated testing of clients of a data structure for representation dependence. In the proposed methodology, the developer expresses the logical equivalence by writing a normalization program f that maps each concrete representation to a canonical one. Our solution relies on automatically synthesizing the one-to-many inverse function of f: given an input value x, we can generate multiple test inputs logically equivalent to x by executing the inverse with the canonical value f(x) as input repeatedly. We present an inversion algorithm for restricted classes of normalization programs including programs mapping arrays to arrays in a typical iterative manner. We present a prototype implementation of the algorithm, and demonstrate how our methodology reveals bugs due to representation dependence in open source software such as Open Office and Picasa using the widely used image format Tiff. Tiff is a challenging case study for our approach.