Earl T. Barr, Christian Bird, and Mark Marron
The program heap is fundamentally a simple mathematical concept - a set of objects and a connectivity relation on them. However, a large gap exists between the set of heap structures that could be constructed and those that programmers actually build. To understand this gap, we empirically study heap structures and sharing relations in large object-oriented programs. To scale and make sense of real world heaps, any analysis must employ abstraction; our abstraction groups sets of objects by role and the aliasing present in pointer sets. We fifind that the heaps of real-world programs are, in practice, fundamentally simple structures that are largely constructed from a small number of simple structures and sharing idioms, such as the sharing of immutable or unique (e.g. singleton) objects. For instance, we find that, under our abstraction, 53-75% of pointers build tree structures and we classify all but 7-18% of aliasing pointers. These results provide actionable information for rethinking the design of annotation systems, memory allocation/collection, and program analyses.
In Proceedings of the International Symposium on Software Testing and Analysis, 2013
Earl T. Barr, Christian Bird, and mark marron. Collecting a Heap of Shapes, , November 2011.