Static Analysis of Heap-Manipulating Low-Level Software

MSR-TR-2006-160 |

This paper describes a static (intraprocedural) analysis for analyzing heap-manipulating programs (in presence of recursive data structures and pointer arithmetic) in languages like C or low-level code. This analysis can be used for checking memory-safety, memory leaks, and user specified assertions. We first propose a rich abstract domain for representing useful invariants about such programs. This abstract domain allows representation of must and may equalities among pointer expressions. The integer variables used in the pointer expressions can be existentially as well as universally quantified and can have constraints over some base domain. We allow quantification of a special form, namely ∃ ∀ quantification. This choice has been made to balance expressiveness with efficient automated deduction. The existential quantification is over some ghost variables of programs, which are automatically made explicit by our analysis to express useful program invariants. The universal quantifier is used to express properties of collections of memory locations. We then show how to perform sound abstract interpretation over this abstract domain. We give transfer functions for performing join, meet, and postcondition operations over this abstract domain. The basis of all these operations is an abstract interpreter for the quantifier-free base constraint domain (eg., the conjunctive domain of linear arithmetic combined with uninterpreted functions). To our knowledge, this is the first abstract interpreter that can automatically deduce first-order logic invariants in programs (without requiring any explicit predicates). We also present initial experimental results demonstrating the effectiveness of our ideas on some common coding patterns.