Static Analysis of Heap-Manipulating Low-Level Software
Abstract
This paper describes a static (intraprocedural) analysis for analyzing
heap-manipulating programs (in presence of recursive data structures
and pointer arithmetic) in languages like C or low-level code. This
analysis can be used for checking memory-safety, memory leaks,
and user specified assertions.
We first propose a rich abstract domain for representing useful invariants about
such programs. This abstract domain allows representation of must and may
equalities among pointer expressions. The integer variables used in the
pointer expressions can be existentially as well as universally
quantified and can have constraints over some base domain.
We allow quantification of a special form,
namely $\exists \forall$ quantification.
This choice has been made to balance expressiveness with
efficient automated deduction.
The existential quantification is over some ghost variables of
programs, which are automatically made explicit by our analysis
to express useful program invariants.
The universal quantifier is used to express properties of collections of
memory locations.
We then show how to perform sound abstract interpretation over this abstract
domain. We give transfer functions for performing join, meet, and
postcondition operations over this abstract domain.
The basis of all these operations is an
abstract interpreter for the quantifier-free base constraint domain
(eg., the conjunctive domain of linear arithmetic combined with
uninterpreted functions). To our knowledge, this is the
first abstract interpreter that can automatically deduce first-order logic
invariants in programs (without requiring any explicit predicates).
We also present initial experimental results
demonstrating the effectiveness of our ideas
on some common coding patterns.