A Data Driven Approach for Algebraic Loop Invariants

MSR-TR-2012-97 |

We describe a new algorithm GUESS-AND-CHECK for computing algebraic equation invariants. These invariants are of the form wedge i fi(x1,…,xn =0,, where each fi is a polynomial over the variables x1, … , xn of the program. Two novel features of our algorithm are:- (1) it is data driven, that is, invariants are derived from data generated from concrete executions of the program, and (2) it is sound and complete, that is, the algorithm terminates with the correct invariant if it exists. The GUESS-AND-CHECK algorithm proceeds iteratively in two phases. The “guess” phase uses linear algebra techniques to efficiently derive a candidate invariant from data. This candidate invariant is subsequently validated in a “check” phase. If the check phase fails to validate the candidate invariant, then the proof of invalidity is used to generate more data and this is used to find better candidate invariants in a subsequent guess phase. Therefore, GUESS-AND-CHECK iteratively invokes the guess and check phases until it finds the desired invariant. In terms of implementation, off-the-shelf linear algebra solvers are used to implement the guess phase, and off-the-shelf decision procedures are used to implement the check phase.
We have evaluated our technique on a number of benchmark programs from recent papers on invariant generation. Our results are encouraging – we are able to efficiently compute algebraic invariants in all cases.