Proving Programs Robust

Abstract

We present a program analysis for verifying quantitative robustness properties of programs, stated generally as: “If the inputs of a program are perturbed by an arbitrary amount \epsilon, then its outputs change at most by K \epsilon, where K can depend on the size of the input but not its value.” Robustness properties generalize the analytic notion of continuity—e.g., while the function ex is continuous, it is not robust. Our problem is to verify the robustness of a function P that is coded as an imperative program, and can use diverse data types and features such as branches and loops.

Our approach to the problem soundly decomposes it into two subproblems: (a) verifying that the smallest possible perturbations to the inputs of P do not change the corresponding outputs significantly, even if control now flows along a different control path; and (b) verifying the robustness of the computation along each control-flow path of P. To solve the former subproblem, we build on an existing method for verifying that a program encodes a continuous function [7]. The latter is solved using a static analysis that bounds the magnitude of the slope of any function computed by a control flow path of P. The outcome is a sound program analysis for robustness that uses proof obligations which do not refer to \epsilon-changes and can often be fully automated using off-the-shelf SMT-solvers.

We identify three application domains for our analysis. First, our analysis can be used to guarantee the predictable execution of embedded control software, whose inputs come from physical sources and can suffer from error and uncertainty. A guarantee of robustness ensures that the system does not react disproportionately to such uncertainty. Second, our analysis is directly applicable to approximate computation, and can be used to provide foundations for a recently-proposed program approximation scheme called loop perforation. A third application is in database privacy: proofs of robustness of queries are essential to differential privacy, the most popular notion of privacy for statistical databases.