When Can We Trust Progress Estimators for SQL Queries?

SIGMOD

Published by Association for Computing Machinery, Inc.

The problem of estimating progress for long-running queries
has recently been introduced. We analyze the characteristics
of the progress estimation problem, from the perspective of
providing robust, worst-case guarantees. Our first result
is that in the worst case, no progress estimation algorithm
can yield anything even moderately better than the trivial
guarantee that identifies the progress as lying between 0%
and 100%. For such cases, we introduce an estimator that
bounds the error optimally. By placing different types of
restrictions on the data and query characteristics, we show
that it is possible to design effective progress estimators with
small error bounds. We show where previous solutions lie
in this spectrum. We then demonstrate empirically that
these “good” scenarios are common in practice and discuss
possible ways of combining the estimators.