Determinize, Solve, and Generalize: Classical Planning for MDP Heuristics

ICAPS 2009 Workshop on Heuristics for Domain Independent Planning

Published by AAAI Press

Heuristics make MDP solvers practical by reducing their space and memory requirements. Some of the most effective heuristics (e.g., the FF heuristic) first determinize the MDP to a classical approximation and then solve a relaxation of the resulting classical problem (e.g., one which ignores the actions’ delete effects). While these heuristics can be computed quite quickly, they frequently yield overly optimistic value estimates. This paper proposes a novel class of heuristics, called THUDS, which improve on the existing methods by using full-fledged classical planners to solve the non-relaxed determinizations. THUDS produces more informative state value estimates than those given by the FF heuristic, causing many fewer states to be explored. Of course, invoking a deterministic planner can be very slow; to overcome this high cost, THUDS generalizes the heuristic value of one state to many others by extracting basis functions from the plans discovered in the process of heuristic computation. Thus, the classical planner is only called for states without basis functions, amortizing its costly invocation. Experiments show that THUDS can provide large time and memory savings compared to the FF heuristic and that generalization is vital in making THUDS computationally feasible.
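The amortization idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify these details, so everything below is an assumption made for illustration: states and conditions are sets of literals, a basis function is a conjunction of literals obtained by regressing the goal through a plan, its weight is the cost of the remaining plan suffix, and a state's heuristic value is the smallest weight among the basis functions that hold in it. The names `Action`, `regress`, `GeneralizingHeuristic`, and the `planner` callable are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, NamedTuple, Optional

Literals = FrozenSet[str]


class Action(NamedTuple):
    """A deterministic (STRIPS-style) action; a hypothetical representation."""
    name: str
    pre: Literals
    add: Literals
    delete: Literals
    cost: float = 1.0


# Hypothetical planner interface: maps a state to a plan, or None if none is found.
Planner = Callable[[Literals], Optional[List[Action]]]


def regress(goal: Literals, plan: List[Action]) -> Dict[Literals, float]:
    """Regress the goal backwards through the plan.

    Returns a map from each regressed condition (a candidate basis function)
    to the cost of the plan suffix that reaches the goal from it.
    """
    basis: Dict[Literals, float] = {goal: 0.0}
    cond, cost = goal, 0.0
    for action in reversed(plan):
        # Standard regression: drop what the action adds, require its preconditions.
        cond = (cond - action.add) | action.pre
        cost += action.cost
        basis[cond] = min(cost, basis.get(cond, float("inf")))
    return basis


class GeneralizingHeuristic:
    """Calls the planner only for states that no known basis function covers."""

    def __init__(self, goal: Literals, planner: Planner) -> None:
        self.goal = goal
        self.planner = planner
        self.basis: Dict[Literals, float] = {goal: 0.0}
        self.planner_calls = 0

    def value(self, state: Literals) -> float:
        # Assumed aggregation rule: minimum weight over basis functions holding in the state.
        matches = [w for cond, w in self.basis.items() if cond <= state]
        if matches:
            return min(matches)
        # No basis function holds: pay for one planner call and generalize its plan.
        self.planner_calls += 1
        plan = self.planner(state)
        if plan is None:
            return float("inf")
        for cond, w in regress(self.goal, plan).items():
            self.basis[cond] = min(w, self.basis.get(cond, float("inf")))
        return sum(a.cost for a in plan)
```

In this sketch, a single planner call for one state yields a basis function for every step of the returned plan, so any later state that satisfies one of those conjunctions receives a value without invoking the planner again. The `planner_calls` counter makes that saving observable.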