Partially Observable Markov Decision Process Representation
[Figure: POMDP diagram unrolled over time slices. Node labels: State t, State t+1, State t+2; Action(t), Action(t+1); Observe t, Observe t+1; Reward t, Reward (t+1).]
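The node labels in the diagram (state, action, observation, reward at successive time steps) match the usual POMDP setting, in which the state is hidden and the agent tracks a belief over it. The following is a minimal sketch of one belief-update step under that reading. The two-state model, the names `T`, `O`, and `belief_update`, and all probabilities are illustrative assumptions, not taken from the slide.

```python
def belief_update(belief, T, O, action, observation):
    """One Bayesian belief-update step for a discrete POMDP (illustrative sketch).

    belief[s]              : current probability of hidden state s
    T[action][s][s2]       : assumed probability of moving from s to s2
    O[action][s2][obs]     : assumed probability of seeing obs in state s2
    """
    n = len(belief)
    # Prediction: push the belief through the transition model.
    predicted = [
        sum(belief[s] * T[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]


# Hypothetical two-state, one-action, two-observation model (made-up numbers).
T = {0: [[0.9, 0.1], [0.2, 0.8]]}
O = {0: [[0.8, 0.2], [0.3, 0.7]]}

new_belief = belief_update([0.5, 0.5], T, O, action=0, observation=0)
print(new_belief)
```

Starting from a uniform belief, observing the outcome that is more likely in state 0 shifts probability mass toward state 0. Rewards do not enter the update; they would matter only when choosing the action.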
Eric Horvitz, April 5, 2003