Bandit convex optimization is a special case of online convex optimization
with partial information. In this setting, a player attempts to minimize a
sequence of adversarially generated convex loss functions, while only observing
the value of each function at a single point. In some cases, the minimax regret
of these problems is known to be strictly worse than the minimax regret in the
corresponding full information setting. We introduce the multi-point bandit
setting, in which the player can query each loss function at multiple points.
When the player is allowed to query each function at two points, we prove regret
bounds that closely resemble bounds for the full information case. This suggests
that knowing the value of each loss function at two points is almost as useful as
knowing the value of each function everywhere. When the player is allowed to
query each function at *d+1* points (*d* being the dimension of the
space), we prove regret bounds that are exactly equivalent to full information
bounds for smooth functions.