Explore-Exploit Learning
This is an umbrella project for our work on machine learning with an exploration-exploitation trade-off: the trade-off between acquiring new information (exploration) and capitalizing on the information available so far (exploitation). Such problems have been studied extensively in machine learning, theoretical computer science, operations research, and economics. This mature yet very active research area is known under names such as "multi-armed bandits", "contextual bandits", "associative reinforcement learning", and "counterfactual evaluation".
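The trade-off above can be illustrated with the simplest bandit setting. The sketch below is illustrative only (it is not from this project): an epsilon-greedy strategy on a Bernoulli multi-armed bandit, where each step either explores a random arm or exploits the empirically best one. The arm means, epsilon, and horizon are assumed values chosen for the example.

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, horizon=10000, seed=0):
    """Epsilon-greedy on a Bernoulli multi-armed bandit.

    With probability epsilon, explore (pull a uniformly random arm);
    otherwise exploit (pull the arm with the highest empirical mean).
    Returns the average reward over the horizon.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k    # number of pulls per arm
    sums = [0.0] * k    # cumulative reward per arm
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: random arm
        else:
            # exploit: best empirical mean; untried arms get priority
            arm = max(range(k),
                      key=lambda a: sums[a] / counts[a] if counts[a] else float("inf"))
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return total_reward / horizon

# Example: three arms with (assumed) success probabilities 0.2, 0.5, 0.8.
avg = epsilon_greedy([0.2, 0.5, 0.8])
```

With a long enough horizon, the average reward approaches the best arm's mean, minus a loss of order epsilon paid for continued exploration.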

Most of us are located in Microsoft Research New York City.

Internal project page (Microsoft-only).
Related project: Bandits @MSR-SVC.

Publications