Explore-Exploit Learning

This is an umbrella project for our work in machine learning with an exploration-exploitation tradeoff: the tension between acquiring new information (exploration) and capitalizing on the information available so far (exploitation). Such problems have been studied extensively in machine learning, theoretical computer science, operations research, and economics. This mature, yet very active, research area is known under several names: "multi-armed bandits", "contextual bandits", "associative reinforcement learning", and "counterfactual evaluation", among others.

Most of us are located in Microsoft Research New York City.

We are involved in Multi-World Testing: an approach and system for contextual bandit learning.
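A core ingredient of this style of contextual bandit learning is counterfactual (offline) evaluation: estimating how well a new policy *would have* performed using only logged exploration data. A standard estimator for this is inverse propensity scoring (IPS), sketched below under the assumption that each log entry records the context, the action taken, the observed reward, and the probability with which the action was chosen; the log data and policy here are hypothetical.

```python
def ips_value(logs, policy):
    """Inverse propensity scoring: unbiased offline estimate of a target
    policy's average reward from logged (context, action, reward, propensity)
    tuples, where propensity > 0 for every logged action."""
    total = 0.0
    for context, action, reward, propensity in logs:
        if policy(context) == action:
            # Reweight rewards where the target policy agrees with the log.
            total += reward / propensity
    return total / len(logs)

# Toy log: two contexts, actions chosen uniformly at random (propensity 0.5).
logs = [(0, 0, 1.0, 0.5), (0, 1, 0.0, 0.5),
        (1, 0, 0.0, 0.5), (1, 1, 1.0, 0.5)]
# Hypothetical target policy: in context c, play action c.
value = ips_value(logs, policy=lambda c: c)
```

Dividing by the logged propensity corrects for the fact that the logging policy and the target policy choose actions with different probabilities, which is what lets one log serve to evaluate many candidate policies ("multiple worlds") at once.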

Related project: Bandits @MSR-SVC [inactive].