Speaker: John Langford
Host: Francis Bach
Date recorded: 5 December 2013
Machine learning does magical things when it starts interacting with and changing the world, yet most algorithms are not designed to do this. Systematically gathering the right data is the first-order problem for learning with interaction. One simple example is ad recommendation, where a high recommendation implies high placement, which implies high click-through, which implies high recommendation, and so on, creating a self-fulfilling prophecy. This talk is about how to systematically avoid these problems by effectively (re)using randomization to engage in controlled exploration for learning algorithms. With these techniques, we can exponentially reduce the amount of exploration required, test many policies offline, and repurpose our existing learning algorithms to directly solve for optimal policies.
©2013 Microsoft Corporation. All rights reserved.