Approximation Algorithms for Correlated Knapsacks and Non-Martingale Bandits

In the stochastic knapsack problem, we are given a knapsack of size B and a set of jobs whose sizes and rewards are drawn from known probability distributions. However, the only way to learn a job's actual size and reward is to schedule it: the values are revealed only when the job completes. How should we schedule jobs to maximize the expected total reward? Constant-factor approximations are known for this problem when we assume that rewards and sizes are independent random variables, and that we cannot prematurely cancel jobs after we schedule them. What can we say when either or both of these assumptions are dropped?
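To make the setup concrete, here is a minimal Monte Carlo sketch of the correlated, no-cancellation variant with a fixed (non-adaptive) schedule. The job distributions, budget, and names (`JOBS`, `sample`, `run_policy`) are hypothetical illustrations, not taken from the paper.

```python
import random

# Each job is a list of (probability, size, reward) outcomes, so size
# and reward may be correlated within a job. Data is hypothetical.
JOBS = [
    [(0.5, 2, 10), (0.5, 6, 1)],   # job 0: small-and-valuable or large-and-cheap
    [(0.8, 3, 4), (0.2, 3, 40)],   # job 1: fixed size, random reward
    [(0.3, 1, 2), (0.7, 5, 8)],    # job 2
]
B = 8  # knapsack budget

def sample(job):
    """Draw one (size, reward) outcome for a job."""
    r, acc = random.random(), 0.0
    for p, size, reward in job:
        acc += p
        if r <= acc:
            return size, reward
    return job[-1][1], job[-1][2]

def run_policy(order):
    """Schedule jobs in a fixed order; a job's reward counts only if it
    finishes within the remaining budget (no cancellation allowed)."""
    remaining, total = B, 0
    for j in order:
        size, reward = sample(JOBS[j])
        remaining -= size
        if remaining < 0:
            break  # job overran the budget; its reward is lost
        total += reward
    return total

def expected_reward(order, trials=100_000):
    return sum(run_policy(order) for _ in range(trials)) / trials

if __name__ == "__main__":
    for order in [(0, 1, 2), (2, 1, 0)]:
        print(order, round(expected_reward(order), 2))
```

Note that an optimal policy may be adaptive, choosing the next job based on the sizes observed so far; this sketch only estimates the value of fixed orderings.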

Not only is the stochastic knapsack problem of interest in its own right, but techniques developed for it are applicable to other stochastic packing problems. Indeed, ideas for this problem have been useful for budgeted learning problems, where one is given several arms that evolve in a specified stochastic fashion with each pull, and the goal is to pull the arms a total of B times to maximize the reward obtained. Much recent work on this problem focuses on the case when the evolution of the arms follows a martingale, i.e., when the expected reward after the next pull equals the reward at the current state. However, what can we say when the rewards do not form a martingale? A small sketch of this condition follows.
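The martingale condition is easy to state computationally: from every non-terminal state, the expected reward of the next state must equal the reward of the current state. Below is a small sketch checking it on two toy arms; the states, transition probabilities, and rewards are invented for illustration and are not from the paper.

```python
# Each arm maps state -> (reward, [(next_state, probability), ...]).
martingale_arm = {
    "s0": (5.0, [("s1", 0.5), ("s2", 0.5)]),
    "s1": (8.0, []),   # terminal
    "s2": (2.0, []),   # terminal: 0.5*8 + 0.5*2 == 5 -> martingale holds
}

non_martingale_arm = {
    "s0": (5.0, [("s1", 0.5), ("s2", 0.5)]),
    "s1": (20.0, []),
    "s2": (2.0, []),   # 0.5*20 + 0.5*2 == 11 != 5 -> not a martingale
}

def is_martingale(arm, tol=1e-9):
    """Check that expected next-state reward equals current reward
    at every non-terminal state."""
    for state, (reward, transitions) in arm.items():
        if transitions:
            expected_next = sum(p * arm[nxt][0] for nxt, p in transitions)
            if abs(expected_next - reward) > tol:
                return False
    return True

print(is_martingale(martingale_arm))      # True
print(is_martingale(non_martingale_arm))  # False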

We give constant-factor approximation algorithms for the stochastic knapsack problem with correlations and cancellations, and, using similar ideas, for some budgeted learning problems where the martingale condition is not satisfied. Indeed, we show that previously proposed linear programming relaxations for these problems have large integrality gaps. We propose new time-indexed LP relaxations; using a decomposition and "shifting" approach, we convert these fractional solutions to distributions over strategies, and then use the LP values and the time-ordering information from these strategies to devise a randomized scheduling algorithm. We hope our LP formulation and decomposition methods may provide a new way to address other correlated bandit problems in more general settings.
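As a rough illustration of what a time-indexed relaxation looks like, here is a simplified sketch, not the paper's exact LP: variables x_{j,t} represent the (fractional) probability that job j is started at time t, rewards count only outcomes that finish within the budget, and a crude expected-size constraint stands in for the paper's capacity constraints. It assumes the PuLP library; the job data is the same hypothetical set as above.

```python
import pulp

JOBS = [  # each job: list of (probability, size, reward) outcomes
    [(0.5, 2, 10), (0.5, 6, 1)],
    [(0.8, 3, 4), (0.2, 3, 40)],
    [(0.3, 1, 2), (0.7, 5, 8)],
]
B = 8

def exp_reward_if_started_at(job, t):
    """Expected reward, counting only outcomes that complete by time B."""
    return sum(p * reward for p, size, reward in job if t + size <= B)

def exp_size(job):
    return sum(p * size for p, size, _ in job)

lp = pulp.LpProblem("time_indexed_knapsack", pulp.LpMaximize)
x = {(j, t): pulp.LpVariable(f"x_{j}_{t}", 0, 1)
     for j in range(len(JOBS)) for t in range(B)}

# Objective: expected reward collected from fractional starts.
lp += pulp.lpSum(exp_reward_if_started_at(JOBS[j], t) * x[j, t]
                 for j in range(len(JOBS)) for t in range(B))

# Each job is started (fractionally) at most once.
for j in range(len(JOBS)):
    lp += pulp.lpSum(x[j, t] for t in range(B)) <= 1

# Crude capacity constraint: total expected size of started jobs <= B.
lp += pulp.lpSum(exp_size(JOBS[j]) * x[j, t]
                 for j in range(len(JOBS)) for t in range(B)) <= B

lp.solve(pulp.PULP_CBC_CMD(msg=False))
for (j, t), var in sorted(x.items()):
    if var.value() and var.value() > 1e-6:
        print(f"start job {j} at time {t} with weight {var.value():.2f}")
```

Rounding such a fractional solution into an actual adaptive strategy is the heart of the decomposition-and-shifting approach described in the abstract; the LP above only shows the shape of the time-indexed variables and constraints.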

This is joint work with Anupam Gupta, Ravishankar Krishnaswamy, and Marco Molinaro, all from CMU. The paper is available at http://arxiv.org/abs/1102.3749.

Speaker Details

Professor R. Ravi is Carnegie Bosch Professor of Operations Research and Computer Science at Carnegie Mellon University. Ravi received his bachelor's degree from IIT Madras, and master's and doctoral degrees from Brown University, all in Computer Science. He has been at the Tepper School of Business since 1995, where he served as the Associate Dean for Intellectual Strategy from 2005 to 2008. Ravi's main research interests are in combinatorial optimization (particularly approximation algorithms), computational molecular biology, and electronic commerce. He currently serves on the editorial boards of Management Science and the ACM Transactions on Algorithms.

Speakers:
R. Ravi
Affiliation:
Tepper School of Business