Long Tran-Thanh, Lampros Stavrogiannis, Victor Naroditskiy, Valentin Robu, Nicholas R Jennings, and Peter Key
We study the problem of an advertising agent who needs to intelligently distribute her budget across a sequence of online keyword bidding auctions. We assume that the closing price of each auction is governed by the same unknown distribution, and study the problem of making provably optimal bidding decisions. The distribution is learned under censored observations, i.e. the closing price of an auction is revealed only if the bid we place is above it. We consider three algorithms, namely ε-First, Greedy Product-Limit (GPL) and LuekerLearn, and we show that these algorithms provably achieve Hannan-consistency. In particular, we show that the regret of ε-First is at most $O(T^{2/3})$ with high probability. For the other two algorithms, we first prove that, by using a censored data distribution estimator proposed by Zeng, the empirical distribution of the closing market price converges in probability to its true distribution at an $O(1/\sqrt{t})$ rate, where $t$ is the number of updates. Based on this result, we prove that both GPL and LuekerLearn achieve an $O(\sqrt{T})$ regret bound with high probability. This provides an affirmative answer to a research question raised in earlier work. We also evaluate these algorithms using real bidding data, and show that although GPL achieves the best performance on average (up to 90% of the optimal solution), its long running time may limit its suitability in practice. By contrast, LuekerLearn and ε-First, as proposed in this paper, achieve up to 85% of the optimal, but with an exponential reduction in computational complexity (a saving of up to 95% compared to GPL).
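To make the censored-feedback setting concrete, here is a minimal sketch of a textbook product-limit (Kaplan-Meier) estimate of the closing-price distribution. It is an illustration only: it is neither the estimator of Zeng used in the analysis nor the GPL algorithm itself. The function name `product_limit_survival` and the `(value, observed)` data layout are assumptions introduced for this example. A won auction reveals the closing price; a lost auction reveals only that the price exceeded our bid.

```python
from collections import Counter


def product_limit_survival(observations):
    """Textbook product-limit estimate of S(p) = P(closing price > p).

    observations: list of (value, observed) pairs (hypothetical layout).
      observed=True  -> value is the revealed closing price (we won).
      observed=False -> value is our losing bid; the price is only
                        known to be larger than it (censored).
    Returns a list of (price, S(price)) at each distinct revealed price.
    """
    # Count how many revealed closing prices fall at each distinct value.
    events = Counter(v for v, observed in observations if observed)

    survival = 1.0
    curve = []
    for price in sorted(events):
        # An auction is "at risk" at this price if its closing price is
        # known to be at least this large (revealed or censored).
        at_risk = sum(1 for v, _ in observations if v >= price)
        survival *= 1.0 - events[price] / at_risk
        curve.append((price, survival))
    return curve


# Four auctions: three wins (prices 1, 3, 4) and one loss with bid 2.
curve = product_limit_survival([(1, True), (2, False), (3, True), (4, True)])
for price, s in curve:
    print(price, s)
```

The censored auction still contributes information: it counts as at risk for the price 1 but drops out before the price 3, which is why the estimate differs from simply discarding lost auctions.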
Published in: UAI 2014, 30th Conf. on Uncertainty in AI