Share this page
Share this page E-mail this page Print this page RSS feeds
Home > Publications > Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization

We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term. Traditional online algorithms, such as stochastic gradient descent, has limited capability of exploiting the problem structure, and their inherent low accuracy often makes it hard to obtain the desired regularization effects, e.g., sparsity under $\ell_1$-regularization. In this paper, we develop extensions of Nesterov's dual averaging method, that can explicitly exploit the regularization structure in an online setting. For general convex regularizations, we show that the regularized dual averaging method achieves the optimal convergence rate $O(1/\sqrt{t})$, where $t$ is the number of iterations or samples in an online algorithm. For strongly convex regularizations, we develop a variant that has a faster convergence rate $O(\ln t/t)$. Computational experiments are presented for the special case of $\ell_1$-regularized online learning.

msr_tr_2009_100.pdf
PDF file

Details

Type: TechReport
Number: MSR-TR-2009-100