Towards Stochastic Conjugate Gradient Methods

Nicol N. Schraudolph and Thore Graepel


The method of conjugate gradients provides a very effective way to optimize large, deterministic systems by gradient descent. In its standard form, however, it is not amenable to stochastic approximation of the gradient. Here we explore a number of ways to adopt ideas from conjugate gradient in the stochastic setting, using fast Hessian-vector products to obtain curvature information cheaply. In our benchmark experiments the resulting highly scalable algorithms converge about an order of magnitude faster than ordinary stochastic gradient descent.
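As a rough illustration of how curvature can enter an optimizer only through Hessian-vector products, the sketch below runs classical (deterministic) conjugate gradient on a toy quadratic. It is not the stochastic algorithms of the paper. The abstract does not say how the products are computed, so the central-difference approximation here is an assumption, and the names `hvp`, `conjugate_gradient`, `toy_grad`, `A`, and `b` are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def hvp(grad_fn, w, v, eps=1e-4):
    # Assumed approximation: H v ~ (g(w + eps*v) - g(w - eps*v)) / (2*eps).
    # It costs two gradient evaluations and never forms the Hessian.
    g_plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    g_minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]

def conjugate_gradient(grad_fn, w, steps):
    # Classical CG with exact gradients; curvature is used only via hvp().
    g = grad_fn(w)
    d = [-gi for gi in g]
    for _ in range(steps):
        if dot(g, g) < 1e-20:   # gradient is numerically zero: stop
            break
        Hd = hvp(grad_fn, w, d)
        dHd = dot(d, Hd)
        if dHd <= 0.0:          # no positive curvature along d: stop
            break
        alpha = -dot(g, d) / dHd            # exact line minimum on a quadratic
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g = grad_fn(w)
        beta = dot(g, Hd) / dHd             # makes the new direction H-conjugate to d
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return w

# Hypothetical toy problem: f(w) = 0.5 * w^T A w - b^T w, so grad f(w) = A w - b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def toy_grad(w):
    return [dot(row, w) - bi for row, bi in zip(A, b)]

w_min = conjugate_gradient(toy_grad, [0.0, 0.0], steps=2)
```

On this two-dimensional quadratic, two conjugate steps suffice to reach the minimizer of `f`, up to rounding. In the stochastic setting that the abstract targets, `grad_fn` would return a noisy estimate, and it is this noise that the standard form of the method does not tolerate.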


Publication type: Inproceedings
Published in: Proceedings of the 9th International Conference on Neural Information Processing, ICONIP 2002