Brian Guenter, Dong Yu, Adam Eversole, Oleksii Kuchaiev, and Michael L. Seltzer
We introduce the stochastic gradient descent algorithm used in the Computational Network Toolkit (CNTK), a general-purpose machine learning toolkit written in C++ for training and using models that can be expressed as a computational network. We describe the algorithm used to compute the gradients automatically for a given network. We also propose a low-cost automatic learning rate selection algorithm and demonstrate that it works well in practice.
In OPT2013: NIPS Workshop on Optimization for Machine Learning