Beyond Newton's method

Thomas P. Minka
(2000, revised 2013)

Newton's method for optimization is equivalent to iteratively maximizing a local quadratic approximation to the objective function. But some functions are not well-approximated by a quadratic, leading to slow convergence, and some have inflection points where the curvature changes sign, leading to failure. To fix this, we can replace the quadratic with a local approximation better suited to the type of function being optimized. This paper demonstrates three such generalized Newton rules. Like Newton's method, they involve only the first two derivatives of the function, yet they converge faster and fail less often.
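
As a rough illustration of the idea (a minimal sketch, not one of the paper's three rules verbatim), the Python snippet below contrasts a standard Newton step with a generalized step that matches the first two derivatives to a log-plus-linear model m(t) = k + b log(t) + c t and jumps to that model's maximizer. The function names and the test objective f(x) = log(x) - x are hypothetical choices made for this example.

    def newton_step(x, f1, f2):
        # Standard Newton: maximize the local quadratic model,
        # giving the update x - f'(x)/f''(x).
        return x - f1(x) / f2(x)

    def log_linear_step(x, f1, f2):
        # Generalized Newton: fit m(t) = k + b*log(t) + c*t by matching
        # m' and m'' to f' and f'' at x, then move to m's maximizer.
        b = -f2(x) * x**2   # from m''(x) = -b/x^2 = f''(x)
        c = f1(x) - b / x   # from m'(x)  = b/x + c = f'(x)
        return -b / c       # m'(t) = 0 at t = -b/c (needs b > 0, c < 0)

    # Hypothetical test objective: f(x) = log(x) - x, maximized at x = 1.
    f1 = lambda x: 1.0 / x - 1.0   # f'(x)
    f2 = lambda x: -1.0 / x**2     # f''(x)

    x0 = 2.5
    print(newton_step(x0, f1, f2))      # -1.25: steps outside the domain x > 0
    print(log_linear_step(x0, f1, f2))  # 1.0: exact here, since f is log-plus-linear

Because this test objective happens to lie in the model family, the generalized step is exact in one iteration, while the quadratic model overshoots into negative x; that overshoot is the kind of failure the generalized rules are meant to avoid.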

PDF (color, 88K)