Gradient descent is used to solve a myriad of machine learning problems - fast SVMs, regularized regression, reduced-rank matrix approximation, deep belief networks, and so on. Nesterov's method can speed up convergence for any of these, and it's simple to apply. We'll run through some simple code that illustrates why gradient descent can sometimes be slow, then demonstrates the speedup from Nesterov's method. Warning - this will be a little nerdy, but extremely useful if your algorithm is too slow or won't converge.
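Here's a minimal sketch of that comparison in Python (assuming numpy). It minimizes a deliberately ill-conditioned quadratic, where plain gradient descent crawls along the flat direction. Since the strong-convexity constant is known for this toy problem, the sketch uses the constant-momentum form of Nesterov's method; all the names here (`A`, `mu`, `L`, `step`, `nesterov`) are our own for illustration, not from any library.

```python
import numpy as np

# Toy ill-conditioned quadratic: f(x) = 0.5 * x^T A x, minimized at x = 0.
A = np.diag([1.0, 100.0])   # eigenvalues mu = 1, L = 100 -> condition number 100
mu, L = 1.0, 100.0
step = 1.0 / L              # standard 1/L step size
grad = lambda x: A @ x

def gradient_descent(x0, tol=1e-6, max_iter=10000):
    """Plain gradient descent; slow along the flat (mu = 1) direction."""
    x = x0.copy()
    for k in range(max_iter):
        if np.linalg.norm(grad(x)) < tol:
            return x, k
        x = x - step * grad(x)
    return x, max_iter

def nesterov(x0, tol=1e-6, max_iter=10000):
    """Nesterov's accelerated gradient with the constant momentum
    beta = (sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)), the variant
    used when the strong-convexity constant mu is known."""
    beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(max_iter):
        if np.linalg.norm(grad(x)) < tol:
            return x, k
        y = x + beta * (x - x_prev)          # look-ahead (momentum) point
        x_prev, x = x, y - step * grad(y)    # gradient step taken from y
    return x, max_iter

x0 = np.array([1.0, 1.0])
x_gd, iters_gd = gradient_descent(x0)
x_nes, iters_nes = nesterov(x0)
print(f"plain gradient descent: {iters_gd} iterations")
print(f"Nesterov's method:      {iters_nes} iterations")
```

The slowdown in plain gradient descent is exactly the condition number at work: the step size is capped by the steepest curvature (1/L), so progress along the flattest direction shrinks the error only by a factor of roughly (1 - mu/L) per iteration, while the accelerated method improves that to roughly (1 - sqrt(mu/L)).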
References:

- Princeton lecture notes: main theorem
- Princeton lecture notes: background and the needed lemma
- Derivation for a more general function: Tseng (2008), "On Accelerated Proximal Gradient Methods for Convex-Concave Optimization"
- In slide form: UCLA EE236, "7. Fast proximal gradient methods"