Several popular machine learning problems including neural network training lead to non-convex optimization problems. In general, non-convex optimization is NP-hard and in theory, standard optimization approaches can lead to arbitrarily poor solutions. However, in practice, there are several success stories for non-convex optimization with deep learning shining the brightest. Owing to this practical success, there are several recent theoretical advances that try to explain and understand this phenomenon.
In this tutorial, we will discuss some of these recent advances. In particular, we will divide the tutorial in two parts: a) in first part, we will discuss results that show how non-convex optimization problems can be solved "optimally" when the data is "nice" and is sampled from a parametric model, b) in second part, we will discuss results that consider general non-convex optimization problem and show how we can avoid poor saddle points of the problem. As the field is still in a nascent stage, the tutorial will be peppered with several fundamental open questions.
Microsoft Research, Bangalore, India