Several popular machine learning problems, including neural network training, lead to non-convex optimization problems. In general, non-convex optimization is NP-hard, and in theory standard optimization approaches can lead to arbitrarily poor solutions. In practice, however, there are several success stories for non-convex optimization, with deep learning shining the brightest. Owing to this practical success, several recent theoretical advances try to explain and understand this phenomenon.
In this tutorial, we will discuss some of these recent advances. In particular, we will divide the tutorial into two parts: a) in the first part, we will discuss results that show how non-convex optimization problems can be solved "optimally" when the data is "nice" and is sampled from a parametric model; b) in the second part, we will discuss results that consider general non-convex optimization problems and show how we can avoid poor saddle points. As the field is still in a nascent stage, the tutorial will be peppered with several fundamental open questions.
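To give a flavor of the saddle-point results in the second part, here is a minimal sketch (not any specific algorithm from the tutorial) of perturbed gradient descent: plain gradient descent initialized exactly at a strict saddle point would stay stuck, but adding a small random perturbation whenever the gradient is tiny lets the iterate escape toward a local minimum. The toy objective and all parameter values below are illustrative choices, not from the tutorial.

```python
import numpy as np

def grad(w):
    # Gradient of the toy objective f(w) = (w[0]**2 - 1)**2 + w[1]**2,
    # which has a strict saddle at (0, 0) and minima at (+-1, 0).
    return np.array([4.0 * w[0] * (w[0] ** 2 - 1.0), 2.0 * w[1]])

def perturbed_gd(w, lr=0.05, noise=1e-3, steps=2000, seed=0):
    # Gradient descent that injects small Gaussian noise whenever the
    # gradient is nearly zero, so the iterate can escape strict saddles.
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        g = grad(w)
        if np.linalg.norm(g) < noise:
            w = w + rng.normal(scale=noise, size=w.shape)
        else:
            w = w - lr * g
    return w

w0 = np.zeros(2)          # exact saddle point: the gradient vanishes here
w = perturbed_gd(w0)      # ends up near one of the minima (+-1, 0)
```

The negative-curvature direction (here, the first coordinate) amplifies any small perturbation exponentially, which is the intuition behind the polynomial-time escape guarantees discussed in the tutorial.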

Prateek Jain

Prateek Jain is a member of the Machine Learning and Optimization group and the Algorithms and Data Sciences group at Microsoft Research, Bangalore, India, where he has been since January 2010. Prateek's research interests are in machine learning, large-scale non-convex optimization, and statistical learning theory. He is also interested in applications of machine learning to privacy, computer vision, text mining, and natural language processing. He completed his PhD in December 2009 at the University of Texas at Austin under Prof. Inderjit S. Dhillon, and his BTech in Computer Science from IIT Kanpur in 2004.