Theses Doctoral

When Can Nonconvex Optimization Problems be Solved with Gradient Descent? A Few Case Studies

Gilboa, Dar

Gradient descent and related algorithms are ubiquitously used to solve optimization problems arising in machine learning and signal processing. In many cases, these problems are nonconvex yet such simple algorithms are still effective. In an attempt to better understand this phenomenon, we study a number of nonconvex problems, proving that they can be solved efficiently with gradient descent. We will consider complete, orthogonal dictionary learning, and present a geometric analysis allowing us to obtain efficient convergence rate for gradient descent that hold with high probability. We also show that similar geometric structure is present in other nonconvex problems such as generalized phase retrieval.

Turning next to neural networks, we will also calculate conditions on certain classes of networks under which signals and gradients propagate through the network in a stable manner during the initial stages of training. Initialization schemes derived using these calculations allow training recurrent networks on long sequence tasks, and in the case of networks with low precision activation functions they make explicit a tradeoff between the reduction in precision and the maximal depth of a model that can be trained with gradient descent.

We finally consider manifold classification with a deep feed-forward neural network, for a particularly simple configuration of the manifolds. We provide an end-to-end analysis of the training process, proving that under certain conditions on the architectural hyperparameters of the network, it can successfully classify any point on the manifolds with high probability given a sufficient number of independent samples from the manifold, in a timely manner. Our analysis relates the depth and width of the network to its fitting capacity and statistical regularity respectively in early stages of training.


  • thumnail for Gilboa_columbia_0054D_16160.pdf Gilboa_columbia_0054D_16160.pdf application/pdf 872 KB Download File

More About This Work

Academic Units
Neurobiology and Behavior
Thesis Advisors
Wright, John N.
Ph.D., Columbia University
Published Here
September 21, 2020