Theses Doctoral

# A Geometric Approach to Dynamical System: Global Analysis for Non-Convex Optimization

Xu, Ji

Non-convex optimization often plays an important role in many machine learning problems. Study the existing algorithms that aim to solve the non-convex optimization problems can help us understand the optimization problem itself and may shed light on developing more effective algorithms or methods. In this thesis, we study two popular non-convex optimization problems along with two popular algorithms.

The first pair is maximum likelihood estimation with the expectation maximization algorithm. Expectation Maximization (EM) is among the most popular algorithms for estimating parameters of statistical models. However, EM, which is an iterative algorithm based on the maximum likelihood principle, is generally only guaranteed to find stationary points of the likelihood objective, and these points may be far from any maximizer. We address this disconnect between the statistical principles behind EM and its algorithmic properties.

Specifically, we provide a global analysis of EM for specific models in which the observations comprise an i.i.d.~sample from a mixture of two Gaussians. This is achieved by (i) studying the sequence of parameters from idealized execution of EM in the infinite sample limit, and fully characterizing the limit points of the sequence in terms of the initial parameters; and then (ii) based on this convergence analysis, establishing statistical consistency (or lack thereof) for the actual sequence of parameters produced by EM.

The second pair is phase retrieval problem with approximate message passing algorithm. Specifically, we consider an $\ell_2$-regularized non-convex optimization problem for recovering signals from their noisy phaseless observations. We design and study the performance of a message passing algorithm that aims to solve this optimization problem. We consider the asymptotic setting $m,n \rightarrow \infty$, $m/n \rightarrow \delta$ and obtain sharp performance bounds, where $m$ is the number of measurements and $n$ is the signal dimension. We show that for complex signals the algorithm can perform accurate recovery with only $m= \frac{64}{\pi^2}-4 \approx 2.5n$ measurements. The sharp analyses in this paper enable us to compare the performance of our method with other phase recovery schemes.

Finally, the convergence analysis of the iterative algorithms are done by a geometric approach to dynamical systems. By analyzing the movements from iteration to iteration, we provide a general tool that can show global convergence for many two dimensional dynamical systems. We hope this can shed light on convergence analysis for general dynamical systems.