First Order Methods for Large-Scale Sparse Optimization
Necdet Serhat Aybat
- First Order Methods for Large-Scale Sparse Optimization
- Aybat, Necdet Serhat
- Thesis Advisor(s):
- Iyengar, Garud N.
- Industrial Engineering and Operations Research
- Permanent URL:
- Ph.D., Columbia University.
- In today's digital world, improvements in acquisition and storage technology are allowing us to acquire more accurate and finer application-specific data, whether it be tick-by-tick price data from the stock market or frame-by-frame high resolution images and videos from surveillance systems, remote sensing satellites and biomedical imaging systems. Many important large-scale applications can be modeled as optimization problems with millions of decision variables. Very often, the desired solution is sparse in some form, either because the optimal solution is indeed sparse, or because a sparse solution has some desirable properties. Sparse and low-rank solutions to large scale optimization problems are typically obtained by regularizing the objective function with L1 and nuclear norms, respectively. Practical instances of these problems are very high dimensional (~ million variables) and typically have dense and ill-conditioned data matrices. Therefore, interior point based methods are ill-suited for solving these problems. The large scale of these problems forces one to use the so-called first-order methods that only use gradient information at each iterate. These methods are efficient for problems with a "simple" feasible set such that Euclidean projections onto the set can be computed very efficiently, e.g. the positive orthant, the n-dimensional hypercube, the simplex, and the Euclidean ball. When the feasible set is "simple", the subproblems used to compute the iterates can be solved efficiently. Unfortunately, most applications do not have "simple" feasible sets. A commonly used technique to handle general constraints is to relax them so that the resulting problem has only "simple" constraints, and then to solve a single penalty or Lagrangian problem. However, these methods generally do not guarantee convergence to feasibility. The focus of this thesis is on developing new fast first-order iterative algorithms for computing sparse and low-rank solutions to large-scale optimization problems with very mild restrictions on the feasible set - we allow linear equalities, norm-ball and conic inequalities, and also certain non-smooth convex inequalities to define the constraint set. The proposed algorithms guarantee that the sequence of iterates converges to an optimal feasible solution of the original problem, and each subproblem is an optimization problem with a "simple" feasible set. In addition, for any eps > 0, by relaxing the feasibility requirement of each iteration, the proposed algorithms can compute an eps-optimal and eps-feasible solution within O(log(1/eps)) iterations which requires O(1/eps) basic operations in the worst case. Algorithm parameters do not depend on eps > 0. Thus, these new methods compute iterates arbitrarily close to feasibility and optimality as they continue to run. Moreover, the computational complexity of each basic operation for these new algorithms is the same as that of existing first-order algorithms running on "simple" feasible sets. Our numerical studies showed that only O(log(1/eps)) basic operations, as opposed to O(1/eps) worst case theoretical bound, are needed for obtaining eps-feasible and eps-optimal solutions. We have implemented these new first-order methods for the following problem classes: Basis Pursuit (BP) in compressed sensing, Matrix Rank Minimization, Principal Component Pursuit (PCP) and Stable Principal Component Pursuit (SPCP) in principal component analysis. These problems have applications in signal and image processing, video surveillance, face recognition, latent semantic indexing, and ranking and collaborative filtering. To best of our knowledge, an algorithm for the SPCP problem that has O(1/eps) iteration complexity and has a per iteration complexity equal to that of a singular value decomposition is given for the first time.
- Item views: