Algorithms for Sparse Linear Classifiers in the Massive Data Setting

Balakrishnan, Suhrid; Madigan, David B.; Bartlett, Peter

Classifiers that favor sparse solutions, such as support vector machines, relevance vector machines, and LASSO-regression-based classifiers, provide competitive methods for classification problems in high dimensions. However, current algorithms for training sparse classifiers typically scale quite unfavorably with the number of training examples. This paper proposes online and multi-pass algorithms for training sparse linear classifiers on high-dimensional data. These algorithms have computational complexity and memory requirements that make learning on massive data sets feasible. The central idea that makes this possible is a straightforward quadratic approximation to the likelihood function.
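
As a rough illustration of how a quadratic approximation to the likelihood can keep memory independent of the number of training examples, the sketch below processes examples one at a time. It is not the authors' algorithm: the logistic likelihood, the L1 (LASSO-style) penalty, the coordinate-descent solver, and all names (`online_sparse_logistic`, `lasso_quadratic`, `lam`, `refit_every`) are assumptions chosen for illustration. Each example's log-likelihood is replaced by its second-order Taylor expansion around the current weights, so only a d-by-d matrix `A` and a length-d vector `b` need to be stored.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_threshold(v, lam):
    # Proximal operator of the L1 penalty for a scalar.
    return np.sign(v) * max(abs(v) - lam, 0.0)


def lasso_quadratic(A, b, lam, beta, n_sweeps=50):
    """Coordinate descent for: min 0.5*beta'A beta - b'beta + lam*||beta||_1."""
    beta = beta.copy()
    for _ in range(n_sweeps):
        for j in range(len(beta)):
            if A[j, j] <= 0.0:
                continue  # no curvature information for this coordinate yet
            # b_j minus the contribution of all other coordinates
            r = b[j] - A[j] @ beta + A[j, j] * beta[j]
            beta[j] = soft_threshold(r, lam) / A[j, j]
    return beta


def online_sparse_logistic(X, y, lam, refit_every=100):
    """Single pass over (X, y) with labels in {-1, +1} (illustrative only)."""
    n, d = X.shape
    A = np.zeros((d, d))  # accumulated curvature, O(d^2) memory
    b = np.zeros(d)       # accumulated linear term, O(d) memory
    beta = np.zeros(d)
    for i in range(n):
        x, yi = X[i], y[i]
        z0 = x @ beta                           # margin at the expansion point
        g = yi * sigmoid(-yi * z0)              # d/dz of the log-likelihood
        h = sigmoid(z0) * (1.0 - sigmoid(z0))   # minus d^2/dz^2 of the log-likelihood
        A += h * np.outer(x, x)
        b += (g + h * z0) * x
        if (i + 1) % refit_every == 0:
            beta = lasso_quadratic(A, b, lam, beta)
    return lasso_quadratic(A, b, lam, beta)
```

In this sketch the per-example cost is O(d^2) and nothing grows with the number of examples, which is the property the abstract highlights. A multi-pass variant could rerun the loop with `A` and `b` reset while keeping `beta` as the new expansion point; that too is an assumption rather than a description of the paper's method.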


Also Published In

Journal of Machine Learning Research

More About This Work

Academic Units
MIT Press
Published Here
May 15, 2014