Learning mixtures of product distributions over discrete domains

Feldman, Jon; O'Donnell, Ryan; Servedio, Rocco Anthony

We consider the problem of learning mixtures of product distributions over discrete domains in the distribution learning framework introduced by Kearns et al. [18]. We give a poly(n/ε) time algorithm for learning a mixture of k arbitrary product distributions over the n-dimensional Boolean cube {0,1}n to accuracy ε, for any constant k. Previous polynomial time algorithms could only achieve this for k = 2 product distributions; our result answers an open question stated independently in [8] and [14]. We further give evidence that no polynomial time algorithm can succeed when k is superconstant, by reduction from a notorious open problem in PAC learning. Finally, we generalize our poly(n/ε) time algorithm to learn any mixture of k = O(1) product distributions over {0, 1, . . . , b}n, for any b = O(1).



More About This Work

Academic Units
Computer Science
Department of Computer Science, Columbia University
Columbia University Computer Science Technical Reports, CUCS-029-05
Published Here
April 21, 2011