Tree Dependent Identically Distributed Learning

Tony Jebara; Philip M. Long

Technical report
Columbia University Computer Science Technical Reports
Department of Computer Science and Center for Computational Learning Systems, Columbia University, New York
We view a dataset of points or samples as having an underlying, yet unspecified, tree structure and exploit this assumption in learning problems. Such a tree structure assumption is equivalent to treating a dataset as tree dependent identically distributed, or tdid, and preserves exchangeability. This extends traditional iid assumptions on data since each datum can be sampled sequentially after being conditioned on a parent. Instead of hypothesizing a single best tree structure, we infer a richer Bayesian posterior distribution over tree structures from a given dataset. We compute this posterior over (directed or undirected) trees via the Laplacian of conditional distributions between pairs of input data points. This posterior distribution is efficiently normalized by the Laplacian's determinant and also facilitates novel maximum likelihood estimators, efficient expectations, and other useful inference computations. In a classification setting, tdid assumptions yield a criterion that maximizes the determinant of a matrix of conditional distributions between pairs of input and output points. This leads to a novel classification algorithm we call the Maximum Determinant Machine. Unsupervised and supervised experiments are shown.
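The abstract's claim that the posterior over undirected trees is "efficiently normalized by the Laplacian's determinant" is an application of the weighted Matrix-Tree theorem. The following is a minimal sketch of that normalization step, not the authors' implementation: the weight matrix W is a hypothetical stand-in for the pairwise conditional distributions between data points, and any symmetric nonnegative weights illustrate the computation.

```python
import numpy as np

# Hypothetical pairwise weights standing in for the conditional
# distributions between pairs of data points (symmetric, no self-edges).
rng = np.random.default_rng(0)
n = 5
A = rng.random((n, n))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)

# Graph Laplacian: degree matrix minus weight matrix.
L = np.diag(W.sum(axis=1)) - W

# Weighted Matrix-Tree theorem: any cofactor of L (here, delete row and
# column 0 and take the determinant) equals the sum over all spanning
# trees of the product of their edge weights, i.e. the normalizer of
# the posterior over undirected trees.
Z = np.linalg.det(L[1:, 1:])

# The posterior probability of one particular spanning tree is then the
# product of its edge weights divided by Z (a chain tree, for example).
tree_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
p_tree = np.prod([W[i, j] for i, j in tree_edges]) / Z
print(Z, p_tree)
```

Because Z sums the weight products of every spanning tree, including the chain above, p_tree is guaranteed to lie strictly between 0 and 1 whenever all edge weights are positive.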
Suggested Citation:
Tony Jebara, Philip M. Long, Tree Dependent Identically Distributed Learning, Columbia University Academic Commons.
