Theses Doctoral

Methods in functional data analysis and functional genomics

Backenroth, Daniel

This thesis has two overall themes, both of which involve the word "functional," albeit in different contexts. The theme that motivates the first two chapters is the development of methods that enable a deeper understanding of the variability of functional data. The theme of the final chapter is the development of methods that enable a deeper understanding of the landscape of functionality across the human genome in different human tissues.
The first chapter of this thesis provides a framework for quantifying the variability of functional data and for analyzing the factors that affect this variability. We extend functional principal components analysis by modeling the variance of principal component scores. We pose a Bayesian model, which we estimate using variational Bayes methods. We illustrate our model with an application to a kinematic dataset of two-dimensional planar reaching motions by healthy subjects, showing the effect of learning on motion variability.
The second chapter of this thesis provides an alternative method for decomposing functional data that follow a Poisson distribution. Classical methods pose a latent Gaussian process that is then linked to the observed data via a logarithmic link function. We pose an alternative model that draws on ideas from non-negative matrix factorization, in which we constrain both the scores and the spline coefficient vectors for the functional prototypes to be non-negative. We impose smoothness on the functional prototypes. We estimate our model using the method of alternating minimization. We illustrate our model with an application to a dataset of accelerometer readings from healthy elderly Americans.
The third chapter of this thesis focuses on functional genomics, rather than functional data analysis. Here we pose a method for unsupervised clustering of functional genomics data. Our method is non-parametric, allowing for flexible modeling of the functional genomics data without binarization. We estimate our model using variational Bayes methods, and illustrate it by calculating genome-wide functional scores (based on a partition of our clusters into functional and non-functional clusters) for 127 different human tissues. We show that these genome-wide and tissue-specific functional scores provide state-of-the-art functional prediction.



More About This Work

Academic Units
Thesis Advisors
Goldsmith, Jeff
Ph.D., Columbia University
Published Here
January 19, 2018