2022 Theses Doctoral
Learning predictive models from menstrual cycle data
Despite being a physiological phenomenon that impacts billions of womxn worldwide, menstruation has long been understudied. In this dissertation, we first explore the menstrual characteristics of nearly 380,000 womxn, as collected via a self-tracking mobile health (mHealth) app, Clue. We examine how variation in menstrual cycle length is related to volatility in other experienced symptoms, helping to debunk the idea that menstrual cycles should be 'regular.'
We then develop predictive models for menstruation using this dataset. We first demonstrate that a fully generative model, one that explicitly accounts for the possibility that self-tracked data may be unreliable, can both outperform baselines and aid in the detection of self-tracking artifacts (i.e., instances where a user appears not to have experienced a period event but in reality forgot or otherwise neglected to track it). Finally, we explore a hierarchical, deep generative model for symptom tracking, in which a deep neural network learns per-user tracking parameters while the model retains a mechanism for capturing each user's likelihood of adherence.
We find that leveraging symptom data at the time series level allows us to predict the occurrence of the next bleeding and non-bleeding tracking events with high accuracy. This work demonstrates the great potential that large-scale mHealth data holds for better understanding menstruation as a whole, as well as the importance of treating such data carefully.
Files
- Li_columbia_0054D_17193.pdf application/pdf 23 MB
More About This Work
- Academic Units
- Applied Physics and Applied Mathematics
- Thesis Advisors
- Wiggins, Chris H.
- Degree
- Ph.D., Columbia University
- Published Here
- April 27, 2022