Theses Doctoral

Understanding and Reducing Clinical Data Biases

Fort, Daniel

The vast amount of clinical data made available by pervasive electronic health records presents a great opportunity to reuse these data to improve the efficiency and lower the costs of clinical and translational research. Although specific studies have demonstrated the benefits of reusing clinical data for research, potential hidden biases in these data remain a significant concern and a risk to reuse.
This dissertation contributes an original understanding of clinical data biases. Using research data carefully collected from a patient community served by our institution as the reference standard, we examined measurement and sampling biases in the clinical data for selected clinical variables. Our results showed that the clinical data and research data had similar summary statistical profiles, but that there were detectable differences in definitions and measurements for variables such as height, diastolic blood pressure, and diabetes status. One implication of these results is that research data can complement clinical data for clinical phenotyping. We further supported this hypothesis using diabetes as an example clinical phenotype, showing that integrating clinical and research data improved the sensitivity and positive predictive value of phenotype identification.
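For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how sensitivity and positive predictive value are computed against a reference standard. It is not the dissertation's code: the function name `sensitivity_and_ppv` and the toy labels are hypothetical and invented purely for illustration, and they do not reproduce any result from this work.

```python
import math


def sensitivity_and_ppv(predicted, reference):
    """Return (sensitivity, PPV) for boolean labels judged against a reference standard.

    sensitivity = TP / (TP + FN): share of true cases that were detected.
    PPV         = TP / (TP + FP): share of detected cases that are true cases.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)        # true positives
    fp = sum(1 for p, r in pairs if p and not r)    # false positives
    fn = sum(1 for p, r in pairs if not p and r)    # false negatives

    # Return NaN when a metric is undefined (no true cases / no detected cases).
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    ppv = tp / (tp + fp) if (tp + fp) else math.nan
    return sensitivity, ppv


# Hypothetical toy labels (NOT study data): True means "has the phenotype".
reference = [True, True, True, True, False, False, False, False]
clinical_only = [True, True, False, False, True, False, False, False]
integrated = [True, True, True, False, False, False, False, False]

print(sensitivity_and_ppv(clinical_only, reference))
print(sensitivity_and_ppv(integrated, reference))
```

In this invented example the integrated labels score higher on both metrics, which is the kind of comparison the abstract describes; the actual figures are reported in the dissertation itself.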

Files

  • Fort_columbia_0054D_12490.pdf (application/pdf, 1.09 MB)

More About This Work

Academic Units
Biomedical Informatics
Thesis Advisors
Weng, Chunhua
Degree
Ph.D., Columbia University
Published Here
February 2, 2015