Theses Doctoral

Synergizing human-machine intelligence: Visualizing, labeling, and mining the electronic health record

Lee, Noah

We live in a world where data surround us in every aspect of our lives. The key challenge for humans and machines is how we can make better use of such data. Imagine what would happen if you were to have intelligent machines that could give you insight into the data. Insight that will enable you to better 1) reason about, 2) learn, and 3) understand the underlying phenomena that produced the data. The possibilities of combined human-machine intelligence are endless and will impact our lives in ways we can not even imagine today.

Synergistic human-machine intelligence aims to facilitate the analytical reasoning and inference process of humans by creating machines that maximize a human's ability to 1) reason about, 2) learn, and 3) understand large, complex, and heterogeneous data. Combined human-machine intelligence is a powerful symbiosis of mutual benefit, in which we depend on the computational capabilities of the machine for the tasks we are not good at, and the machine requires human intervention for the tasks it performs poorly on.

This relationship provides a compelling alternative to either approach in isolation for solving today's and tomorrow's arising data challenges. In his regard, this dissertation proposes a diverse analytical framework that leverages synergistic human-machine intelligence to maximize a human's ability to better 1) reason about, 2) learn, and 3) understand different biomedical imaging and healthcare data present in the patient's electronic health record (EHR). Correspondingly, we approach the data analyses problem from the 1) visualization, 2) labeling, and 3) mining perspective and demonstrate the efficacy of our analytics on specific application scenarios and various data domains.

In the first part of this dissertation we explore the question how we can build intelligent imaging analytics that are commensurate with human capabilities and constraints, specifically for optimizing data visualization and automated labeling workflows. Our journey starts with heuristic rule-based analytical models that are derived from task-specific human knowledge. From this experience, we move on to data-driven analytics, where we adapt and combine the intelligence of the model based on prior information provided by the human and synthetic knowledge learned from partial data observations. Within this realm, we propose a novel Bayesian transductive Markov random field model that requires minimal human intervention and is able to cope with scarce label information to learn and infer object shapes in complex spatial, multimodal, spatio-temporal, and longitudinal data. We then study the question how machines can learn discriminative object representations from dense human provided label information by investigating learning and inference mechanisms that make use of deep learning architectures. The developed analytics can aid visualization and labeling tasks, which enables the interpretation and quantification of clinically relevant image information.

The second part explores the question how we can build data-driven analytics for exploratory analysis in longitudinal event data that are commensurate with human capabilities and constraints. We propose human-intuitive analytics that enable the representation and discovery of interpretable event patterns to ease knowledge absorption and comprehension of the employed analytics model and the underlying data. We propose a novel doubly-constrained convolutional sparse-coding framework that learns interpretable and shift-invariant latent temporal event patterns. We apply the model to mine complex event data in EHRs. By mapping the event space to heterogeneous patient encounters in the EHR we explore the linkage between healthcare resource utilization (HRU) in relation to disease severity. This linkage may help to better understand how disease specific co-morbidities and their clinical attributes incur different HRU patterns. Such insight helps to characterize the patient's care history, which then enables the comparison against clinical practice guidelines, the discovery of prevailing practices based on common HRU group patterns, and the identification of outliers that might indicate poor patient management.


  • thumnail for Lee_columbia_0054D_10267.pdf Lee_columbia_0054D_10267.pdf application/pdf 22.8 MB Download File

More About This Work

Academic Units
Biomedical Engineering
Thesis Advisors
Laine, Andrew F.
Ph.D., Columbia University
Published Here
June 20, 2011