Theses Doctoral

Leveraging Infrared Imaging with Machine Learning for Phenotypic Profiling

Liu, Xinwen

Phenotypic profiling systematically maps and analyzes observable traits (phenotypes) exhibited in cells, tissues, organisms or systems in response to various conditions, including chemical, genetic and disease perturbations. This approach seeks to comprehensively understand the functional consequences of perturbations on biological systems, thereby informing diverse research areas such as drug discovery, disease modeling, functional genomics and systems biology.

Profiling techniques must capture high-dimensional features to distinguish phenotypes arising under different conditions. Current methods mainly rely on fluorescence imaging, mass spectrometry and omics technologies, coupled with computational analysis, to quantify diverse features such as morphology, metabolism and gene expression in response to perturbations. Yet they suffer from high costs, complicated operation and strong batch effects. Vibrational imaging offers an alternative for phenotypic profiling, providing a sensitive, cost-effective and easily operated approach to capture the biochemical fingerprint of phenotypes. Among vibrational imaging techniques, infrared (IR) imaging has the further advantages of high throughput, fast imaging speed and full spectral coverage compared with Raman imaging. However, current biomedical applications of IR imaging mainly concentrate on "digital disease pathology", which uses label-free IR imaging with machine learning for tissue pathology classification and disease diagnosis.

This thesis presents the first comprehensive study of IR imaging for phenotypic profiling, focusing on three key areas. First, IR-active vibrational probes are systematically designed to enhance metabolic specificity, thereby enriching the measured features and improving sensitivity and specificity for phenotype discrimination. Second, experimental workflows are established for phenotypic profiling using IR imaging across biological samples at the cellular, tissue and organ levels, in response to drug and disease perturbations. Lastly, complete data analysis pipelines are developed, including data preprocessing, statistical analysis and machine learning methods, with additional algorithmic developments for analyzing and mapping phenotypes.

Chapter 1 lays the groundwork for IR imaging by delving into the theory of IR spectroscopy and the instrumentation of IR imaging, establishing a foundation for subsequent studies.

Chapter 2 discusses the principles of popular machine learning methods applied in IR imaging, including supervised learning, unsupervised learning and deep learning, providing the algorithmic backbone for later chapters. Additionally, it provides an overview of existing biomedical applications using label-free IR imaging combined with machine learning, facilitating a deeper understanding of the current research landscape and the focal points of IR imaging for traditional biomedical studies.

Chapters 3-5 focus on applying IR imaging coupled with machine learning to the novel application of phenotypic profiling. Chapter 3 explores the design and development of IR-active vibrational probes for IR imaging. Three types of vibrational probes (azide, 13C-based and deuterium-based probes) are introduced to study the dynamic metabolic activities of proteins, lipids and carbohydrates in cells, small organisms and mice for the first time. The developed probes substantially improve the metabolic specificity of IR imaging, enhancing its sensitivity towards different phenotypes.

Chapter 4 studies the combination of IR imaging, heavy water labeling and unsupervised learning for tissue metabolic profiling, which provides a novel method to map metabolic tissue atlases in complex mammalian systems. In particular, cell type-, tissue- and organ-specific metabolic profiles are identified with spatial information in situ. In addition, this method captures metabolic changes during brain development and characterizes the intratumor metabolic heterogeneity of glioblastoma, showing great promise for disease modeling.

Chapter 5 develops Vibrational Painting (VIBRANT), a method using IR imaging, multiplexed vibrational probes and supervised learning for cellular phenotypic profiling of drug perturbations. Three IR-active vibrational probes are designed to measure distinct essential metabolic activities in human cancer cells. More than 20,000 single-cell drug responses are collected, corresponding to 23 drug treatments. Supervised learning is used to accurately predict drug mechanism of action at the single-cell level with minimal batch effects. We further design an algorithm to discover drug candidates with novel mechanisms of action and to evaluate drug combinations. Overall, VIBRANT demonstrates great potential across multiple areas of phenotypic drug screening.



More About This Work

Academic Units
Thesis Advisors
Min, Wei
Ph.D., Columbia University
Published Here
July 10, 2024