Theses Doctoral

A computational framework leveraging machine learning and multi-omics data for characterizing response to immunotherapy

Park, Cameron Young

Immunotherapy has transformed cancer treatment in recent years; however, responses vary between tumor types and among patients with the same disease (1). Single-cell RNA sequencing (scRNA-seq), which measures gene expression at the resolution of individual cells, is a promising technology for understanding the nuances and complexity of the tumor microenvironment. The recent expansion of single-cell datasets to include not only RNA but also protein, T-cell receptor (TCR), and spatial transcriptomic measurements across multiple time points in longitudinal clinical cohorts presents an exciting opportunity to characterize the mechanisms underlying response and resistance to immunotherapy. With so many new data modalities, there is an unmet need for computational methods and frameworks to interpret these data and uncover the relevant biology.

This dissertation incorporates both method development and application to better understand the mechanisms of response and resistance to immunotherapy in different hematological malignancies. Throughout this work, we aim to answer a fundamental question: why do some patients respond to treatment while others do not?

In Aim 1, we develop DIISCO, a Bayesian machine learning method based on Gaussian Process Regression Networks that infers cell-cell interactions in time-series single-cell data. We demonstrate the interpretability of DIISCO on simulated data and on newly collected data from T cells co-cultured with lymphoma cells, highlighting its potential to uncover dynamic cell-cell crosstalk. Characterizing this crosstalk is crucial to better understanding the complexity of the tumor microenvironment.

In Aim 2, applying DIISCO in conjunction with other computational tools, we systematically analyze an established immunotherapy, donor lymphocyte infusion (DLI), in patients with relapsed acute myeloid leukemia (AML) and chronic myeloid leukemia (CML). We identify clonally expanded ZNF683+ CD8+ cytotoxic T lymphocytes (CTLs) that originate primarily from the DLI product and show in vitro specificity for patient-matched AML.

In Aim 3, we use spatial transcriptomics to study relapsed/refractory diffuse large B-cell lymphoma (R/R DLBCL), with the goal of characterizing the spatial organization of patients’ tumor microenvironments before chimeric antigen receptor T-cell (CAR-T) therapy. Durable response to CAR-T therapy varies among R/R DLBCL patients, and even with a small sample size, we already note differences in colocalization, with nonresponders showing higher autocorrelation of immune compartments.

In Aims 2 and 3, we demonstrate the power of machine learning to generate hypotheses and drive clinical research. In both AML and DLBCL, this work identifies differences in the patient’s own tumor microenvironment as a potential determinant of effective response, opening opportunities for improved cellular therapy. As we consider next steps, both for these projects in the near term and for the field more broadly, we cannot overstate the importance of experimental work in validating and building upon computational results. Collectively, this dissertation provides a framework showcasing how biology can inform method development and how method development can generate novel biological hypotheses.

Files

This item is currently under embargo. It will be available starting 2027-05-09.

More About This Work

Academic Units
Biomedical Engineering
Thesis Advisors
Azizi, Elham
Degree
Ph.D., Columbia University
Published Here
May 14, 2025