2025 Theses Doctoral
Quantifying Peptide-Specific Immune Surveillance via HLA-I in Viral Infection & Cancer
Cancer is an enormous global health burden, affecting millions of people each year. It is estimated that 15–20% of human cancers are attributable to infections, predominantly by carcinogenic viruses, with incidence varying worldwide. To investigate the role of viral infection in cancer, we conducted a comparative analysis of virus-positive and virus-negative tumors across nine cancers linked to five viruses. We observed a higher frequency of virus-positive tumors in males, along with notable geographic differences in incidence.
Genomic analyses of 1,971 tumors revealed that virus-positive cases generally exhibit a lower somatic mutation burden, distinct mutation signatures, and characteristic driver gene alterations. Clinical trial analyses of PD-(L)1 inhibitors suggest that virus positivity is associated with higher treatment response rates, particularly in gastric cancer and head and neck squamous cell carcinoma, both of which also demonstrate increased CD8+ T cell infiltration and T cell receptor clonal selection in virus-positive tumors.
In order to further explore the role of the immune system in virus-associated cancers, we developed HLAScope, a metric designed to quantify immune surveillance at a granular, peptide-subspace specific level via HLA class I (HLA-I) molecules. Effective immune detection and elimination of abnormal cells, critical for cancer and infectious disease outcomes, relies on the presentation of intracellular peptides by HLA-I to cytotoxic T cells. HLAScope integrates HLA genotypes, peptides derived from selected proteomes, and binding affinity predictions to produce individual-level coverage scores across user-defined peptide spaces.
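The abstract does not give the HLAScope formula, so the following is only an illustrative sketch of how a coverage score of this general kind could be computed, not the thesis's actual method. It assumes coverage is the fraction of peptides in a chosen peptide space that are predicted to bind at least one of an individual's HLA-I alleles. The function name `coverage_score`, the percentile-rank cutoff `RANK_THRESHOLD`, and the lookup table `predicted_rank` are all hypothetical; in practice the ranks would come from an external binding predictor.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical cutoff: a peptide counts as "presented" by an allele when its
# predicted binding percentile rank is at or below this value (an assumption,
# not a parameter taken from the thesis).
RANK_THRESHOLD = 2.0


def coverage_score(
    genotype: Iterable[str],
    peptides: Iterable[str],
    predicted_rank: Dict[Tuple[str, str], float],
    threshold: float = RANK_THRESHOLD,
) -> float:
    """Fraction of peptides predicted to bind at least one allele in the genotype.

    genotype       -- the individual's HLA-I alleles (up to six for HLA-A/B/C)
    peptides       -- the user-defined peptide space (e.g. peptides from one protein)
    predicted_rank -- (allele, peptide) -> predicted percentile rank; missing
                      pairs are treated as non-binders
    """
    peptide_list = list(peptides)
    if not peptide_list:
        return 0.0
    alleles = set(genotype)  # homozygous loci contribute a single allele
    covered = sum(
        1
        for pep in peptide_list
        if any(
            predicted_rank.get((allele, pep), float("inf")) <= threshold
            for allele in alleles
        )
    )
    return covered / len(peptide_list)


# Toy inputs for illustration only; the peptide labels and ranks are made up.
toy_genotype = ["HLA-A*02:01", "HLA-B*07:02"]
toy_peptides = ["PEP1", "PEP2", "PEP3", "PEP4"]
toy_ranks = {
    ("HLA-A*02:01", "PEP1"): 0.5,  # binder
    ("HLA-B*07:02", "PEP2"): 1.9,  # binder
    ("HLA-A*02:01", "PEP3"): 5.0,  # non-binder for both alleles
    ("HLA-B*07:02", "PEP3"): 2.5,
}
toy_score = coverage_score(toy_genotype, toy_peptides, toy_ranks)  # 2 of 4 covered
```

Under these assumptions, restricting `peptides` to a single viral protein or to a whole proteome changes only the input set, which is one way to read "user-defined peptide spaces".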
We demonstrate the utility of HLAScope across several contexts. When applied to the Epstein-Barr virus (EBV) proteome, HLAScope reveals that EBV-associated cancers, including nasopharyngeal carcinoma, natural killer/T-cell lymphoma, and classical Hodgkin’s lymphoma, exhibit reduced coverage of key viral proteins, such as LMP1 and LMP2 in EBV latency phases II/III. Extending this analysis to 73 EBV-associated conditions in the UK Biobank, HLAScope identifies a greater number of statistically significant associations between HLA-I peptide coverage and disease risk, recapitulating known links (e.g., diffuse large B cell lymphoma) and uncovering new ones (e.g., mononucleosis and multiple sclerosis). In two independent non-small cell lung cancer cohorts, HLAScope scores derived from the human proteome correlate significantly with improved survival in patients treated with immunotherapy, identifying associations unseen through existing genomic metrics such as tumor mutational burden, HLA-I zygosity, and HLA evolutionary divergence.
These results illustrate epidemiological, genetic, and therapeutic insights across virus-associated malignancies and establish HLAScope as a robust framework for quantifying immune visibility.
Files
This item is currently under embargo. It will be available starting 2027-10-29.
More About This Work
- Academic Units
- Physics
- Thesis Advisors
- Rabadan, Raul
- Degree
- Ph.D., Columbia University
- Published Here
- November 19, 2025