Academic Commons

Theses Doctoral

Characterizing molecular drivers of clinical outcome in pediatric acute leukemias by systems biology and machine learning

Alloy, Alexandre Paul

Acute leukemias are the most common type of malignancy affecting children. They are defined by their precursor cell lineage: myeloid for acute myeloid leukemia (AML) and lymphoid for acute lymphoblastic leukemia (ALL). In the first part of this thesis, we use systems biology approaches to characterize transcription factor (TF) programs that define novel AML subtypes. We combine this approach with machine learning methods to group patients sharing similar TF programs and risk-stratify them. We identify a 9-cluster solution with statistically significant survival differences, with survival ranging from 84% for the best group to 41% for the worst. Each cluster is composed of patients with various cytogenetic aberrations who would not necessarily have been classified together. We identify the top aberrantly activated TFs and potential master regulators or drug targets in each cluster. We also propose a novel stratification for FLT3-ITD patients with no other cytogenetic abnormalities. These patients are currently all classified as high-risk; however, we find a low-risk subtype and identify a TF signature that is predictive of risk in this subtype. Finally, we develop a binary classifier that stratifies patients into two risk groups. We find that the activity of a large cluster of HOXA TFs is highly correlated with poor prognosis.

In the second part, we characterize some mechanisms of relapse in B-ALL at single-cell resolution, focusing again on the patterns of TF activation and deactivation over the course of the disease in matched trios of samples (diagnosis, remission, and relapse). After discussing some of the technical aspects of differentiating normal cells from leukemic cells in single-cell RNA sequencing data, we perform computational pseudo-lineage reconstruction based on groups of TFs whose activities rise and fall together through pseudotime. We find that each patient has unique mechanisms at the earliest pseudotimes, but these seem to converge at later pseudotimes into signatures in which the B-cell identity (in the case of B-ALL) gradually fades away. We also identify small populations of cells isolated at diagnosis that fall in the later pseudotimes, which is consistent with the view that many of the persistent cells in ALL pre-exist the malignancy and are selected by the treatment.

This novel systems biology approach to characterizing clinical outcome in patients and reconstructing lineages identifies biochemical mechanisms and signaling pathways that are responsible for the development and maintenance of the malignancy, as well as potential therapeutic targets. The results presented in this thesis will improve our understanding of some of the inner workings of pediatric acute leukemias and may lead to the development of improved targeted therapies.


  • Alloy_columbia_0054D_16462.pdf (application/pdf, 7.61 MB)

More About This Work

Academic Units
Biological Sciences
Thesis Advisors
Califano, Andrea
Degree
Ph.D., Columbia University
Published Here
June 2, 2021