Theses Doctoral

Using electronic health records to understand COVID-19 risks

Ramlall, Vijendra

On December 31, 2019, a new disease, which would in due time come to be identified as COVID-19, was reported to the World Health Organization. In the two and a half years since the emergence of COVID-19 and the more than two years since the start of the COVID-19 pandemic, which is caused by infection with SARS-CoV-2, more than 500 million cases have been reported around the world, with more than six million deaths attributed to the disease; more than 85 million of those cases and more than one million of those deaths were reported in the United States of America. This novel disease has had a profound economic, political, public health and social impact in the United States and around the world. Research throughout the pandemic, both concurrent and ongoing, has been necessary to identify populations at risk of SARS-CoV-2 infection, severe disease, death and long-term complications, and to identify beneficial treatments. Clinical data, sourced from electronic health records, have been paramount to identifying these risks.

The novelty of SARS-CoV-2 and COVID-19 brought uncertainty as to who was at risk of infection, who was at risk of death, how patients should be treated and what the long-term effects would be. At the start of the pandemic, there was a focus on public health measures, such as proper hygiene, quarantining when sick and reducing close contacts. As the number of cases continued to rise and hospitals became inundated with patients, researchers set out to identify patients at risk for severe disease and death and to identify existing treatment options that might benefit patients who were hospitalized and suffering from severe disease. Clinical trials and ongoing retrospective analyses of patients helped to identify beneficial treatments as well as to rule out treatments that were not beneficial or were associated with negative outcomes.

In one of our studies, we found that patients who had a history of macular degeneration and coagulation disorders were at increased risk for severe disease and death as a result of COVID-19, and we identified variants in genes underpinning the inflammatory response as associated with altered risk. In another study using retrospective analysis, we utilized clinical data to identify patients who were intubated and investigated the effect of steroid hormone exposure on the survival of these patients. Our analysis indicated that exposure to melatonin between intubation and extubation was significantly associated with survival in COVID-19 patients and in mechanically ventilated COVID-19 patients. This association was observed when accounting for patient demographics and previous clinical history.
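To illustrate what "accounting for" a patient characteristic can mean in a retrospective exposure analysis, the sketch below pools stratum-specific 2x2 tables with the Mantel-Haenszel odds ratio. This is a stand-in technique chosen for brevity; the abstract does not state which statistical model the thesis used. The counts, the age bands and all function names are hypothetical.

```python
# Synthetic 2x2 counts per stratum (hypothetical age bands), NOT thesis data.
# Each tuple: (exposed_survived, exposed_died, unexposed_survived, unexposed_died)
STRATA = {
    "under_65": (30, 10, 40, 40),
    "65_and_over": (20, 20, 15, 45),
}


def mantel_haenszel_or(strata):
    """Pooled odds ratio of survival for exposed vs. unexposed patients."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        numerator += a * d / n    # concordant pairs, weighted by stratum size
        denominator += b * c / n  # discordant pairs, weighted by stratum size
    return numerator / denominator


def crude_or(strata):
    """Odds ratio after collapsing all strata into one table (no adjustment)."""
    a, b, c, d = (sum(column) for column in zip(*strata.values()))
    return (a * d) / (b * c)


adjusted = mantel_haenszel_or(STRATA)
unadjusted = crude_or(STRATA)
print(f"adjusted OR = {adjusted:.2f}, crude OR = {unadjusted:.2f}")
```

A gap between the crude and the stratified estimate indicates that the stratifying variable was confounding the raw comparison, which is why adjusted estimates are the ones usually reported.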

As multiple vaccines have been developed and distributed and therapeutics have become widely available, surges in case counts have not been associated with a proportional rise in hospitalizations and death. Research has shifted to trying to understand the long-term impact of COVID-19 on the health of patients. While viral infections are not uncommon, some can have lasting impacts on patients. With more than 500 million cases reported worldwide, long-term analysis of COVID-19 patients and their health after COVID-19 will remain important. Additionally, the incomplete success of vaccination campaigns highlights the need to monitor any future endemic spikes. While clinical data have been important for conducting studies, they are incomplete, which leads to challenges as we transition to an endemic state. To that end, we trained a random forest classifier to assign a probability of a patient having had COVID-19 during each of their visits and utilized these probabilities to identify clinical phenotypes that are associated with patients who had COVID-19. Within one year, our analysis identified myocardial infarction, urinary tract infection, type 2 diabetes and acute renal failure as being associated with higher probabilities of COVID-19.
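The idea of assigning each visit a probability of COVID-19 can be sketched with a deliberately tiny random forest: bootstrap-sampled, depth-one trees, each restricted to a random subset of features, whose leaf probabilities are averaged. This is not the thesis's model, feature set or data. The feature names and visits are synthetic assumptions, and a real analysis would more likely rely on a library implementation such as scikit-learn's RandomForestClassifier.

```python
import random

# Hypothetical binary visit features (assumptions, not the thesis's feature set).
FEATURES = ["fever", "cough", "loss_of_smell", "fracture"]
# Synthetic visits; a label of 1 marks a visit with a confirmed infection.
VISITS = [
    [1, 1, 1, 0], [1, 0, 1, 0], [1, 1, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0],
    [0, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [1, 0, 0, 1],
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]


def fit_stump(rows, labels, feature_ids):
    """Fit a depth-one tree: keep the candidate feature with the lowest Gini impurity."""
    overall = sum(labels) / len(labels)
    best = None
    for f in feature_ids:
        groups = {0: [], 1: []}
        for x, y in zip(rows, labels):
            groups[x[f]].append(y)
        impurity = 0.0
        for ys in groups.values():
            if ys:
                p = sum(ys) / len(ys)
                impurity += len(ys) / len(labels) * 2 * p * (1 - p)
        if best is None or impurity < best[0]:
            # Leaf value = share of positive labels; empty leaves use the overall rate.
            leaf = {v: (sum(ys) / len(ys) if ys else overall) for v, ys in groups.items()}
            best = (impurity, f, leaf)
    return best[1], best[2]


def fit_forest(rows, labels, n_trees=50, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(n_features ** 0.5))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample visits with replacement
        feats = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict_proba(forest, visit):
    """Average the leaf probabilities of all trees for a single visit."""
    return sum(leaf[visit[f]] for f, leaf in forest) / len(forest)


forest = fit_forest(VISITS, LABELS)
p_high = predict_proba(forest, [1, 1, 1, 0])  # fever, cough and loss of smell recorded
p_low = predict_proba(forest, [0, 0, 0, 0])   # none of the features recorded
print(f"P(COVID-19 | symptomatic visit) = {p_high:.2f}")
print(f"P(COVID-19 | symptom-free visit) = {p_low:.2f}")
```

Working with probabilities rather than hard labels allows visits without a recorded test result to contribute to downstream phenotype comparisons, weighted by how likely the infection was.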

The projects presented here demonstrate how to use electronic health records to identify patients at risk for severe disease and death, how to monitor drug exposure and evaluate its effect on the survival of patients with severe COVID-19, and how to use machine learning to circumvent the limitations of clinical data. Together they set a foundation for further work in identifying the effects of COVID-19. Moreover, these projects also show methods that can be applied to any future emerging disease.


  • Ramlall_columbia_0054D_17669.pdf (application/pdf, 3.48 MB)

More About This Work

Academic Units
Cellular Physiology and Biophysics
Thesis Advisors
Tatonetti, Nicholas P.
Ph.D., Columbia University
Published Here
February 15, 2023