Prospective prediction of PTSD diagnosis in a nationally representative sample using machine learning

Worthington, Michelle A.; Mandavia, Amar; Richardson-Vejlgaard, Randall

Recent research has identified a number of pre-traumatic, peri-traumatic and post-traumatic psychological and ecological factors that put an individual at increased risk for developing PTSD following a life-threatening event. While these factors have been found to be associated with PTSD in univariate analyses, the complex interactions among them, and how they contribute to individual trajectories of the illness, are not yet well understood. In this study, we examine the impact of prior trauma, psychopathology, sociodemographic characteristics, and community and environmental information on PTSD onset in a nationally representative sample of adults in the United States, using machine learning methods to establish the relative contribution of each variable.

Individual risk factors identified in Wave 1 of the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC) were combined with community-level data for the years concurrent with the NESARC Wave 1 (n = 43,093) and Wave 2 (n = 34,653) surveys. Machine learning feature selection and classification analyses were used at the national level to build models from individual- and community-level variables that would best predict new onset of PTSD at Wave 2.

Our classification algorithms yielded 89.7% to 95.6% accuracy for predicting new onset of PTSD at Wave 2. A prior diagnosis of DSM-IV-TR Borderline Personality Disorder, Major Depressive Disorder or Anxiety Disorder conferred the greatest relative influence on a new diagnosis of PTSD. Distal risk factors (such as prior psychiatric diagnosis) accounted for significantly greater relative risk than proximal factors (such as adverse event exposure).

Our findings show that a machine learning classification approach can successfully integrate large numbers of known risk factors for PTSD into stronger models that account for high-dimensional interactions and collinearity between variables. We discuss the implications of these findings for the targeted mobilization of emergency mental health resources. These findings also inform the creation of a more comprehensive risk assessment profile for the likelihood of developing PTSD following an extremely adverse event.




Published Here
September 22, 2023


Post-traumatic stress disorder, Machine learning, Nationally representative