Theses Doctoral

Is the way forward to step back? A meta-research analysis of misalignment between goals, methods, and conclusions in epidemiologic studies.

Kezios, Katrina Lynn

Recent discussion in the epidemiologic methods and teaching literatures centers on the importance of clearly stating study goals, disentangling the goal of causation from prediction (or description), and clarifying the statistical tools that can address each goal. This discussion illuminates different ways in which mismatches can occur between study goals, methods, and interpretations, which this dissertation synthesizes into the concept of “misalignment”: misalignment occurs when the study methods and/or interpretations are inappropriate for (i.e., do not match) the study’s goal. Although misalignments can occur and may cause problems, their pervasiveness and consequences have not been examined in the epidemiologic literature. Thus, the overall purpose of this dissertation was to document misalignment problems seen in epidemiologic practice and examine their effects.

First, a review was conducted to document misalignment in a random sample of epidemiologic studies and explore how the framing of study goals contributes to its occurrence. Among the reviewed articles, full alignment between study goals, methods, and interpretations was infrequently observed, although “clearly causal” studies (those that framed causal goals using causal language) were more often fully aligned (5/13, 38%) than “seemingly causal” ones (those that framed causal goals using associational language; 3/71, 4%).

Next, two simulation studies were performed to examine the potential consequences of different types of misalignment problems seen in epidemiologic practice. Both are based on the observation that causally motivated studies often perform analyses that appear disconnected from, or “misaligned” with, their causal goal.

A primary aim of the first simulation study was to examine goal-methods misalignment in terms of inappropriate variable selection for exposure effect estimation (a causal goal). The main difference between predictive and causal models is the conceptualization and treatment of “covariates”. Therefore, exposure coefficients were compared across regression models built using variable selection approaches that were either aligned (appropriate for causation) or misaligned (appropriate for prediction) with the causal goal of the simulated analysis. The regression models were characterized by different combinations of variable pools and of inclusion criteria for selecting variables from those pools into the models. Overall, for valid exposure effect estimation in a causal analysis, the creation of the variable pool mattered more than the specific inclusion criteria, and the most important criterion when creating the variable pool was to exclude mediators.
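The point about mediators can be illustrated with a minimal sketch. It is not the dissertation's simulation design: the data-generating model, the effect sizes, and the function name `simulate_exposure_estimates` are all assumptions chosen for illustration. The sketch compares the exposure coefficient from a model that adjusts only for a confounder with one that also adjusts for a mediator.

```python
import numpy as np

def simulate_exposure_estimates(n=50_000, seed=0):
    """Hypothetical linear data-generating model (all coefficients assumed)."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                       # confounder of exposure and outcome
    x = 0.5 * c + rng.normal(size=n)             # exposure
    m = 0.8 * x + rng.normal(size=n)             # mediator on the exposure-outcome path
    y = 0.3 * x + 0.5 * m + 0.4 * c + rng.normal(size=n)  # outcome

    # Total effect of x on y = direct effect + (x -> m) * (m -> y)
    true_total_effect = 0.3 + 0.8 * 0.5

    def exposure_coef(*covariates):
        # Ordinary least squares; column 1 of the design matrix is the exposure.
        design = np.column_stack([np.ones(n), x, *covariates])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        return float(beta[1])

    return {
        "true_total_effect": true_total_effect,
        "aligned": exposure_coef(c),        # variable pool excludes the mediator
        "misaligned": exposure_coef(c, m),  # variable pool includes the mediator
    }

if __name__ == "__main__":
    for name, value in simulate_exposure_estimates().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed parameters, the mediator-adjusted coefficient estimates only the direct effect, so it understates the total exposure effect even though including the mediator would improve outcome prediction.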

The second simulation study concretized the misalignment problem by examining the consequences of goal-methods misalignment in the application of the structured life course approach, a statistical method for distinguishing among different causal life course models of disease (e.g., critical period, accumulation of risk). Although exchangeability must be satisfied for this approach to yield valid results, its empirical applications often ignore confounding. These applications are misaligned because they use methods for description (crude associations) to pursue a causal goal (identifying causal processes). Simulations were used to mimic this misaligned approach and examine its consequences. On average, when life course data were generated under a “no confounding” scenario (an unlikely real-world scenario), the structured life course approach was quite accurate in identifying the life course model that generated the data. However, in the presence of confounding, the wrong underlying life course model was often identified. Five life course confounding structures were examined; as the complexity of the confounding scenarios increased, particularly when the confounding was strong, incorrect model selection using the structured life course approach was common.
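The following sketch shows, in a deliberately simplified form, how ignoring confounding can lead a crude model comparison to pick the wrong life course model. It does not implement the structured life course approach itself: as a stand-in, it selects among three one-parameter candidate models by residual sum of squares. The two-period design, the confounding structure, the coefficients, and the function name `select_life_course_model` are assumptions made for illustration only.

```python
import numpy as np

def select_life_course_model(confounder_strength, n=20_000, seed=1):
    """Return the best-fitting crude candidate model (simplified stand-in)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                                  # unmeasured confounder
    x1 = rng.integers(0, 2, size=n)                         # early-life exposure (0/1)
    x2 = (u + 0.5 * rng.normal(size=n) > 0).astype(float)   # later exposure, depends on u
    # Truth: only the early exposure affects the outcome (early critical period).
    y = 1.0 * x1 + confounder_strength * u + rng.normal(size=n)

    # Each hypothesis is encoded as a single predictor, unadjusted for u.
    candidates = {
        "early critical period": x1,
        "late critical period": x2,
        "accumulation": x1 + x2,
    }

    def rss(predictor):
        # Residual sum of squares from a crude linear regression of y on the predictor.
        design = np.column_stack([np.ones(n), predictor])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    return min(candidates, key=lambda name: rss(candidates[name]))

if __name__ == "__main__":
    print("no confounding:    ", select_life_course_model(0.0))
    print("strong confounding:", select_life_course_model(2.0))
```

In this toy setup the confounder influences the later exposure and the outcome, so the crude comparison credits the later exposure with an effect it does not have. Adjusting for the confounder in each candidate model would be the aligned analysis.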

The misalignment problem is recognized but underappreciated in the epidemiologic literature. This dissertation contributes to the literature by documenting, simulating, and concretizing problems of misalignment in epidemiologic practice.



More About This Work

Academic Units
Thesis Advisors
Schwartz, Sharon B.
Factor-Litvak, Pam
Ph.D., Columbia University
Published Here
March 8, 2021