Theses Doctoral

Multiple Imputation for Handling Missing Data of Covariates in Meta-Regression

Diaz Yanez, Karina Gabriela

The term meta-analysis refers to the quantitative process of statistically combining the results of studies in order to identify overall trends in a research literature. This technique has become the preferred form of systematic review in fields such as social science and education. As the method has become more standard, the number of large meta-analyses in these fields has grown as well. Accordingly, the purpose of meta-analysis has expanded to explaining the variation of effect sizes across studies using meta-regression. Unfortunately, missing data is a common problem in meta-analysis. In meta-regression in particular, missing data problems frequently involve missing covariates.

When not handled properly, missing covariates in meta-regression can impact the precision of statistical inferences and thus the precision of systematic reviews. Ad hoc methods such as complete-case analysis and shifting units of analysis are the most common approaches to addressing missing data in meta-analysis. These techniques, to some extent, ignore missing values, which in turn can lead to biased estimates. Model-based methods for missing data are more justifiable than ad hoc approaches; however, their application in meta-analysis is very limited. Multiple imputation is one of these model-based approaches. Its precision relies mainly on how missing values are imputed. Standard multiple imputation approaches do not produce imputations that are compatible with meta-regression and thus can still yield biased estimates.
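As general background on how multiple imputation turns several imputed datasets into a single inference, the standard pooling step (Rubin's rules) can be sketched as follows. This is a minimal Python illustration, not the dissertation's algorithm, which was implemented in R. The function name `pool_rubin` and its inputs are hypothetical: it assumes a meta-regression has already been fitted to each of the m imputed datasets, yielding one coefficient estimate and one squared standard error per dataset.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: coefficient estimate from each imputed dataset (hypothetical input)
    variances: squared standard error of that estimate in each dataset
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least two imputations")

    pooled_estimate = mean(estimates)   # average of the m point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Illustrative numbers only, not results from the dissertation.
pooled, total_var = pool_rubin([0.2, 0.4, 0.3], [0.01, 0.01, 0.01])
```

The between-imputation term is what separates multiple imputation from single imputation: it carries the uncertainty about the missing covariate values into the pooled standard error. The pooling step itself does not depend on how the imputations were generated, which is why the quality of the imputation model matters so much.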

This dissertation addresses these issues by first assessing the performance of standard multiple imputation methods in the meta-regression context through a simulation study, and then developing compatible multiple imputation methods that accommodate features of meta-regression, assuming dependent effect sizes.

Results show that even though multiple imputation methods can accurately estimate missing data in meta-regression, their accuracy decreases with larger missingness rates and when missingness is strongly related to effect sizes. The study also revealed that, in general, the developed compatible multiple imputation method outperforms standard multiple imputation. These findings also hold for cases in which missingness in a covariate is highly related to the effect size estimates. Finally, an algorithm that allows practitioners to apply compatible imputations in meta-regression was implemented in R.



More About This Work

Academic Units
Measurement and Evaluation
Thesis Advisors
Keller, Bryan Sean
Degree
Ph.D., Columbia University
Published Here
May 3, 2021