Theses Doctoral

Sequential Rerandomization in the Context of Small Samples

Yang, Jiaxi

Rerandomization (Morgan & Rubin, 2012) is designed to reduce covariate imbalance at the design stage of causal inference studies. By improving covariate balance, rerandomization yields more precise (i.e., lower-variance) and more trustworthy estimates of the average treatment effect (ATE). However, only a limited number of studies have considered rerandomization strategies or discussed the covariate balance criteria that are observed before the rerandomization procedure is conducted. In addition, researchers may find it more difficult to ensure covariate balance across groups when samples are small. Furthermore, researchers conducting experimental studies in psychology and education may not be able to gather data from all subjects simultaneously: subjects may not arrive at the same time, and experiments can rarely wait until all subjects have been recruited.

We therefore pose the following research questions:
1) How does the rerandomization procedure perform when the sample size is small?
2) Are there any other balancing criteria that may work better than the Mahalanobis distance in the context of small samples?
3) How well does the balancing criterion work in a sequential rerandomization design?
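
For readers unfamiliar with the baseline that question 2 refers to, the following is a minimal sketch of rerandomization with the Mahalanobis distance as the balance criterion, in the spirit of Morgan & Rubin (2012). The function names, the synthetic covariates, the sample size of 20, and the acceptance threshold of 1.0 are all illustrative assumptions and are not taken from the dissertation.

```python
import numpy as np

def mahalanobis_balance(X, w):
    """Mahalanobis distance between treated and control covariate means."""
    n = len(w)
    n_t = int(w.sum())
    n_c = n - n_t
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # M = (n_t * n_c / n) * diff' cov(X)^{-1} diff
    return (n_t * n_c / n) * diff @ np.linalg.solve(cov, diff)

def rerandomize(X, n_treated, threshold, rng, max_draws=10_000):
    """Redraw assignments until the balance criterion is at or below threshold.

    Returns the assignment, its distance, and the number of draws used.
    If no draw is accepted within max_draws, the best draw seen is returned.
    """
    base = np.zeros(X.shape[0], dtype=int)
    base[:n_treated] = 1
    best_w, best_m = None, np.inf
    for draws in range(1, max_draws + 1):
        w = rng.permutation(base)
        m = mahalanobis_balance(X, w)
        if m < best_m:
            best_w, best_m = w, m
        if m <= threshold:
            return w, m, draws
    return best_w, best_m, max_draws

# Illustrative usage on synthetic covariates (assumed data, small sample).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w, m, draws = rerandomize(X, n_treated=10, threshold=1.0, rng=rng)
print(f"accepted after {draws} draw(s), Mahalanobis distance = {m:.3f}")
```

A smaller threshold enforces tighter balance at the cost of more redraws; in a small sample the set of acceptable assignments can become very restrictive, which is part of what motivates looking at alternative criteria.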

Based on the Early Childhood Longitudinal Study, Kindergarten Class, a Monte Carlo simulation study is presented to find a better covariate balance criterion for small samples. In this study, a neural network prediction model is used to impute the missing counterfactuals. Then, to ensure covariate balance in small samples, the rerandomization procedure is run with various covariate balance criteria to identify the criterion that yields the most precise estimate of the sample average treatment effect. Lastly, a relatively good covariate balance criterion is adapted to Zhou et al.’s (2018) sequential rerandomization procedure, and its performance is examined.
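
The sequential setting can be illustrated with a simplified sketch in which subjects arrive in batches, earlier assignments stay fixed, and only the newest batch is rerandomized until the balance of the cumulative sample is acceptable. This is only the general idea and does not reproduce Zhou et al.’s (2018) procedure or the dissertation's adaptation of it. The Mahalanobis distance stands in for the balance criterion, and the batch sizes, the threshold, and all function names are assumptions made for illustration.

```python
import numpy as np

def balance(X, w):
    """Mahalanobis-type distance between group means (stand-in criterion)."""
    n = len(w)
    n_t = int(w.sum())
    n_c = n - n_t
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # The pseudo-inverse guards against a singular covariance in early batches.
    return (n_t * n_c / n) * diff @ np.linalg.pinv(cov) @ diff

def sequential_rerandomize(batches, threshold, rng, max_draws=5_000):
    """Assign each arriving batch, conditioning on all earlier assignments."""
    X_seen = np.empty((0, batches[0].shape[1]))
    w_seen = np.empty(0, dtype=int)
    best_m = np.inf
    for X_new in batches:
        n_new = X_new.shape[0]
        base = np.zeros(n_new, dtype=int)
        base[: n_new // 2] = 1  # half of each batch is treated
        X_all = np.vstack([X_seen, X_new])
        best_w, best_m = None, np.inf
        for _ in range(max_draws):
            # Only the new batch is permuted; earlier units keep their arms.
            w_all = np.concatenate([w_seen, rng.permutation(base)])
            m = balance(X_all, w_all)
            if m < best_m:
                best_w, best_m = w_all, m
            if m <= threshold:
                break
        X_seen, w_seen = X_all, best_w
    return w_seen, best_m

# Illustrative usage: three batches of eight subjects with two covariates.
rng = np.random.default_rng(0)
batches = [rng.normal(size=(8, 2)) for _ in range(3)]
w_final, m_final = sequential_rerandomize(batches, threshold=1.0, rng=rng)
print(w_final, round(float(m_final), 3))
```

Because earlier assignments cannot be revised, the balance achievable at each stage depends on what was accepted before, which is why the choice of criterion matters more when each batch is small.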

In this dissertation, we aim to identify the best covariate balance criterion for the rerandomization procedure so as to determine the most appropriate randomized assignment for small samples. When Bayesian logistic regression with a Cauchy prior is used as the covariate balance criterion, the root mean square error (RMSE) of the estimated sample average treatment effect is 19% lower than under pure randomization. Additionally, this criterion is shown to work effectively in sequential rerandomization, making a meaningful contribution to research in psychology and education. It further enhances the power of hypothesis tests in randomized experimental designs.
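
The kind of RMSE comparison described above can be sketched as a small Monte Carlo experiment. In the sketch below, both potential outcomes are simulated so that the sample average treatment effect is known, whereas the dissertation imputes counterfactuals with a neural network fitted to ECLS-K data. The Mahalanobis distance is used as a stand-in for the Bayesian logistic regression criterion. The data-generating model, threshold, sample size, and function names are all assumptions, and the numbers this sketch prints are unrelated to the 19% figure reported above.

```python
import numpy as np

def mahalanobis(X, w):
    """Mahalanobis distance between treated and control covariate means."""
    n = len(w)
    n_t = int(w.sum())
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    return (n_t * (n - n_t) / n) * diff @ np.linalg.solve(cov, diff)

def draw_assignment(X, rng, threshold=None, max_draws=1_000):
    """Pure randomization if threshold is None, otherwise rerandomization."""
    base = np.zeros(X.shape[0], dtype=int)
    base[: X.shape[0] // 2] = 1
    w = rng.permutation(base)
    if threshold is None:
        return w
    for _ in range(max_draws):
        if mahalanobis(X, w) <= threshold:
            break
        w = rng.permutation(base)
    return w

def rmse_of_design(X, y0, y1, rng, threshold=None, reps=500):
    """RMSE of the difference-in-means estimator around the true SATE."""
    sate = np.mean(y1 - y0)
    errors = np.empty(reps)
    for r in range(reps):
        w = draw_assignment(X, rng, threshold)
        estimate = y1[w == 1].mean() - y0[w == 0].mean()
        errors[r] = estimate - sate
    return float(np.sqrt(np.mean(errors ** 2)))

# Synthetic small sample with known potential outcomes (assumed model).
rng = np.random.default_rng(1)
n, k = 20, 3
X = rng.normal(size=(n, k))
y0 = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(scale=0.5, size=n)
y1 = y0 + 1.0  # constant unit-level treatment effect of 1

rmse_pure = rmse_of_design(X, y0, y1, rng)
rmse_rerand = rmse_of_design(X, y0, y1, rng, threshold=1.0)
print(f"RMSE, pure randomization: {rmse_pure:.3f}")
print(f"RMSE, rerandomization:    {rmse_rerand:.3f}")
print(f"relative change:          {100 * (rmse_rerand / rmse_pure - 1):.1f}%")
```

The size of any improvement in such an experiment depends on how strongly the covariates predict the outcome and on how strict the acceptance threshold is, so results from this toy setup should not be read as evidence about the dissertation's findings.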



More About This Work

Academic Units
Measurement and Evaluation
Thesis Advisors
Keller, Bryan Sean
Degree
Ph.D., Columbia University
Published Here
June 14, 2021