Theses Doctoral

Modeling Nonignorable Missingness with Response Times Using Tree-based Framework in Cognitive Diagnostic Models

Yang, Yi

As testing moves from paper-and-pencil to computer-based assessment, response accuracy (RA) and response time (RT) together offer the potential to improve performance evaluation and ability estimation for test takers. Most joint models that use RAs and RTs simultaneously assume an IRT model for the RA measurement at the lower level; among these, the hierarchical speed-accuracy (SA) model proposed by van der Linden (2007) is the most prevalent in the literature.

Zhan et al. (2017) extended the SA model to cognitive diagnostic modeling (CDM) by proposing the hierarchical joint response and times DINA (JRT-DINA) model, but little is known about its generalizability in the presence of missing data. Large-scale assessments are used in educational effectiveness studies to quantify educational achievement, and in these assessments the amount of item nonresponse is not negligible (Pohl et al., 2012; Pohl et al., 2019; Rose et al., 2017; Rose et al., 2010) because of lack of proficiency, lack of motivation, and/or lack of time.

Treating unplanned missingness as ignorable leads to biased sample-based estimates of item and person parameters (R. J. A. Little & Rubin, 2020; Rubin, 1976). Over the past few decades, intensive efforts have therefore focused on nonignorable missingness (Glas & Pimentel, 2008; Holman & Glas, 2005; Pohl et al., 2019; Rose et al., 2017; Rose et al., 2010; Ulitzsch et al., 2020a, 2020b). However, the great majority of these methods were limited in the types of item nonresponse they could handle and/or in model complexity until J. Lu and Wang (2020) incorporated the mixture cure-rate model (Lee & Ying, 2015) and the tree-based IRT framework (Debeer et al., 2017), which inherited a built-in behavior process for item nonresponses and thus introduced no additional latent propensity parameters to the joint model. Nevertheless, these approaches were discussed within the IRT framework, and traditional measurement models cannot provide cognitive diagnostic information about attribute mastery.

This dissertation first proposes the CDMTree model, an extension of the tree-based RT process model to CDM, and then explores its efficacy through a real data analysis of the PISA 2012 computer-based assessment of mathematics. The follow-up simulation study compares the proposed model with the JRT-DINA model under multiple conditions involving various types of nonignorable missingness, i.e., both omitted items (OIs) and not-reached items (NRIs) due to time limits. The model is estimated with a fully Bayesian approach using the Markov chain Monte Carlo (MCMC) method.
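To make the notion of not-reached items concrete, the following is a minimal illustrative sketch, not the CDMTree model or the dissertation's simulation design. It assumes the standard DINA response function (guessing and slipping parameters with a Q-matrix) and a lognormal response-time model in the spirit of van der Linden (2007), and it marks items as not reached once cumulative response time exceeds a time limit. All function names, parameter names, and numeric values are hypothetical choices made for illustration, and omitted items are not modeled here.

```python
import math
import random


def dina_prob(alpha, q_row, guess, slip):
    """P(correct) under DINA: 1 - slip if every required attribute is mastered, else guess."""
    # With binary alpha and q, a >= q fails only when an attribute is required but not mastered.
    eta = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if eta else guess


def simulate_examinee(alpha, tau, q_matrix, guess, slip, beta, sigma, time_limit, rng):
    """Simulate one examinee; None marks a not-reached item (hypothetical helper)."""
    responses, times = [], []
    elapsed = 0.0
    for j, q_row in enumerate(q_matrix):
        # Lognormal RT: log T_j ~ Normal(beta_j - tau, sigma_j), with tau the person speed.
        t = math.exp(rng.gauss(beta[j] - tau, sigma[j]))
        if elapsed + t > time_limit:
            # Time runs out: this item and all later items are not reached.
            n_left = len(q_matrix) - j
            responses.extend([None] * n_left)
            times.extend([None] * n_left)
            break
        elapsed += t
        p = dina_prob(alpha, q_row, guess[j], slip[j])
        responses.append(1 if rng.random() < p else 0)
        times.append(t)
    return responses, times


# Assumed toy setup: 4 items, 2 attributes, illustrative parameter values.
q_matrix = [[1, 0], [0, 1], [1, 1], [1, 0]]
guess = [0.2, 0.2, 0.1, 0.2]
slip = [0.1, 0.1, 0.2, 0.1]
beta = [1.0, 1.2, 1.5, 1.0]   # item time intensities (log scale)
sigma = [0.3, 0.3, 0.3, 0.3]  # RT standard deviations (log scale)

rng = random.Random(2023)
resp, rt = simulate_examinee([1, 0], 0.0, q_matrix, guess, slip, beta, sigma, 8.0, rng)
print(resp, rt)
```

Because slower examinees (lower `tau`) are more likely to run out of time, the resulting missingness depends on a latent person trait, which is what makes this kind of nonresponse nonignorable in the sense discussed above.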



More About This Work

Academic Units: Measurement and Evaluation
Thesis Advisors: Lee, Young-Sun
Degree: Ph.D., Columbia University
Published Here: March 29, 2023