Theses Doctoral

Diagnostic Classification Modeling of Rubric-Scored Constructed-Response Items

Muller, Eric William

The need for formative assessments has led to the development of a psychometric framework known as diagnostic classification models (DCMs), which are mathematical measurement models designed to estimate the possession or mastery of a designated set of skills or attributes within a chosen construct. Furthermore, much research has gone into the practice of “retrofitting” diagnostic measurement models to existing assessments in order to improve their diagnostic capability. Although retrofitting DCMs to existing assessments can theoretically improve diagnostic potential, it is also prone to challenges, including the difficulty of identifying multidimensional traits from largely unidimensional assessments, a lack of assessments that are suitable for the DCM framework, and problems of statistical quality, specifically highly correlated attributes and poor model fit. Another recent trend in assessment has been a move towards creating more authentic constructed-response assessments. For such assessments, rubric-based scoring is often seen as a method of providing reliable scoring and interpretive formative feedback. However, rubric-scored tests are limited in their diagnostic potential in that they are usually used to assign unidimensional numeric scores.
It is the purpose of this thesis to propose general methods for retrofitting DCMs to rubric-scored assessments. Two methods will be proposed and compared: (1) automatic construction of an attribute hierarchy to represent all possible numeric score levels from a rubric-scored assessment and (2) using rubric criterion score level descriptions to imply an attribute hierarchy. This dissertation will describe these methods, discuss the technical and mathematical issues that arise in using them, and apply both methods to a prominent rubric-scored test of critical thinking skills, the Collegiate Learning Assessment+ (CLA+), comparing the results. Finally, the utility of the proposed methods will be compared to a reasonable alternative methodology for this type of test score data: the use of polytomous IRT models, including the Graded Response Model (GRM), the Partial Credit Model (PCM), and the Generalized Partial Credit Model (G-PCM).
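For readers unfamiliar with the polytomous IRT alternative named above, the following is a minimal sketch of how the Graded Response Model assigns probabilities to ordered rubric score levels. It is not taken from the thesis: the function name `grm_category_probs` and the parameter values are illustrative assumptions, and the code shows only the standard form of the model, in which the probability of scoring at or above level k is a logistic function of the examinee's ability.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Score-level probabilities under the graded response model.

    theta      -- examinee ability (latent trait value)
    a          -- item discrimination (assumed positive)
    thresholds -- increasing list b_1 < ... < b_K of category thresholds
    Returns K + 1 probabilities, one per score level 0..K.
    (Function name and signature are hypothetical, for illustration only.)
    """
    # Cumulative probabilities P(X >= k | theta) for k = 1..K.
    cumulative = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    # P(X >= 0) is 1 and P(X >= K + 1) is 0 by definition.
    bounds = [1.0] + cumulative + [0.0]
    # Each level's probability is the difference of adjacent cumulative values.
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


# Illustrative values only: a four-level rubric criterion (scores 0 to 3).
probs = grm_category_probs(theta=0.5, a=1.2, thresholds=[-1.0, 0.0, 1.0])
```

The result is a single probability distribution over score levels driven by one ability value, which is the unidimensional view of rubric scores that the DCM-based methods are meant to be compared against.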



More About This Work

Academic Units
Thesis Advisors
Corter, James E.
Ph.D., Columbia University
Published Here
May 15, 2018