Theses Doctoral

Latent Variable Models in Measurement: Theory and Application

Fang, Guanhua

Latent variable models play an important role in educational and psychological measurement, where items are presented to individuals, resulting in item response data. Such data contain important information about individual latent traits, population structure, and item design, all of which are key components to understand in educational and psychological assessment. This thesis develops statistical learning methods based on latent variable models, together with the identifiability theory that supports them. It consists of three parts, each with a different kind of application in mind.

The first part concerns the identifiability of diagnostic classification models (DCMs), a special subfamily of latent class models. DCMs aim to assess a test taker's ability based on his or her mastery of a set of required skills. A key issue common to DCMs, and more generally to latent class models, is identifiability: whether the unknown model or its parameters can be estimated consistently under a suitably defined asymptotic regime. Most existing work focuses on the identifiability of DCMs with binary responses and attributes.

In this thesis, we provide general identifiability results for DCMs with polytomous responses and attributes and with fewer parameter restrictions.

The second part considers the identifiability of testlet factor models, a subfamily of latent variable models in which the underlying continuous latent traits are assumed to follow a normal distribution. Like DCMs, factor models suffer from identifiability issues: in general, the parameters can be identified only up to a rotation. Testlet models or bifactor models, however, are popular in educational assessment. They are constrained factor models that assume the responses to test items can be accounted for by one primary factor and multiple secondary group-specific factors. With the aid of this special structure, we show that the model can be strictly identifiable, and we provide checkable necessary and sufficient conditions accordingly.
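The rotation issue mentioned above can be illustrated with a minimal sketch. It is not taken from the thesis: the loading values, uniquenesses, rotation angle, and function names are all hypothetical, and the sketch assumes a standard linear factor model whose implied covariance is Lambda Lambda^T + diag(psi) with uncorrelated, unit-variance factors. It only shows that an unconstrained loading matrix and a rotated copy of it imply the same covariance matrix.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def implied_covariance(loadings, uniquenesses):
    """Return Lambda Lambda^T + diag(psi), assuming uncorrelated unit-variance factors."""
    common = matmul(loadings, transpose(loadings))
    size = len(common)
    return [[common[i][j] + (uniquenesses[i] if i == j else 0.0)
             for j in range(size)]
            for i in range(size)]


def rotation(theta):
    """2 x 2 orthogonal rotation matrix for angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


# Hypothetical values: 4 items, 2 factors, no structural constraints.
loadings = [[0.8, 0.3], [0.7, 0.4], [0.6, -0.2], [0.5, 0.5]]
uniquenesses = [0.3, 0.4, 0.5, 0.6]

# Rotate the loadings by an arbitrary angle.
rotated = matmul(loadings, rotation(0.7))

sigma_original = implied_covariance(loadings, uniquenesses)
sigma_rotated = implied_covariance(rotated, uniquenesses)

# Largest entrywise difference between the two implied covariance matrices.
max_gap = max(abs(x - y)
              for row_x, row_y in zip(sigma_original, sigma_rotated)
              for x, y in zip(row_x, row_y))

# The loading matrices themselves are different.
loadings_differ = any(abs(x - y) > 1e-6
                      for row_x, row_y in zip(loadings, rotated)
                      for x, y in zip(row_x, row_y))
```

Here `max_gap` is zero up to floating-point error while `loadings_differ` is true, so the observed covariance alone cannot distinguish the two loading matrices. Structural constraints such as those of testlet or bifactor models are what can remove this ambiguity.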

The third part focuses on statistical learning for complex problem-solving (CPS) items. With advances in computer technology, there is a new trend of developing CPS test items on online platforms, where examinees are asked to solve challenging tasks in a simulated environment. During the test, all actions performed by examinees are recorded in a log file. We can therefore observe not only their final responses but also their entire solving process. This type of data is known as process data in the measurement literature. Traditional item response models are not applicable, at least not directly, and the analysis of process data is still in its infancy. In this thesis, we propose a new model-based approach and show its usefulness through a real data application.


More About This Work

Academic Units
Thesis Advisors
Ying, Zhiliang
Ph.D., Columbia University
Published Here
July 15, 2020