The Content of Their Coursework: Understanding Course-Taking Patterns at Community Colleges by Clustering Student Transcripts

Zeidenberg, Matthew; Scott, Marc

Community college students typically have access to a large selection of courses and programs, and therefore the student transcripts at any one college or college system tend to be very diverse. As a result, it is difficult for faculty, administrators, and researchers to understand the course-taking patterns of students in order to determine what programs of study they appear to be pursuing. Examining these patterns and then comparing them with listed program requirements would be very time-consuming; clustering can be a useful way to make sense of the relevant data. Clustering allows researchers to group similar items into clusters, relying only on a measure of the similarity of those items. In this paper, we apply a clustering algorithm to the problem of understanding college transcripts, which serve as the items to be clustered. To our knowledge, this is the first effort to organize transcripts based on their course content using clustering. We base the measure of similarity on the proportion of curricular subjects that each transcript has in common with every other one. Our data are community and technical college transcripts for a cohort of students who first entered the Washington State system during the fall of the 2005-06 academic year and who had no prior postsecondary experience. We used our clustering algorithm to cluster liberal arts students and career-technical students separately. We found that the algorithm did a good job of clustering each of these groups. The clusters roughly corresponded to programs of study, so we were able to estimate how many students were undertaking each program and what subjects students were studying within each cluster. We were also able to examine the demographics and the completion and transfer rates of the students within each cluster, in order to get an idea of what types of students were in each program of study and how successful they seemed to be in college. We found substantial variation on these dimensions as well as in the extent to which students' programs were either concentrated in a single subject or spread across several subjects. We conclude that this method would be useful to researchers throughout education who are trying to understand student course-taking patterns and programs of study, and who need to organize large amounts of transcript data.
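
The idea of grouping transcripts by the subjects they share can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact similarity formula or the clustering algorithm, so the sketch assumes a Jaccard index (shared subjects divided by all subjects on either transcript) and a simple threshold-based single-linkage grouping. The function names, the 0.5 threshold, the student IDs, and the subject codes are all hypothetical.

```python
from itertools import combinations


def subject_similarity(a, b):
    """Proportion of subjects two transcripts share (Jaccard index, an assumption)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_transcripts(transcripts, threshold=0.5):
    """Group transcripts whose similarity meets the threshold.

    Clusters are the connected components of the "similar enough" graph
    (single linkage), tracked with a small union-find structure.
    """
    ids = list(transcripts)
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        if subject_similarity(transcripts[a], transcripts[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical transcripts: student ID -> subjects in which courses were taken.
transcripts = {
    "s1": ["ENGL", "HIST", "PSYC"],
    "s2": ["ENGL", "HIST", "SOC"],
    "s3": ["WELD", "MACH"],
    "s4": ["WELD", "MACH", "MATH"],
}

if __name__ == "__main__":
    for cluster in cluster_transcripts(transcripts):
        print(cluster)
```

With these made-up transcripts, the two liberal-arts-style students fall into one group and the two career-technical-style students into another. A real analysis would likely weight subjects by credits or course counts and use a more robust clustering method than a fixed threshold.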



More About This Work

Academic Units
Community College Research Center
Community College Research Center, Teachers College, Columbia University
CCRC Working Papers, 35
Published Here
February 22, 2012