Data Science in Education Administration, Policy, and Practice
The purpose of this paper is to discuss the application of data science to education research and to issues of school organization, leadership, policy, and practice. The intended audience is education researchers, practitioners, and policymakers interested in an overview of the history and current discussion of data science in education organizations and management. I divide the paper into ten sections. In the first section, A Definition of Data Science, I survey the rich conversation in the research and practice literature across the broad data science domain, tracing the work of data scientists from “learning from data” to the application of a rich set of tools that includes data management, visualization, analysis (statistical and machine learning), and communication with management and stakeholders, with domain knowledge central to the work. Yet data science is not a new domain, and so in the second section, 50 Years of Data Analytics and Decision-Making: A Brief History, I consider what has been termed “50 years of data science” (Donoho, 2017) as the field developed from Tukey’s original work on Exploratory Data Analysis (EDA) in the 1960s and early 1970s. Importantly for education and organizational management, concurrent with Tukey’s work, organizational theorists such as Herbert Simon (1971) were making the same calls: the ever-increasing sets of data in organizations must be made interpretable by management in ways that help leaders and policymakers see the patterns that matter to organizational decision-making. Indeed, for education leadership and policy researchers, the research literature from 50 years ago closely mirrors the calls now made by what have become known as data scientists within education systems.
In the third section, Education Data Science and the 21st Century, I discuss the current state of the field of education data science (Agasisti & Bowers, 2017; Bowers, 2017; Piety et al., 2014) and the contemporary discussion of the need to apply the tools of data science in education research, policy, and practice. As an example, I turn briefly in the fourth section, Testing Management Ideas Using Data Science and Experimentation, to current discussions of the need for data science practices such as causal A/B testing to test management ideas: the vast majority of management ideas in education and elsewhere are never subjected to experimentation to assess whether their predictions were accurate, and thus organizations know very little about the extent to which management’s ideas fail or succeed as they are put into practice. Nevertheless, training programs and professional development in education leadership, administration, and policy rarely take up data science, and so in the fifth section, A Roadmap for Training in Education Data Science, I outline the current research on the skills and domains proposed across the education data science field, as well as next steps for building the needed capacity and training programs in data science as applied to issues of education organizations, leadership, and policy.
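As a minimal sketch of the kind of causal A/B test described above (not drawn from the paper itself, and using entirely hypothetical numbers), a randomized comparison of two management practices can be evaluated with a standard two-proportion z-test on an outcome rate:

```python
import math

def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion z-test: did practice B change the outcome rate
    relative to practice A? Returns the two rates, the z statistic, and the
    normal-approximation p-value."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled rate under the null hypothesis of no difference
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF via the error function
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_a, p_b, z, p_value

# Hypothetical example: on-time graduation under current practice (A)
# versus a new advising policy (B), with students randomly assigned.
p_a, p_b, z, p = two_proportion_ztest(success_a=400, n_a=500,
                                      success_b=440, n_b=500)
```

The point of the section is not this particular test but the discipline it represents: stating a management idea as a prediction, randomizing, and then checking how the prediction fared.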
Throughout this discussion, however, one might ask whether data science is simply statistics, such that current quantitative training and research across education may already address these issues. Hence, in the sixth section, Accuracy of Prediction Versus Model Fitting, I foreground the longstanding critique that almost all of applied statistics focuses on model fitting and on reporting p-values, variance explained, and effect sizes, yet what the decision-maker wants to know is how accurate the model’s predictions are (Breiman, 2001); accuracy of prediction is thus a core concern of data science, equal to or above model fitting. A focus on prediction, especially in machine learning, has downsides, however, and so in the seventh section, Machine Learning Only Learns From Data, Data from a Flawed and Inequitable System, I turn to issues of fairness and bias in prediction and early warning systems, discussing the “4As” of algorithms in education: Accurate, Accessible, Actionable, and Accountable (Bowers, 2021b).
To address these issues, the eighth section, The Common Task Framework (CTF): Building Capacity in Data Science, presents an overview of the Common Task Framework, which has been termed the “secret sauce” of data science (Donoho, 2017) and includes (a) open, large-scale, real-world deidentified datasets; (b) a shared culture of shared code for shared research; (c) public and open evaluation of algorithms; and (d) neutral referees. Then, in the ninth section, Data Science as a Third Methodology in Education Research, I propose data science as a third methodology in education research, joining quantitative and qualitative research; combined with a focus on theory, description, and prediction, it has the potential to bridge between methods domains that focus on tabular data, traditionally the province of quantitative methods, and unstructured, nontabular data such as text, images, and video, traditionally the province of qualitative methods. Finally, in the tenth section, Conclusion and a Look to the Future, I conclude by pointing to three potential near-term benefits of integrating data science into the lexicon of education research, administration, policy, and practice.
Files
- Bowers 2025 Data science in Ed Admin Policy and Practice.pdf (PDF, 602 KB)
More About This Work
- Academic Units: Education Leadership
- Published Here: August 4, 2025