Alex J. Bowers
Teachers College, Columbia University
Bowers@tc.columbia.edu
Maeghan Sill
Teachers College, Columbia University
mms2380@tc.columbia.edu
This document is the online R markdown supplement to the report titled “Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles” https://doi.org/10.7916/c0jk-5e64
This document contains the R code and visualizations for each of the Interrater Reliability (IRR) figures in the report.
Note: For more information on each of the visualizations, please refer to the full report published online.
This research was made possible by support from The Wallace Foundation. The opinions described herein are solely those of the authors.
To cite this online supplement appendix: Bowers, A.J., Sill, M. (2025). Online Supplement 12 Appendix for Bowers et al. (2025) Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles: Interrater Reliability Visualizations and Code. Teachers College, Columbia University. New York, NY. https://doi.org/10.7916/3vny-6b48
Copyright © The Authors, 2025.
Text, images, and data in this work are licensed under the Creative
Commons Attribution 4.0 license (CC BY 4.0)
https://creativecommons.org/licenses/by/4.0/
Code, markdowns, and scripts are released under the MIT License
https://opensource.org/licenses/MIT
This Markdown document serves as a comprehensive online supplement
that provides complete transparency and reproducibility for the
interrater reliability (IRR) analysis conducted in the FAIR Data
Principles study of K-12 state education indicator data. The document
contains all R code, statistical computations, and data visualizations
used to assess agreement between raters who evaluated the extent to
which state education datasets could be mapped to the National Academies
of Sciences, Engineering, and Medicine (NASEM) framework indicators. By
including detailed code for Cohen’s kappa calculations, percentage
agreement metrics, weighted kappa statistics, and sophisticated
visualizations (including heatmaps with marginal plots and balloon
plots), this supplement enables other researchers to fully understand
the methodology, reproduce the findings, and potentially adapt the
approach for similar data mapping reliability studies in educational
research.
The file begins with loading the necessary packages and cleaning the
data. Please see the sections below for more information about each of
these. Next, the state-level analysis and data visualizations are
presented in the following order: Ohio, Texas, North Carolina,
Washington, D.C., Kentucky, Maryland, Oregon, and California. The final
section presents the cross-state analysis and data visualizations.
This analysis uses the Interrater Reliability Tracker (IRRT)
dataset, which contains ratings from multiple validators who assessed how
well state education datasets could be mapped to the National Academies
of Sciences, Engineering, and Medicine (NASEM) framework indicators. Two
independent raters (identified as “A” and “B”) evaluated each identified
and screened state education dataset from Ohio, Texas, North Carolina,
Washington D.C., Kentucky, Maryland, Oregon, and California according to
the coding schema described in the full report. The raters then
conferenced about disagreements, recording their reasoning and how they
reached each final decision. These final decisions are what is
coded in the “nasem” columns in each state’s Metadata Megatable (MDMT).
These MDMT/IRRT datasets are available through Columbia University’s
Academic Commons.
The actual file name for the Interrater Reliability Tracker is
“20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx”. This data can be
downloaded from: Online Supplement 10 Appendix for Bowers et al. (2025)
Mapping Public Open Access K-12 State Education Indicator Data Across 7
States and Washington D.C. Using the FAIR Data Principles: Interrater
Reliability Tracker Data. https://doi.org/10.7916/0bv2-zm35
The file has four tabs: ABY_Merge, Conference_Notes, Final_Decisions,
and Data_Dictionary.

ABY_Merge includes the two raters’ ratings for each dataset assessed
for eligibility in the Screening phase of the project. Each row is a
unique rater and indicator combination and includes the following
variables: id, validation_round, rating, quote, quote_page, reasoning,
year, supplemental_links, dataset_name, and rater.

Conference_Notes includes one row for each dataset screened for
eligibility, with notes about whether the raters conferenced about
disagreements in their ratings. The validator_conference,
state_captain_conference, and pi_conference columns are binary: 0
indicates that level of conferencing did not take place and 1 indicates
that it did. The variables for this tab are: id, conference_notes,
validator_conference, state_captain_conference, pi_conference, year,
dataset_name, rater_a, and rater_b.

The Final_Decisions tab includes the results that the two raters agreed
upon after any necessary conferencing, with one row for each dataset and
indicator combination. It includes the following variables: id,
indicator, rating, quote, quote_page, reasoning, year,
supplemental_links, and dataset_name.

The Data_Dictionary tab documents the other tabs in the file and
includes the following variables: sheet, variable_name, variable_label,
variable_type, data_values, assigned_missing_value, and notes.

For more information about the screening process and its results, please
refer to the full “Mapping Public Open Access K-12 State Education
Indicator Data Across 7 States and Washington D.C. Using the FAIR Data
Principles” report.
We used R version 4.5.1 [@base] and the following R packages for this analysis: tidyverse v. 2.0.0 [@tidyverse] for comprehensive data manipulation, visualization, and the overall analytical workflow; irr v. 0.84.1 [@irr] for computing Cohen’s kappa and other interrater reliability statistics; ggpubr v. 0.6.2 [@ggpubr] for creating publication-ready statistical graphics including balloon plots; viridis v. 0.6.5 [@viridis] for perceptually uniform color scales in visualizations; ComplexHeatmap v. 2.25.2 [@ComplexHeatmap2016; @ComplexHeatmap2022] along with circlize v. 0.4.16 [@circlize] and BiocManager v. 1.30.26 [@BiocManager] for creating sophisticated heatmap visualizations with clustering capabilities; hopach v. 2.70.0 [@hopach] for hierarchical clustering algorithms; cowplot v. 1.2.0 [@cowplot] and patchwork v. 1.3.2 [@patchwork] for combining and arranging multiple plots into complex figure layouts; scales v. 1.4.0 [@scales] for axis scaling and formatting; svglite v. 2.2.2 [@svglite] and rsvg v. 2.7.0 [@rsvg] for high-quality vector graphics output; visdat v. 0.6.0 [@visdat] for data structure visualization; devtools v. 2.4.6 [@devtools] for package development utilities; Kendall v. 2.2.1 [@Kendall] for non-parametric correlation analysis; and rmarkdown v. 2.30 [@rmarkdown2018; @rmarkdown2020; @rmarkdown2025] for creating this reproducible research document. These packages can be found on the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/ or through Bioconductor at https://www.bioconductor.org/ for specialized bioinformatics packages.
Loading the packages.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.2
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(irr)
## Warning: package 'irr' was built under R version 4.5.2
## Loading required package: lpSolve
## Warning: package 'lpSolve' was built under R version 4.5.2
library(ggplot2)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.5.2
library(readxl)
library(Kendall)
## Warning: package 'Kendall' was built under R version 4.5.2
library(dplyr)
library(viridis)
## Loading required package: viridisLite
library(ComplexHeatmap)
## Loading required package: grid
## ========================================
## ComplexHeatmap version 2.25.2
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
##
## If you use it in published research, please cite either one:
## - Gu, Z. Complex Heatmap Visualization. iMeta 2022.
## - Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional
## genomic data. Bioinformatics 2016.
##
##
## The new InteractiveComplexHeatmap package can directly export static
## complex heatmaps into an interactive Shiny app with zero effort. Have a try!
##
## This message can be suppressed by:
## suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
library(BiocManager)
## Warning: package 'BiocManager' was built under R version 4.5.2
library(hopach)
## Loading required package: cluster
## Loading required package: Biobase
## Loading required package: BiocGenerics
## Loading required package: generics
##
## Attaching package: 'generics'
##
## The following object is masked from 'package:lubridate':
##
## as.difftime
##
## The following object is masked from 'package:dplyr':
##
## explain
##
## The following objects are masked from 'package:base':
##
## as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
## setequal, union
##
##
## Attaching package: 'BiocGenerics'
##
## The following object is masked from 'package:dplyr':
##
## combine
##
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
##
## The following objects are masked from 'package:base':
##
## anyDuplicated, aperm, append, as.data.frame, basename, cbind,
## colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
## get, grep, grepl, is.unsorted, lapply, Map, mapply, match, mget,
## order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
## rbind, Reduce, rownames, sapply, saveRDS, table, tapply, unique,
## unsplit, which.max, which.min
##
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
##
##
## Attaching package: 'hopach'
##
## The following object is masked from 'package:generics':
##
## prune
library(circlize)
## Warning: package 'circlize' was built under R version 4.5.2
## ========================================
## circlize version 0.4.16
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
##
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
## in R. Bioinformatics 2014.
##
## This message can be suppressed by:
## suppressPackageStartupMessages(library(circlize))
## ========================================
library(tibble)
library(devtools)
## Warning: package 'devtools' was built under R version 4.5.2
## Loading required package: usethis
## Warning: package 'usethis' was built under R version 4.5.2
##
## Attaching package: 'devtools'
##
## The following object is masked from 'package:BiocManager':
##
## install
library(svglite)
## Warning: package 'svglite' was built under R version 4.5.2
library(stringr)
library(scales)
##
## Attaching package: 'scales'
##
## The following object is masked from 'package:viridis':
##
## viridis_pal
##
## The following object is masked from 'package:purrr':
##
## discard
##
## The following object is masked from 'package:readr':
##
## col_factor
library(cowplot)
##
## Attaching package: 'cowplot'
##
## The following object is masked from 'package:ggpubr':
##
## get_legend
##
## The following object is masked from 'package:lubridate':
##
## stamp
library(patchwork)
##
## Attaching package: 'patchwork'
##
## The following object is masked from 'package:cowplot':
##
## align_plots
library(visdat)
## Warning: package 'visdat' was built under R version 4.5.2
library(knitr)
library(grateful)
Loading in the ABY Merge dataset (the first sheet of the Excel file
“20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx”).
This dataset can be downloaded from: Online Supplement 10 Appendix for
Bowers et al. (2025) Mapping Public Open Access K-12 State Education
Indicator Data Across 7 States and Washington D.C. Using the FAIR Data
Principles: Interrater Reliability Tracker Data. https://doi.org/10.7916/0bv2-zm35.
To use, download the file into the same folder as this RStudio project
(or adjust the path in the read_excel() call) and then read the file as
in the code below.
abymerge <- read_excel("C:/Users/Public/Downloads/20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx",
sheet = 1,
col_types = "text"
) %>%
select(id, rater, validation_round, dataset_name, indicator, rating) %>%
mutate(rating = as.numeric(rating))
head(abymerge)
## # A tibble: 6 × 6
## id rater validation_round dataset_name indicator rating
## <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 CA1218 R514 A expulsion12.txt K-12_Nonexclusionary_dis… 4
## 2 CA1218 R562 B expulsion12.txt K-12_Nonexclusionary_dis… 4
## 3 CA1218 R562 B expulsion12.txt K-12_School_climate 2
## 4 CA1219 R514 A expulsion13.txt K-12_Nonexclusionary_dis… 4
## 5 CA1219 R562 B expulsion13.txt K-12_Nonexclusionary_dis… 4
## 6 CA1219 R562 B expulsion13.txt K-12_School_climate 2
str(abymerge)
## tibble [16,470 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:16470] "CA1218" "CA1218" "CA1218" "CA1219" ...
## $ rater : chr [1:16470] "R514" "R562" "R562" "R514" ...
## $ validation_round: chr [1:16470] "A" "B" "B" "A" ...
## $ dataset_name : chr [1:16470] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
## $ indicator : chr [1:16470] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
## $ rating : num [1:16470] 4 4 2 4 4 2 4 4 2 4 ...
Filtering to only raters A and B.
rater_ab <- abymerge %>%
filter(validation_round %in% c("A", "B")) %>%
mutate(id = as.character(id),
indicator = trimws(indicator))
str(rater_ab)
## tibble [13,736 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:13736] "CA1218" "CA1218" "CA1218" "CA1219" ...
## $ rater : chr [1:13736] "R514" "R562" "R562" "R514" ...
## $ validation_round: chr [1:13736] "A" "B" "B" "A" ...
## $ dataset_name : chr [1:13736] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
## $ indicator : chr [1:13736] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
## $ rating : num [1:13736] 4 4 2 4 4 2 4 4 2 4 ...
This step creates a universal list of indicator names and saves them as a dataframe for use later.
indicator_names <- c(
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicators_df <- data.frame(indicator = indicator_names)
Filtering for only Ohio datasets.
OH_rater_ab <- rater_ab %>%
filter(str_starts(id, "OH"))
str(OH_rater_ab)
## tibble [2,158 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:2158] "OH1218" "OH1218" "OH1219" "OH1219" ...
## $ rater : chr [1:2158] "R562" "R294" "R294" "R562" ...
## $ validation_round: chr [1:2158] "A" "B" "B" "A" ...
## $ dataset_name : chr [1:2158] "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "20-21_Achievement_Building.xlsx" "20-21_Achievement_Building.xlsx" ...
## $ indicator : chr [1:2158] "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" ...
## $ rating : num [1:2158] 3 1 3 3 2 3 3 2 3 3 ...
OH_duplicates_check <- OH_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(OH_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Making sure each id, indicator, and validation-round combination is present.
OH_expanded_grid <- OH_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(OH_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 OH1218 21-22_Achievement_Building.xlsx EA_On-time_graduation A
## 2 OH1218 21-22_Achievement_Building.xlsx EA_On-time_graduation B
## 3 OH1218 21-22_Achievement_Building.xlsx EA_Postsecondary_read… A
## 4 OH1218 21-22_Achievement_Building.xlsx EA_Postsecondary_read… B
## 5 OH1218 21-22_Achievement_Building.xlsx K-12_Access_to_effect… A
## 6 OH1218 21-22_Achievement_Building.xlsx K-12_Access_to_effect… B
Making sure that id/indicator combinations that were not rated are represented as zero.
OH_ratings_with_zeros <- OH_expanded_grid %>%
left_join(OH_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
OH_ab_long <- OH_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(OH_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 OH1218 21-22_Achievement_Building.xlsx EA_On-time… 0 0 1
## 2 OH1218 21-22_Achievement_Building.xlsx EA_Postsec… 0 0 1
## 3 OH1218 21-22_Achievement_Building.xlsx K-12_Acces… 0 0 1
## 4 OH1218 21-22_Achievement_Building.xlsx K-12_Acces… 0 0 1
## 5 OH1218 21-22_Achievement_Building.xlsx K-12_Acces… 0 0 1
## 6 OH1218 21-22_Achievement_Building.xlsx K-12_Curri… 0 0 1
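The complete-the-grid-then-zero-fill pattern above can be sketched on a toy example in base R (the ids and indicator names here are illustrative, not from the IRRT data): a full cross of ids and indicators is merged against the observed ratings, and any pair neither rater scored receives a rating of 0.

```r
# Toy illustration of the grid-completion step, using base R only
# (the analysis itself uses crossing() and left_join() from the tidyverse).
rated <- data.frame(
  id        = c("OH1", "OH1"),
  indicator = c("Performance_on_tests", "Engagement_in_schooling"),
  rating    = c(3, 2)
)
full_grid <- expand.grid(
  id        = "OH1",
  indicator = c("Performance_on_tests", "Engagement_in_schooling",
                "School_climate"),
  stringsAsFactors = FALSE
)
completed <- merge(full_grid, rated, by = c("id", "indicator"), all.x = TRUE)
completed$rating[is.na(completed$rating)] <- 0  # unrated pairs become 0
completed  # the School_climate row now carries a rating of 0
```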
OH_dataset_agreement <- OH_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
OH_dataset_agreement <- OH_dataset_agreement %>%
ungroup()
n_distinct(OH_dataset_agreement$id)
## [1] 446
head(OH_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 OH1218 93.8
## 2 OH1219 87.5
## 3 OH1220 87.5
## 4 OH1221 87.5
## 5 OH1222 93.8
## 6 OH1223 93.8
write.csv(OH_dataset_agreement, "20251113_percent_agreement_by_id_Ohio.csv", row.names = FALSE)
OH_indicators_mapped_per_id <- OH_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(OH_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
OH_dataset_agreement_count <- OH_dataset_agreement %>%
left_join(OH_indicators_mapped_per_id, by = "id")
head(OH_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 OH1218 93.8 1
## 2 OH1219 87.5 2
## 3 OH1220 87.5 2
## 4 OH1221 87.5 2
## 5 OH1222 93.8 1
## 6 OH1223 93.8 1
OH_dataset_agreement_count <- OH_dataset_agreement_count %>%
arrange(desc(percent_agreement), (indicators_mapped_per_id))
OH_dataset_agreement_count$id <- factor(OH_dataset_agreement_count$id, levels = unique(OH_dataset_agreement_count$id))
OH_dataset_agreement_count$indicators_mapped_per_id <-
factor(OH_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
OH_percentagreement_barplot <- ggplot(OH_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Ohio - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
OH_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Ohio.svg", plot = OH_percentagreement_barplot, width = 6.5, height = 6.5)
Printing the results for the chart in the findings section.
OH_percent_agreement <- OH_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
OH_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 74.4
## 2 EA_Postsecondary_readiness 97.8
## 3 K-12_Access_to_effective_teaching 98.4
## 4 K-12_Access_to_high-quality_academic_supports 85.2
## 5 K-12_Access_to_rigorous_coursework 90.1
## 6 K-12_Curricular_breadth 93.0
## 7 K-12_Engagement_in_schooling 71.1
## 8 K-12_Nonacademic_supports_for_student_success 100
## 9 K-12_Nonexclusionary_discipline_practices 100
## 10 K-12_Performance_in_coursework 83.9
## 11 K-12_Performance_on_tests 66.4
## 12 K-12_School_climate 100
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 95.7
## 14 Pre-K_Academic_readiness 97.8
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 100
## 16 Pre-K_Self-regulation_and_attention_skills 97.3
OH_ab_long <- OH_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Now calculating Cohen’s kappa, first for the whole dataset, then by indicator.
OH_contingency_table <- data.frame(OH_ab_long$rating_a, OH_ab_long$rating_b)
OH_kappa_result <- kappa2(OH_contingency_table, weight = "unweighted")
print(OH_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 7136
## Raters = 2
## Kappa = 0.654
##
## z = 76
## p-value = 0
OH_kappa_by_indicator <- OH_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
OH_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
OH_kappa_result <- kappa2(OH_contingency_table, weight = "unweighted")
OH_kappa_result$value
},
.groups = "drop"
)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(OH_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.605
## 2 EA_Postsecondary_readiness 0.819
## 3 K-12_Access_to_effective_teaching 0.879
## 4 K-12_Access_to_high-quality_academic_supports -0.0335
## 5 K-12_Access_to_rigorous_coursework 0.445
## 6 K-12_Curricular_breadth 0.546
## 7 K-12_Engagement_in_schooling 0.553
## 8 K-12_Nonacademic_supports_for_student_success 1
## 9 K-12_Nonexclusionary_discipline_practices 1
## 10 K-12_Performance_in_coursework -0.0527
## 11 K-12_Performance_on_tests 0.515
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.758
## 14 Pre-K_Academic_readiness 0.854
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs NaN
## 16 Pre-K_Self-regulation_and_attention_skills 0
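The NaN for the Pre-K access indicator is expected here: neither rater mapped any Ohio dataset to it (it is absent from the mapped counts below), so after zero-filling both raters used a single constant category. Observed and expected agreement are then both 1, and kappa = (p_o − p_e) / (1 − p_e) = 0/0. A minimal hand computation shows this; `cohens_kappa` is an illustrative base-R helper, not part of the analysis, which uses irr::kappa2().

```r
# Why kappa can be NaN: Cohen's kappa computed by hand.
cohens_kappa <- function(a, b) {
  categories <- union(a, b)
  tab <- table(factor(a, levels = categories), factor(b, levels = categories))
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                      # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  (po - pe) / (1 - pe)
}
cohens_kappa(c(1, 2, 1, 2), c(1, 2, 2, 2))  # ordinary case: 0.5
cohens_kappa(c(4, 4, 4), c(4, 4, 4))        # constant ratings: 0/0 = NaN
```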
Counting the number of datasets mapped to each indicator to create a plot afterwards.
OH_mapped_counts <- OH_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
OH_mapped_counts
## # A tibble: 15 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 278
## 2 EA_Postsecondary_readiness 34
## 3 K-12_Access_to_effective_teaching 34
## 4 K-12_Access_to_high-quality_academic_supports 66
## 5 K-12_Access_to_rigorous_coursework 64
## 6 K-12_Curricular_breadth 49
## 7 K-12_Engagement_in_schooling 268
## 8 K-12_Nonacademic_supports_for_student_success 1
## 9 K-12_Nonexclusionary_discipline_practices 6
## 10 K-12_Performance_in_coursework 72
## 11 K-12_Performance_on_tests 338
## 12 K-12_School_climate 6
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 53
## 14 Pre-K_Academic_readiness 42
## 15 Pre-K_Self-regulation_and_attention_skills 12
OH_indicator_kappa_count_merge <- OH_kappa_by_indicator %>%
left_join(OH_mapped_counts, by = "indicator")
head(OH_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.605 278
## 2 EA_Postsecondary_readiness 0.819 34
## 3 K-12_Access_to_effective_teaching 0.879 34
## 4 K-12_Access_to_high-quality_academic_supports -0.0335 66
## 5 K-12_Access_to_rigorous_coursework 0.445 64
## 6 K-12_Curricular_breadth 0.546 49
clean_labels <- sub("^[^_]+_", "", OH_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
OH_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(OH_indicator_kappa_count_merge$kappa_stat)]
)
OH_balloonplot <- ggballoonplot(OH_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Ohio Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.654",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
) +
coord_flip()
OH_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
ggsave("20251113_IRR_kappa_balloonplot_Ohio.svg", plot = OH_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
OH_heatmap_long <- OH_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
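As a cross-check of the case_when() rules above, the same scheme can be restated as a small base-R function and tabulated over every pair of ratings 0–4 (`weight_for` is an illustrative helper, not part of the analysis): ratings 1–2 are treated as indirect mappings and 3–4 as direct mappings.

```r
# Restate the weighting rules and tabulate them over all rating pairs.
weight_for <- function(a, b) {
  band <- function(r) c("zero", "indirect", "indirect", "direct", "direct")[r + 1]
  x <- band(a); y <- band(b)
  if (x == "zero" && y == "zero") 0                          # not rated
  else if (x == y) 1                                         # same band = agreement
  else if ("zero" %in% c(x, y) && "direct" %in% c(x, y)) 3   # serious disagreement
  else 2                                                     # moderate disagreement
}
w <- outer(0:4, 0:4, Vectorize(weight_for))
dimnames(w) <- list(rater_a = 0:4, rater_b = 0:4)
w  # symmetric 5 x 5 table of agreement weights
```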
clean_labels <- sub("^[^_]+_", "", OH_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
OH_heatmap_long$indicator_cleaned <- wrapped_labels
OH_heatmap_long_selected <- OH_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(OH_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 OH1218 "On-time graduation" 0
## 2 OH1218 "Postsecondary readiness" 0
## 3 OH1218 "Access to effective teaching" 0
## 4 OH1218 "Access to high-quality academic\nsupports" 0
## 5 OH1218 "Access to rigorous coursework" 0
## 6 OH1218 "Curricular breadth" 0
Dataset disagreement with marginal plots.
# 1. Set factor levels based on heatmap ordering
OH_indicator_levels <- levels(with(OH_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OH_dataset_levels <- levels(with(OH_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
OH_agreement_heatmap <- ggplot(OH_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Ohio Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
OH_legend <- cowplot::get_legend(OH_agreement_heatmap)
OH_legend_plot <- wrap_elements(OH_legend)
# 4. Remove legend from main plot
OH_main_plot <- OH_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
OH_top_plot <- OH_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = OH_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
OH_right_plot <- OH_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = OH_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
OH_final_plot <- OH_top_plot + OH_legend + OH_main_plot + OH_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
OH_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Ohio.svg", plot = OH_final_plot, width = 8, height = 12)
Creating the PNG version of the heatmap.
# 1. Set factor levels based on heatmap ordering
OH_indicator_levels <- levels(with(OH_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OH_dataset_levels <- levels(with(OH_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
OH_agreement_heatmap <- ggplot(OH_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Ohio Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5) # Center the title
)
# 3. Extract the legend
OH_legend <- cowplot::get_legend(OH_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
OH_main_plot <- OH_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
OH_top_plot <- OH_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = OH_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
OH_right_data <- OH_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = OH_dataset_levels))
OH_right_plot <- ggplot(OH_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") + # <-- horizontal axis after flip
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
OH_final_plot3 <- OH_top_plot + OH_legend + OH_main_plot + OH_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
OH_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Ohio.png", plot = OH_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, while 1 now represents an indirect rating and 2 represents a direct rating.
OH_ab_directindirect <- OH_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
OH_ratings_kappaweighted <- OH_ab_directindirect %>%
select(rating_a, rating_b)
OH_kappa_weighted <- kappa2(OH_ratings_kappaweighted, weight = "equal")
print(OH_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 7136
## Raters = 2
## Kappa = 0.716
##
## z = 62.2
## p-value = 0
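As a sanity check on what `weight = "equal"` does, the linearly weighted statistic from `kappa2()` can be reproduced by hand. A minimal sketch on made-up 0/1/2 ratings (not study data), assuming the irr package is loaded as elsewhere in this supplement:

```r
library(irr)  # provides kappa2()

# toy 0/1/2 ratings (made-up data, not from the study)
a <- c(0, 0, 1, 1, 2, 2, 0, 2, 1, 0)
b <- c(0, 1, 1, 2, 2, 0, 0, 2, 0, 0)
k_irr <- kappa2(data.frame(a, b), weight = "equal")$value

# linear ("equal") weighting is algebraically equivalent to
# kappa_w = 1 - sum(w * observed) / sum(w * expected), with w[i, j] = |i - j|,
# so a 0-vs-2 disagreement costs twice as much as an adjacent 0-vs-1
cats <- sort(unique(c(a, b)))
obs  <- table(factor(a, cats), factor(b, cats)) / length(a)
expd <- outer(rowSums(obs), colSums(obs))
w    <- abs(outer(cats, cats, "-"))
k_manual <- 1 - sum(w * obs) / sum(w * expd)
all.equal(k_irr, k_manual)
```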
OH_weighted_kappa_by_indicator <- OH_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(pick(rating_a, rating_b), weight = "equal")$value)
OH_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.633
## 2 EA_Postsecondary_readiness 0.789
## 3 K-12_Access_to_effective_teaching 0.877
## 4 K-12_Access_to_high-quality_academic_supports -0.0426
## 5 K-12_Access_to_rigorous_coursework 0.459
## 6 K-12_Curricular_breadth 0.647
## 7 K-12_Engagement_in_schooling 0.603
## 8 K-12_Nonacademic_supports_for_student_success 1
## 9 K-12_Nonexclusionary_discipline_practices 1
## 10 K-12_Performance_in_coursework -0.0711
## 11 K-12_Performance_on_tests 0.547
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.758
## 14 Pre-K_Academic_readiness 0.893
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… NA
## 16 Pre-K_Self-regulation_and_attention_skills 0
IRRstatistics_Ohio <- OH_indicator_kappa_count_merge %>%
left_join(OH_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(OH_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
# Create a mapping of old names to new names
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Ohio_ordered <- IRRstatistics_Ohio %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Ohio_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 53 | 95.73991 | 0.7580378 | 0.7580378 |
| High-Quality Pre-K Programs | NA | 100.00000 | NA | NA |
| Effective Teaching | 34 | 98.43049 | 0.8786630 | 0.8769413 |
| Rigorous Coursework | 64 | 90.13453 | 0.4451795 | 0.4592629 |
| Curricular Breadth | 49 | 93.04933 | 0.5463911 | 0.6468111 |
| High-Quality Academic Supports | 66 | 85.20179 | -0.0334948 | -0.0425627 |
| School Climate | 6 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonexclusionary Discipline Practices | 6 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonacademic Supports for Student Success | 1 | 100.00000 | 1.0000000 | 1.0000000 |
| Academic Readiness | 42 | 97.75785 | 0.8538088 | 0.8929722 |
| Self-Regulation and Attention Skills | 12 | 97.30942 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 268 | 71.07623 | 0.5527206 | 0.6031111 |
| Performance in Coursework | 72 | 83.85650 | -0.0527144 | -0.0710500 |
| Performance on Tests | 338 | 66.36771 | 0.5153473 | 0.5474093 |
| On-Time Graduation | 278 | 74.43946 | 0.6045300 | 0.6331089 |
| Postsecondary Readiness | 34 | 97.75785 | 0.8190816 | 0.7887146 |
write.csv(IRRstatistics_Ohio_ordered, "20251113_IRRstatistics_Ohio.csv", row.names = FALSE)
Filtering for Texas datasets only.
TX_rater_ab <- rater_ab %>%
filter(str_starts(id, "TX"))
str(TX_rater_ab)
## tibble [2,920 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:2920] "TX1218" "TX1218" "TX1218" "TX1218" ...
## $ rater : chr [1:2920] "R294" "R398" "R294" "R398" ...
## $ validation_round: chr [1:2920] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:2920] "CREF.dat" "CREF.dat" "CREF.dat" "CREF.dat" ...
## $ indicator : chr [1:2920] "EA_Postsecondary_readiness" "EA_Postsecondary_readiness" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
## $ rating : num [1:2920] 1 1 1 3 4 4 4 4 4 4 ...
TX_duplicates_check <- TX_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(TX_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).
TX_rater_ab <- TX_rater_ab %>%
filter(!(id %in% c("TX1252")))
Making sure each id, indicator, and validation round combination is present.
TX_expanded_grid <- TX_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(TX_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 TX1218 CREF.dat EA_On-time_graduation A
## 2 TX1218 CREF.dat EA_On-time_graduation B
## 3 TX1218 CREF.dat EA_Postsecondary_readiness A
## 4 TX1218 CREF.dat EA_Postsecondary_readiness B
## 5 TX1218 CREF.dat K-12_Access_to_effective_teaching A
## 6 TX1218 CREF.dat K-12_Access_to_effective_teaching B
Making sure that id/indicator combinations that were not rated are represented as zero.
TX_ratings_with_zeros <- TX_expanded_grid %>%
left_join(TX_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
TX_ab_long <- TX_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(TX_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 TX1218 CREF.dat EA_On-time_graduation 0 0 1
## 2 TX1218 CREF.dat EA_Postsecondary_readiness 1 1 1
## 3 TX1218 CREF.dat K-12_Access_to_effective_teac… 0 0 1
## 4 TX1218 CREF.dat K-12_Access_to_high-quality_a… 0 0 1
## 5 TX1218 CREF.dat K-12_Access_to_rigorous_cours… 0 0 1
## 6 TX1218 CREF.dat K-12_Curricular_breadth 0 0 1
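The pivot from long to wide can be traced on a toy example mirroring the `pivot_wider()` step above (ids and ratings here are made up):

```r
library(dplyr)
library(tidyr)

# toy long table: one row per validation round (made-up data)
toy <- tibble(
  id = c("TX0001", "TX0001", "TX0002", "TX0002"),
  indicator = "K-12_Performance_on_tests",
  validation_round = c("A", "B", "A", "B"),
  rating = c(4, 4, 0, 3)
)
toy_wide <- toy %>%
  pivot_wider(names_from = validation_round, values_from = rating,
              names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
toy_wide
# one row per id: TX0001 agrees (4, 4); TX0002 disagrees (0, 3)
```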
TX_dataset_agreement <- TX_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
TX_dataset_agreement <- TX_dataset_agreement %>%
ungroup()
n_distinct(TX_dataset_agreement$id)
## [1] 825
head(TX_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 TX1218 93.8
## 2 TX1219 81.2
## 3 TX1220 81.2
## 4 TX1221 87.5
## 5 TX1222 93.8
## 6 TX1223 81.2
write.csv(TX_dataset_agreement, "20251113_percent_agreement_by_id_Texas.csv", row.names = FALSE)
TX_indicators_mapped_per_id <- TX_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(TX_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
TX_dataset_agreement_count <- TX_dataset_agreement %>%
left_join(TX_indicators_mapped_per_id, by = "id")
head(TX_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 TX1218 93.8 2
## 2 TX1219 81.2 3
## 3 TX1220 81.2 3
## 4 TX1221 87.5 3
## 5 TX1222 93.8 1
## 6 TX1223 81.2 3
TX_dataset_agreement_count <- TX_dataset_agreement_count %>%
arrange(desc(percent_agreement), (indicators_mapped_per_id))
TX_dataset_agreement_count$id <- factor(TX_dataset_agreement_count$id, levels = unique(TX_dataset_agreement_count$id))
TX_dataset_agreement_count$indicators_mapped_per_id <-
factor(TX_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
TX_percentagreement_barplot <- ggplot(TX_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Texas - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
TX_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Texas.svg", plot = TX_percentagreement_barplot, width = 6.5, height = 6.5)
Printing the percent agreement by indicator for the chart in the findings section.
TX_percent_agreement <- TX_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
TX_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 88.4
## 2 EA_Postsecondary_readiness 81.7
## 3 K-12_Access_to_effective_teaching 100
## 4 K-12_Access_to_high-quality_academic_supports 98.5
## 5 K-12_Access_to_rigorous_coursework 94.8
## 6 K-12_Curricular_breadth 99.5
## 7 K-12_Engagement_in_schooling 90.2
## 8 K-12_Nonacademic_supports_for_student_success 99.5
## 9 K-12_Nonexclusionary_discipline_practices 98.4
## 10 K-12_Performance_in_coursework 88.1
## 11 K-12_Performance_on_tests 87.3
## 12 K-12_School_climate 98.4
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 87.0
## 14 Pre-K_Academic_readiness 99.6
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 96.6
## 16 Pre-K_Self-regulation_and_attention_skills 99.3
TX_ab_long <- TX_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
TX_contingency_table <- data.frame(TX_ab_long$rating_a, TX_ab_long$rating_b)
TX_kappa_result <- kappa2(TX_contingency_table, weight = "unweighted")
print(TX_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 13200
## Raters = 2
## Kappa = 0.707
##
## z = 96.6
## p-value = 0
TX_kappa_by_indicator <- TX_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
TX_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
TX_kappa_result <- kappa2(TX_contingency_table, weight = "unweighted")
TX_kappa_result$value
},
.groups = "drop"
)
print(TX_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.705
## 2 EA_Postsecondary_readiness 0.514
## 3 K-12_Access_to_effective_teaching 1
## 4 K-12_Access_to_high-quality_academic_supports 0.744
## 5 K-12_Access_to_rigorous_coursework 0.809
## 6 K-12_Curricular_breadth 0.748
## 7 K-12_Engagement_in_schooling 0.662
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices 0.734
## 10 K-12_Performance_in_coursework 0.372
## 11 K-12_Performance_on_tests 0.769
## 12 K-12_School_climate -0.00393
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.614
## 14 Pre-K_Academic_readiness 0.798
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs -0.00746
## 16 Pre-K_Self-regulation_and_attention_skills 0
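The near-zero kappas for K-12_School_climate and the Pre-K access indicator, despite raw agreement above 96% in the table above, are an instance of the kappa paradox: when one category dominates, expected chance agreement is already close to 1, so kappa's denominator shrinks. A minimal sketch with made-up ratings:

```r
library(irr)  # provides kappa2()

# 100 toy rating pairs: 98 agree on "not rated" (0); the two disagreements
# point in opposite directions (made-up data illustrating the paradox)
a <- c(rep(0, 98), 1, 0)
b <- c(rep(0, 98), 0, 1)
mean(a == b)                    # 0.98 raw agreement
kappa2(data.frame(a, b))$value  # about -0.01: chance agreement is already ~0.98
```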
Counting the number of datasets mapped to each indicator, for use in the balloon plot that follows.
TX_mapped_counts <- TX_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
TX_mapped_counts
## # A tibble: 16 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 217
## 2 EA_Postsecondary_readiness 254
## 3 K-12_Access_to_effective_teaching 20
## 4 K-12_Access_to_high-quality_academic_supports 30
## 5 K-12_Access_to_rigorous_coursework 148
## 6 K-12_Curricular_breadth 10
## 7 K-12_Engagement_in_schooling 171
## 8 K-12_Nonacademic_supports_for_student_success 4
## 9 K-12_Nonexclusionary_discipline_practices 28
## 10 K-12_Performance_in_coursework 126
## 11 K-12_Performance_on_tests 440
## 12 K-12_School_climate 13
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 230
## 14 Pre-K_Academic_readiness 9
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 28
## 16 Pre-K_Self-regulation_and_attention_skills 6
TX_indicator_kappa_count_merge <- TX_kappa_by_indicator %>%
left_join(TX_mapped_counts, by = "indicator")
head(TX_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.705 217
## 2 EA_Postsecondary_readiness 0.514 254
## 3 K-12_Access_to_effective_teaching 1 20
## 4 K-12_Access_to_high-quality_academic_supports 0.744 30
## 5 K-12_Access_to_rigorous_coursework 0.809 148
## 6 K-12_Curricular_breadth 0.748 10
clean_labels <- sub("^[^_]+_", "", TX_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
TX_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(TX_indicator_kappa_count_merge$kappa_stat)]
)
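The label-cleaning pipeline above can be traced on a single indicator name (a sketch using the same stringr calls):

```r
library(stringr)

x <- "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation"
x <- sub("^[^_]+_", "", x)  # drop the "K-12"/"EA"/"Pre-K" prefix and first "_"
x <- gsub("_", " ", x)      # remaining underscores become spaces
x <- str_to_sentence(x)     # sentence case for plot labels
str_wrap(x, width = 35)     # insert "\n" so long labels wrap on the axis
```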
TX_balloonplot <- ggballoonplot(TX_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Texas Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.707",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
) +
coord_flip()
TX_balloonplot
ggsave("20251113_IRR_kappa_balloonplot_Texas.svg", plot = TX_balloonplot, width = 6.5, height = 6.5)
TX_heatmap_long <- TX_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
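The `case_when()` mapping above can be wrapped in a small helper and sanity-checked on example rating pairs. This is a sketch: `disagreement_weight()` is a hypothetical function, not part of the supplement's pipeline.

```r
library(dplyr)

# hypothetical helper mirroring the case_when() mapping
# (0 = not rated, 1-2 = indirect rating, 3-4 = direct rating)
disagreement_weight <- function(a, b) {
  case_when(
    a == 0 & b == 0 ~ 0,                                        # not rated
    (a %in% 1:2 & b %in% 3:4) | (b %in% 1:2 & a %in% 3:4) ~ 2,  # indirect vs direct
    (a == 0 & b %in% 1:2) | (b == 0 & a %in% 1:2) ~ 2,          # unrated vs indirect
    (a == 0 & b %in% 3:4) | (b == 0 & a %in% 3:4) ~ 3,          # unrated vs direct
    a %in% 1:2 & b %in% 1:2 ~ 1,                                # both indirect
    a %in% 3:4 & b %in% 3:4 ~ 1,                                # both direct
    TRUE ~ NA_real_
  )
}

disagreement_weight(c(0, 1, 0, 0, 2, 4), c(0, 4, 2, 3, 1, 3))
# -> 0 2 2 3 1 1
```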
clean_labels <- sub("^[^_]+_", "", TX_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
TX_heatmap_long$indicator_cleaned <- wrapped_labels
TX_heatmap_long_selected <- TX_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(TX_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 TX1218 "On-time graduation" 0
## 2 TX1218 "Postsecondary readiness" 1
## 3 TX1218 "Access to effective teaching" 0
## 4 TX1218 "Access to high-quality academic\nsupports" 0
## 5 TX1218 "Access to rigorous coursework" 0
## 6 TX1218 "Curricular breadth" 0
Plotting dataset disagreement as a heatmap with marginal bar charts.
# 1. Set factor levels based on heatmap ordering
TX_indicator_levels <- levels(with(TX_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
TX_dataset_levels <- levels(with(TX_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
TX_agreement_heatmap <- ggplot(TX_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Texas Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
TX_legend <- cowplot::get_legend(TX_agreement_heatmap)
TX_legend_plot <- wrap_elements(TX_legend)
# 4. Remove legend from main plot
TX_main_plot <- TX_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
TX_top_plot <- TX_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = TX_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
TX_right_plot <- TX_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = TX_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
TX_final_plot <- TX_top_plot + TX_legend + TX_main_plot + TX_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
TX_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Texas.svg", plot = TX_final_plot, width = 8, height = 12)
Creating the PNG version of the figure, with text sizes scaled up for the larger output.
# 1. Set factor levels based on heatmap ordering
TX_indicator_levels <- levels(with(TX_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
TX_dataset_levels <- levels(with(TX_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
TX_agreement_heatmap <- ggplot(TX_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Texas Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
TX_legend <- cowplot::get_legend(TX_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
TX_main_plot <- TX_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
TX_top_plot <- TX_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = TX_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
TX_right_data <- TX_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = TX_dataset_levels))
TX_right_plot <- ggplot(TX_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
TX_final_plot3 <- TX_top_plot + TX_legend + TX_main_plot + TX_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
TX_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Texas.png", plot = TX_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, while 1 now represents an indirect rating and 2 represents a direct rating.
TX_ab_directindirect <- TX_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
TX_ratings_kappaweighted <- TX_ab_directindirect %>%
select(rating_a, rating_b)
TX_kappa_weighted <- kappa2(TX_ratings_kappaweighted, weight = "equal")
print(TX_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 13200
## Raters = 2
## Kappa = 0.741
##
## z = 89
## p-value = 0
TX_weighted_kappa_by_indicator <- TX_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(pick(rating_a, rating_b), weight = "equal")$value)
TX_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.815
## 2 EA_Postsecondary_readiness 0.535
## 3 K-12_Access_to_effective_teaching 1
## 4 K-12_Access_to_high-quality_academic_supports 0.743
## 5 K-12_Access_to_rigorous_coursework 0.820
## 6 K-12_Curricular_breadth 0.748
## 7 K-12_Engagement_in_schooling 0.679
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices 0.790
## 10 K-12_Performance_in_coursework 0.355
## 11 K-12_Performance_on_tests 0.793
## 12 K-12_School_climate -0.00789
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.614
## 14 Pre-K_Academic_readiness 0.798
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… -0.0107
## 16 Pre-K_Self-regulation_and_attention_skills 0
IRRstatistics_Texas <- TX_indicator_kappa_count_merge %>%
left_join(TX_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(TX_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Texas_ordered <- IRRstatistics_Texas %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Texas_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 230 | 87.03030 | 0.6144270 | 0.6144270 |
| High-Quality Pre-K Programs | 28 | 96.60606 | -0.0074578 | -0.0107436 |
| Effective Teaching | 20 | 100.00000 | 1.0000000 | 1.0000000 |
| Rigorous Coursework | 148 | 94.78788 | 0.8094207 | 0.8199913 |
| Curricular Breadth | 10 | 99.51515 | 0.7477064 | 0.7477064 |
| High-Quality Academic Supports | 30 | 98.54545 | 0.7441860 | 0.7429907 |
| School Climate | 13 | 98.42424 | -0.0039315 | -0.0078940 |
| Nonexclusionary Discipline Practices | 28 | 98.42424 | 0.7342534 | 0.7897920 |
| Nonacademic Supports for Student Success | 4 | 99.51515 | 0.0000000 | 0.0000000 |
| Academic Readiness | 9 | 99.63636 | 0.7982392 | 0.7982392 |
| Self-Regulation and Attention Skills | 6 | 99.27273 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 171 | 90.18182 | 0.6623073 | 0.6789389 |
| Performance in Coursework | 126 | 88.12121 | 0.3718974 | 0.3547443 |
| Performance on Tests | 440 | 87.27273 | 0.7692080 | 0.7928960 |
| On-Time Graduation | 217 | 88.36364 | 0.7046297 | 0.8150208 |
| Postsecondary Readiness | 254 | 81.69697 | 0.5142195 | 0.5345932 |
write.csv(IRRstatistics_Texas_ordered, "20251113_IRRstatistics_Texas.csv", row.names = FALSE)
Filtering for only North Carolina datasets.
NC_rater_ab <- rater_ab %>%
filter(str_starts(id, "NC"))
str(NC_rater_ab)
## tibble [995 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:995] "NC1218" "NC1218" "NC1218" "NC1218" ...
## $ rater : chr [1:995] "R562" "R917" "R562" "R917" ...
## $ validation_round: chr [1:995] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:995] "rcd_161.xlsx" "rcd_161.xlsx" "rcd_161.xlsx" "rcd_161.xlsx" ...
## $ indicator : chr [1:995] "EA_Postsecondary_readiness" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" "K-12_Access_to_rigorous_coursework" ...
## $ rating : num [1:995] 2 2 4 4 2 4 4 4 4 4 ...
NC_duplicates_check <- NC_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(NC_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable or not at the school level).
NC_rater_ab <- NC_rater_ab %>%
filter(!(id %in% c("NC1318", "NC1319")))
Making sure each id, indicator, and validation round combination is present.
NC_expanded_grid <- NC_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(NC_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 NC1218 rcd_161.xlsx EA_On-time_graduation A
## 2 NC1218 rcd_161.xlsx EA_On-time_graduation B
## 3 NC1218 rcd_161.xlsx EA_Postsecondary_readiness A
## 4 NC1218 rcd_161.xlsx EA_Postsecondary_readiness B
## 5 NC1218 rcd_161.xlsx K-12_Access_to_effective_teaching A
## 6 NC1218 rcd_161.xlsx K-12_Access_to_effective_teaching B
Making sure that id/indicator combinations that were not rated are represented as zero.
NC_ratings_with_zeros <- NC_expanded_grid %>%
left_join(NC_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
NC_ab_long <- NC_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(NC_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 NC1218 rcd_161.xlsx EA_On-time_graduation 0 0 1
## 2 NC1218 rcd_161.xlsx EA_Postsecondary_readiness 2 2 1
## 3 NC1218 rcd_161.xlsx K-12_Access_to_effective_teac… 0 0 1
## 4 NC1218 rcd_161.xlsx K-12_Access_to_high-quality_a… 0 0 1
## 5 NC1218 rcd_161.xlsx K-12_Access_to_rigorous_cours… 4 4 1
## 6 NC1218 rcd_161.xlsx K-12_Curricular_breadth 0 0 1
NC_dataset_agreement <- NC_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
NC_dataset_agreement <- NC_dataset_agreement %>%
ungroup()
n_distinct(NC_dataset_agreement$id)
## [1] 275
head(NC_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 NC1218 100
## 2 NC1219 93.8
## 3 NC1220 100
## 4 NC1221 100
## 5 NC1222 100
## 6 NC1223 93.8
write.csv(NC_dataset_agreement, "20251113_percent_agreement_by_id_North_Carolina.csv", row.names = FALSE)
NC_indicators_mapped_per_id <- NC_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(NC_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
NC_dataset_agreement_count <- NC_dataset_agreement %>%
left_join(NC_indicators_mapped_per_id, by = "id")
head(NC_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 NC1218 100 2
## 2 NC1219 93.8 1
## 3 NC1220 100 1
## 4 NC1221 100 1
## 5 NC1222 100 1
## 6 NC1223 93.8 2
NC_dataset_agreement_count <- NC_dataset_agreement_count %>%
arrange(desc(percent_agreement), (indicators_mapped_per_id))
NC_dataset_agreement_count$id <- factor(NC_dataset_agreement_count$id, levels = unique(NC_dataset_agreement_count$id))
NC_dataset_agreement_count$indicators_mapped_per_id <-
factor(NC_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
NC_percentagreement_barplot <- ggplot(NC_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "North Carolina - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
NC_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_North_Carolina.svg", plot = NC_percentagreement_barplot, width = 6.5, height = 6.5)
Printing the percent agreement by indicator used for the chart in the findings section.
NC_percent_agreement <- NC_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
NC_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 87.6
## 2 EA_Postsecondary_readiness 79.3
## 3 K-12_Access_to_effective_teaching 97.5
## 4 K-12_Access_to_high-quality_academic_supports 97.8
## 5 K-12_Access_to_rigorous_coursework 94.9
## 6 K-12_Curricular_breadth 98.2
## 7 K-12_Engagement_in_schooling 76
## 8 K-12_Nonacademic_supports_for_student_success 99.6
## 9 K-12_Nonexclusionary_discipline_practices 98.9
## 10 K-12_Performance_in_coursework 88.7
## 11 K-12_Performance_on_tests 89.1
## 12 K-12_School_climate 88.4
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 92.4
## 14 Pre-K_Academic_readiness 99.6
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 100
## 16 Pre-K_Self-regulation_and_attention_skills 99.6
NC_ab_long <- NC_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
NC_contingency_table <- data.frame(NC_ab_long$rating_a, NC_ab_long$rating_b)
NC_kappa_result <- kappa2(NC_contingency_table, weight = "unweighted")
print(NC_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 4400
## Raters = 2
## Kappa = 0.657
##
## z = 66.5
## p-value = 0
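To make the chance correction in Cohen's kappa concrete, here is a minimal sketch on invented ratings (illustrative only, not study data): when nearly all items fall into a single category, high raw agreement can still yield a much lower kappa, which is why the by-indicator kappas below can be low even where percent agreement is high.

```r
# Hypothetical example: 20 items, 90% raw agreement, but 18 of 20
# items sit in one category, so expected chance agreement is high
# (p_e = 0.9^2 + 0.1^2 = 0.82) and kappa = (0.90 - 0.82)/(1 - 0.82).
library(irr)
toy_ratings <- data.frame(
  rater_a = c(rep(0, 17), 1, 0, 1),
  rater_b = c(rep(0, 17), 0, 1, 1)
)
mean(toy_ratings$rater_a == toy_ratings$rater_b) * 100  # 90
kappa2(toy_ratings, weight = "unweighted")$value        # ~0.44
```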
NC_kappa_by_indicator <- NC_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
NC_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
NC_kappa_result <- kappa2(NC_contingency_table, weight = "unweighted")
NC_kappa_result$value
},
.groups = "drop"
)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 4: `indicator = "K-12_Access_to_high-quality_academic_supports"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(NC_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.759
## 2 EA_Postsecondary_readiness 0.644
## 3 K-12_Access_to_effective_teaching 0.525
## 4 K-12_Access_to_high-quality_academic_supports 0
## 5 K-12_Access_to_rigorous_coursework 0.639
## 6 K-12_Curricular_breadth 0.607
## 7 K-12_Engagement_in_schooling 0.408
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices 0.945
## 10 K-12_Performance_in_coursework 0.421
## 11 K-12_Performance_on_tests 0.708
## 12 K-12_School_climate 0.566
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.295
## 14 Pre-K_Academic_readiness 0.499
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 1
## 16 Pre-K_Self-regulation_and_attention_skills 0
Counting the number of datasets mapped to each indicator for use in the balloon plot below.
NC_mapped_counts <- NC_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
NC_mapped_counts
## # A tibble: 16 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 104
## 2 EA_Postsecondary_readiness 130
## 3 K-12_Access_to_effective_teaching 9
## 4 K-12_Access_to_high-quality_academic_supports 6
## 5 K-12_Access_to_rigorous_coursework 27
## 6 K-12_Curricular_breadth 8
## 7 K-12_Engagement_in_schooling 99
## 8 K-12_Nonacademic_supports_for_student_success 1
## 9 K-12_Nonexclusionary_discipline_practices 30
## 10 K-12_Performance_in_coursework 45
## 11 K-12_Performance_on_tests 70
## 12 K-12_School_climate 56
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 26
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 1
## 16 Pre-K_Self-regulation_and_attention_skills 1
NC_indicator_kappa_count_merge <- NC_kappa_by_indicator %>%
left_join(NC_mapped_counts, by = "indicator")
head(NC_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.759 104
## 2 EA_Postsecondary_readiness 0.644 130
## 3 K-12_Access_to_effective_teaching 0.525 9
## 4 K-12_Access_to_high-quality_academic_supports 0 6
## 5 K-12_Access_to_rigorous_coursework 0.639 27
## 6 K-12_Curricular_breadth 0.607 8
clean_labels <- sub("^[^_]+_", "", NC_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
NC_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(NC_indicator_kappa_count_merge$kappa_stat)]
)
NC_balloonplot <- ggballoonplot(NC_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "North Carolina Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.657",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6),
legend.title = element_text(size = 8),
legend.text = element_text(size = 6),
legend.key.size = unit(0.5, "cm")
) +
coord_flip()
NC_balloonplot
ggsave("20251113_IRR_kappa_balloonplot_North_Carolina.svg", plot = NC_balloonplot, width = 6.5, height = 6.5)
NC_heatmap_long <- NC_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
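As a sanity check on the disagreement coding above, the same logic (condensed, but equivalent for valid ratings 0-4) can be applied to a handful of hypothetical rating pairs; the `tribble` below is illustrative only.

```r
# Illustrative rating pairs (0 = not rated, 1-2 = indirect, 3-4 = direct)
# and the agreement category each should receive.
check_pairs <- tibble::tribble(
  ~rating_a, ~rating_b,
  0, 0,   # both not rated        -> 0 (Not Rated)
  2, 1,   # both indirect         -> 1 (Agreement)
  2, 4,   # indirect vs direct    -> 2 (Moderate Disagreement)
  0, 2,   # not rated vs indirect -> 2 (Moderate Disagreement)
  0, 4    # not rated vs direct   -> 3 (Serious Disagreement)
)
check_pairs <- dplyr::mutate(check_pairs, weight = dplyr::case_when(
  rating_a == 0 & rating_b == 0 ~ 0,
  (rating_a %in% 1:2 & rating_b %in% 3:4) |
    (rating_b %in% 1:2 & rating_a %in% 3:4) ~ 2,
  (rating_a == 0 & rating_b %in% 1:2) |
    (rating_b == 0 & rating_a %in% 1:2) ~ 2,
  (rating_a == 0 & rating_b %in% 3:4) |
    (rating_b == 0 & rating_a %in% 3:4) ~ 3,
  TRUE ~ 1  # both indirect or both direct
))
check_pairs$weight  # 0 1 2 2 3
```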
clean_labels <- sub("^[^_]+_", "", NC_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
NC_heatmap_long$indicator_cleaned <- wrapped_labels
NC_heatmap_long_selected <- NC_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(NC_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 NC1218 "On-time graduation" 0
## 2 NC1218 "Postsecondary readiness" 1
## 3 NC1218 "Access to effective teaching" 0
## 4 NC1218 "Access to high-quality academic\nsupports" 0
## 5 NC1218 "Access to rigorous coursework" 1
## 6 NC1218 "Curricular breadth" 0
Creating the dataset disagreement heatmap with marginal plots.
# 1. Set factor levels based on heatmap ordering
NC_indicator_levels <- levels(with(NC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
NC_dataset_levels <- levels(with(NC_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
NC_agreement_heatmap <- ggplot(NC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "North Carolina Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
NC_legend <- cowplot::get_legend(NC_agreement_heatmap)
NC_legend_plot <- wrap_elements(NC_legend)
# 4. Remove legend from main plot
NC_main_plot <- NC_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
NC_top_plot <- NC_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = NC_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
NC_right_plot <- NC_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = NC_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
NC_final_plot <- NC_top_plot + NC_legend + NC_main_plot + NC_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
NC_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_North_Carolina.svg", plot = NC_final_plot, width = 8, height = 12)
Creating the PNG version of the heatmap.
# 1. Set factor levels based on heatmap ordering
NC_indicator_levels <- levels(with(NC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
NC_dataset_levels <- levels(with(NC_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
NC_agreement_heatmap <- ggplot(NC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "North Carolina Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
NC_legend <- cowplot::get_legend(NC_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
NC_main_plot <- NC_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none", plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
NC_top_plot <- NC_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = NC_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
NC_right_data <- NC_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = NC_dataset_levels))
NC_right_plot <- ggplot(NC_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
NC_final_plot3 <- NC_top_plot + NC_legend + NC_main_plot + NC_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
NC_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_North_Carolina.png", plot = NC_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, 1 represents an indirect rating, and 2 represents a direct rating.
NC_ab_directindirect <- NC_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
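A quick check that this recoding collapses the 0-4 scale as intended; the helper function below is illustrative, not part of the pipeline.

```r
# Illustrative helper mirroring the recode above:
# 0 stays 0 (not rated), 1-2 become 1 (indirect), 3-4 become 2 (direct).
collapse_rating <- function(x) dplyr::case_when(
  x == 2 ~ 1,
  x %in% c(3, 4) ~ 2,
  TRUE ~ x
)
collapse_rating(c(0, 1, 2, 3, 4))  # 0 1 1 2 2
```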
NC_ratings_kappaweighted <- NC_ab_directindirect %>%
select(rating_a, rating_b)
NC_kappa_weighted <- kappa2(NC_ratings_kappaweighted, weight = "equal")
print(NC_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 4400
## Raters = 2
## Kappa = 0.744
##
## z = 54.5
## p-value = 0
NC_weighted_kappa_by_indicator <- NC_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a,
## rating_b)), weight = "equal")$value`.
## ℹ In group 4: `indicator = "K-12_Access_to_high-quality_academic_supports"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
NC_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.802
## 2 EA_Postsecondary_readiness 0.768
## 3 K-12_Access_to_effective_teaching 0.764
## 4 K-12_Access_to_high-quality_academic_supports 0
## 5 K-12_Access_to_rigorous_coursework 0.661
## 6 K-12_Curricular_breadth 0.659
## 7 K-12_Engagement_in_schooling 0.410
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices 0.991
## 10 K-12_Performance_in_coursework 0.423
## 11 K-12_Performance_on_tests 0.828
## 12 K-12_School_climate 0.723
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.315
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 1
## 16 Pre-K_Self-regulation_and_attention_skills 0
IRRstatistics_North_Carolina <- NC_indicator_kappa_count_merge %>%
left_join(NC_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(NC_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_North_Carolina_ordered <- IRRstatistics_North_Carolina %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_North_Carolina_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 26 | 92.36364 | 0.2951300 | 0.3152298 |
| High-Quality Pre-K Programs | 1 | 100.00000 | 1.0000000 | 1.0000000 |
| Effective Teaching | 9 | 97.45455 | 0.5248087 | 0.7636780 |
| Rigorous Coursework | 27 | 94.90909 | 0.6388368 | 0.6605776 |
| Curricular Breadth | 8 | 98.18182 | 0.6074793 | 0.6592317 |
| High-Quality Academic Supports | 6 | 97.81818 | 0.0000000 | 0.0000000 |
| School Climate | 56 | 88.36364 | 0.5664811 | 0.7228346 |
| Nonexclusionary Discipline Practices | 30 | 98.90909 | 0.9454473 | 0.9905774 |
| Nonacademic Supports for Student Success | 1 | 99.63636 | 0.0000000 | 0.0000000 |
| Academic Readiness | 1 | 99.63636 | 0.4990893 | 1.0000000 |
| Self-Regulation and Attention Skills | 1 | 99.63636 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 99 | 76.00000 | 0.4084287 | 0.4103579 |
| Performance in Coursework | 45 | 88.72727 | 0.4212492 | 0.4225115 |
| Performance on Tests | 70 | 89.09091 | 0.7077474 | 0.8277325 |
| On-Time Graduation | 104 | 87.63636 | 0.7585601 | 0.8017844 |
| Postsecondary Readiness | 130 | 79.27273 | 0.6435313 | 0.7679129 |
write.csv(IRRstatistics_North_Carolina_ordered, "20251113_IRRstatistics_North_Carolina.csv", row.names = FALSE)
Filtering for only Washington D.C. datasets.
DC_rater_ab <- rater_ab %>%
filter(str_starts(id, "DC"))
str(DC_rater_ab)
## tibble [929 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:929] "DC1218" "DC1218" "DC1218" "DC1218" ...
## $ rater : chr [1:929] "R562" "R217" "R562" "R217" ...
## $ validation_round: chr [1:929] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:929] "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" ...
## $ indicator : chr [1:929] "K-12_Engagement_in_schooling" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
## $ rating : num [1:929] 2 2 4 4 2 2 4 4 2 2 ...
DC_duplicates_check <- DC_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(DC_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable or not at the school level).
DC_rater_ab <- DC_rater_ab %>%
filter(!(id %in% c("DC1268", "DC1270", "DC1360", "DC1446", "DC1447", "DC1452", "DC1462", "DC1463", "DC1472", "DC1485", "DC1494", "DC1496", "DC1516", "DC1523", "DC1525", "DC1535", "DC1536", "DC1544", "DC1546", "DC1553")
))
str(DC_rater_ab)
## tibble [889 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:889] "DC1218" "DC1218" "DC1218" "DC1218" ...
## $ rater : chr [1:889] "R562" "R217" "R562" "R217" ...
## $ validation_round: chr [1:889] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:889] "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" ...
## $ indicator : chr [1:889] "K-12_Engagement_in_schooling" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
## $ rating : num [1:889] 2 2 4 4 2 2 4 4 2 2 ...
Making sure each id, indicator, and validation round combination is present.
DC_expanded_grid <- DC_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(DC_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_On-ti… A
## 2 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_On-ti… B
## 3 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_Posts… A
## 4 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_Posts… B
## 5 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… K-12_Acc… A
## 6 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… K-12_Acc… B
Making sure that id/indicator combinations that were not rated are represented as zero.
DC_ratings_with_zeros <- DC_expanded_grid %>%
left_join(DC_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
DC_ab_long <- DC_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(DC_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 DC1218 2023-24 School Level DCCAPE and … EA_On-ti… 0 0 1
## 2 DC1218 2023-24 School Level DCCAPE and … EA_Posts… 0 0 1
## 3 DC1218 2023-24 School Level DCCAPE and … K-12_Acc… 0 0 1
## 4 DC1218 2023-24 School Level DCCAPE and … K-12_Acc… 0 0 1
## 5 DC1218 2023-24 School Level DCCAPE and … K-12_Acc… 0 0 1
## 6 DC1218 2023-24 School Level DCCAPE and … K-12_Cur… 0 0 1
DC_dataset_agreement <- DC_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
DC_dataset_agreement <- DC_dataset_agreement %>%
ungroup()
n_distinct(DC_dataset_agreement$id)
## [1] 314
head(DC_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 DC1218 100
## 2 DC1219 100
## 3 DC1220 100
## 4 DC1221 100
## 5 DC1222 100
## 6 DC1223 100
write.csv(DC_dataset_agreement, "20251113_percent_agreement_by_id_Washington_DC.csv", row.names = FALSE)
DC_indicators_mapped_per_id <- DC_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(DC_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
DC_dataset_agreement_count <- DC_dataset_agreement %>%
left_join(DC_indicators_mapped_per_id, by = "id")
head(DC_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 DC1218 100 2
## 2 DC1219 100 2
## 3 DC1220 100 2
## 4 DC1221 100 2
## 5 DC1222 100 1
## 6 DC1223 100 1
DC_dataset_agreement_count <- DC_dataset_agreement_count %>%
arrange(desc(percent_agreement), (indicators_mapped_per_id))
DC_dataset_agreement_count$id <- factor(DC_dataset_agreement_count$id, levels = unique(DC_dataset_agreement_count$id))
DC_dataset_agreement_count$indicators_mapped_per_id <-
factor(DC_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
DC_percentagreement_barplot <- ggplot(DC_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Washington D.C. - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
DC_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Washington_DC.svg", plot = DC_percentagreement_barplot, width = 6.5, height = 6.5)
Printing the percent agreement by indicator used for the chart in the findings section.
DC_percent_agreement <- DC_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
DC_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 90.1
## 2 EA_Postsecondary_readiness 86.0
## 3 K-12_Access_to_effective_teaching 97.5
## 4 K-12_Access_to_high-quality_academic_supports 88.2
## 5 K-12_Access_to_rigorous_coursework 93.9
## 6 K-12_Curricular_breadth 96.8
## 7 K-12_Engagement_in_schooling 86.3
## 8 K-12_Nonacademic_supports_for_student_success 93.6
## 9 K-12_Nonexclusionary_discipline_practices 98.4
## 10 K-12_Performance_in_coursework 93.6
## 11 K-12_Performance_on_tests 93.0
## 12 K-12_School_climate 94.6
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 88.5
## 14 Pre-K_Academic_readiness 100
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 90.4
## 16 Pre-K_Self-regulation_and_attention_skills 99.7
DC_ab_long <- DC_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
DC_contingency_table <- data.frame(DC_ab_long$rating_a, DC_ab_long$rating_b)
DC_kappa_result <- kappa2(DC_contingency_table, weight = "unweighted")
print(DC_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 5024
## Raters = 2
## Kappa = 0.547
##
## z = 56.7
## p-value = 0
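The overall kappa of 0.547 follows Cohen's formula, which corrects observed agreement for the agreement expected by chance:

$$\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{c} p_{a,c}\, p_{b,c}$$

where $p_o$ is the proportion of the 5,024 dataset-indicator pairs on which the two raters agree exactly, and $p_e$ sums, over each rating category $c$, the product of the two raters' marginal proportions for that category.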
DC_kappa_by_indicator <- DC_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
DC_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
DC_kappa_result <- kappa2(DC_contingency_table, weight = "unweighted")
DC_kappa_result$value
},
.groups = "drop"
)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(DC_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.664
## 2 EA_Postsecondary_readiness 0.582
## 3 K-12_Access_to_effective_teaching 0.459
## 4 K-12_Access_to_high-quality_academic_supports 0.245
## 5 K-12_Access_to_rigorous_coursework 0.220
## 6 K-12_Curricular_breadth -0.00512
## 7 K-12_Engagement_in_schooling 0.490
## 8 K-12_Nonacademic_supports_for_student_success 0.466
## 9 K-12_Nonexclusionary_discipline_practices 0.731
## 10 K-12_Performance_in_coursework 0.181
## 11 K-12_Performance_on_tests 0.780
## 12 K-12_School_climate 0.435
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.555
## 14 Pre-K_Academic_readiness NaN
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 0.444
## 16 Pre-K_Self-regulation_and_attention_skills 0
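The NaN for Pre-K_Academic_readiness arises because both raters gave the same constant rating for every dataset on that indicator: with no variation, expected agreement equals observed agreement at 1, so the kappa denominator is zero. A minimal sketch with hypothetical constant ratings reproduces this:

```r
# Hypothetical data: both raters assign 0 to all ten datasets.
library(irr)
constant_ratings <- data.frame(rating_a = rep(0, 10), rating_b = rep(0, 10))
kappa2(constant_ratings, weight = "unweighted")$value
# NaN: observed and expected agreement are both 1, so (1 - 1) / (1 - 1) is undefined
```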
Counting the number of datasets mapped to each indicator to create a plot afterwards.
DC_mapped_counts <- DC_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
DC_mapped_counts
## # A tibble: 15 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 61
## 2 EA_Postsecondary_readiness 79
## 3 K-12_Access_to_effective_teaching 9
## 4 K-12_Access_to_high-quality_academic_supports 38
## 5 K-12_Access_to_rigorous_coursework 21
## 6 K-12_Curricular_breadth 10
## 7 K-12_Engagement_in_schooling 62
## 8 K-12_Nonacademic_supports_for_student_success 28
## 9 K-12_Nonexclusionary_discipline_practices 12
## 10 K-12_Performance_in_coursework 21
## 11 K-12_Performance_on_tests 60
## 12 K-12_School_climate 23
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 61
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 37
## 15 Pre-K_Self-regulation_and_attention_skills 1
DC_indicator_kappa_count_merge <- DC_kappa_by_indicator %>%
left_join(DC_mapped_counts, by = "indicator")
head(DC_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.664 61
## 2 EA_Postsecondary_readiness 0.582 79
## 3 K-12_Access_to_effective_teaching 0.459 9
## 4 K-12_Access_to_high-quality_academic_supports 0.245 38
## 5 K-12_Access_to_rigorous_coursework 0.220 21
## 6 K-12_Curricular_breadth -0.00512 10
clean_labels <- sub("^[^_]+_", "", DC_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
DC_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(DC_indicator_kappa_count_merge$kappa_stat)]
)
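As a quick check of the label-cleaning pipeline above, applied to a single raw indicator name:

```r
x <- "K-12_Access_to_high-quality_academic_supports"
x <- sub("^[^_]+_", "", x)        # drop the "K-12_" prefix
x <- gsub("_", " ", x)            # underscores to spaces
x <- stringr::str_to_sentence(x)  # sentence case
stringr::str_wrap(x, width = 35)
# "Access to high-quality academic\nsupports"
```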
DC_balloonplot <- ggballoonplot(DC_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Washington D.C. Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.547",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
) +
coord_flip()
DC_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
ggsave("20251113_IRR_kappa_balloonplot_Washington_DC.svg", plot = DC_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
DC_heatmap_long <- DC_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
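The case_when above can be summarized compactly: collapse each rating to 0 = unrated, 1 = indirect (ratings 1-2), 2 = direct (ratings 3-4), then score the pair by how far apart the collapsed categories sit. A sketch with hypothetical rating pairs:

```r
# Collapse a 0-4 rating to 0 = unrated, 1 = indirect, 2 = direct
collapse <- function(r) ifelse(r == 0, 0, ifelse(r <= 2, 1, 2))
pair_weight <- function(a, b) {
  ca <- collapse(a); cb <- collapse(b)
  ifelse(ca == 0 & cb == 0, 0,         # both unrated: not rated
  ifelse(ca == cb, 1,                  # same category: agreement
  ifelse(abs(ca - cb) == 2, 3, 2)))    # unrated vs direct: serious; otherwise moderate
}
pair_weight(c(0, 1, 3, 1, 0, 0), c(0, 2, 4, 4, 2, 3))
# 0 1 1 2 2 3
```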
clean_labels <- sub("^[^_]+_", "", DC_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
DC_heatmap_long$indicator_cleaned <- wrapped_labels
DC_heatmap_long_selected <- DC_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(DC_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 DC1218 "On-time graduation" 0
## 2 DC1218 "Postsecondary readiness" 0
## 3 DC1218 "Access to effective teaching" 0
## 4 DC1218 "Access to high-quality academic\nsupports" 0
## 5 DC1218 "Access to rigorous coursework" 0
## 6 DC1218 "Curricular breadth" 0
Dataset disagreement with marginal plots.
# 1. Set factor levels based on heatmap ordering
DC_indicator_levels <- levels(with(DC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
DC_dataset_levels <- levels(with(DC_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
DC_agreement_heatmap <- ggplot(DC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Washington D.C. Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
DC_legend <- cowplot::get_legend(DC_agreement_heatmap)
DC_legend_plot <- wrap_elements(DC_legend)
# 4. Remove legend from main plot
DC_main_plot <- DC_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
DC_top_plot <- DC_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = DC_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
DC_right_plot <- DC_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = DC_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
DC_final_plot <- DC_top_plot + DC_legend + DC_main_plot + DC_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
DC_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Washington_DC.svg", plot = DC_final_plot, width = 8, height = 12)
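The ordering of both heatmap axes relies on reorder(), which sorts factor levels by the mean of its second argument within each level; negating weight places the datasets and indicators with the most disagreement first. A minimal sketch with hypothetical values:

```r
# reorder() sorts levels ascending by the per-level mean of the second argument
d <- data.frame(id = c("a", "a", "b", "b"), weight = c(0, 0, 3, 1))
levels(with(d, reorder(id, -weight)))
# "b" "a": b has the higher mean weight, so it sorts first under -weight
```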
Creating the png.
# 1. Set factor levels based on heatmap ordering
DC_indicator_levels <- levels(with(DC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
DC_dataset_levels <- levels(with(DC_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
DC_agreement_heatmap <- ggplot(DC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Washington D.C. Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
DC_legend <- cowplot::get_legend(DC_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
DC_main_plot <- DC_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
DC_top_plot <- DC_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = DC_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
DC_right_data <- DC_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = DC_dataset_levels))
DC_right_plot <- ggplot(DC_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
DC_final_plot3 <- DC_top_plot + DC_legend + DC_main_plot + DC_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
DC_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Washington_DC.png", plot = DC_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. The ratings are first recoded so that zero still represents not rated, while 1 now represents an indirect rating and 2 a direct rating.
DC_ab_directindirect <- DC_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
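Applied to each possible rating, the recode above maps 0 → 0, 1 → 1, 2 → 1, 3 → 2, and 4 → 2 (the `TRUE ~ .` branch passes 0 and 1 through unchanged):

```r
# Each original rating and its collapsed direct/indirect value
original <- as.numeric(0:4)
dplyr::case_when(
  original == 2 ~ 1,
  original %in% c(3, 4) ~ 2,
  TRUE ~ original
)
# 0 1 1 2 2
```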
DC_ratings_kappaweighted <- DC_ab_directindirect %>%
select(rating_a, rating_b)
DC_kappa_weighted <- kappa2(DC_ratings_kappaweighted, weight = "equal")
print(DC_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 5024
## Raters = 2
## Kappa = 0.638
##
## z = 48.9
## p-value = 0
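With `weight = "equal"`, kappa2 applies linear disagreement weights, so a 0-versus-1 or 1-versus-2 disagreement on the collapsed scale is penalized less than a 0-versus-2 disagreement; the recoding and weighting together yield a higher kappa (0.638) than the unweighted value on the full scale (0.547). For $k$ ordered categories the linearly weighted kappa is

$$\kappa_w = 1 - \frac{\sum_{i,j} w_{ij}\, p_{o,ij}}{\sum_{i,j} w_{ij}\, p_{e,ij}}, \qquad w_{ij} = \frac{|i - j|}{k - 1}$$

where $p_{o,ij}$ and $p_{e,ij}$ are the observed and chance-expected proportions of rating pairs $(i, j)$.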
DC_weighted_kappa_by_indicator <- DC_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a,
## rating_b)), weight = "equal")$value`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
DC_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.787
## 2 EA_Postsecondary_readiness 0.708
## 3 K-12_Access_to_effective_teaching 0.795
## 4 K-12_Access_to_high-quality_academic_supports 0.475
## 5 K-12_Access_to_rigorous_coursework 0.239
## 6 K-12_Curricular_breadth -0.00569
## 7 K-12_Engagement_in_schooling 0.552
## 8 K-12_Nonacademic_supports_for_student_success 0.574
## 9 K-12_Nonexclusionary_discipline_practices 0.729
## 10 K-12_Performance_in_coursework 0.202
## 11 K-12_Performance_on_tests 0.852
## 12 K-12_School_climate 0.444
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.602
## 14 Pre-K_Academic_readiness NA
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 0.475
## 16 Pre-K_Self-regulation_and_attention_skills 0
IRRstatistics_Washington_DC <- DC_indicator_kappa_count_merge %>%
left_join(DC_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(DC_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Washington_DC_ordered <- IRRstatistics_Washington_DC %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Washington_DC_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 61 | 88.53503 | 0.5546450 | 0.6016207 |
| High-Quality Pre-K Programs | 37 | 90.44586 | 0.4438541 | 0.4750064 |
| Effective Teaching | 9 | 97.45223 | 0.4589705 | 0.7953064 |
| Rigorous Coursework | 21 | 93.94904 | 0.2197227 | 0.2393844 |
| Curricular Breadth | 10 | 96.81529 | -0.0051216 | -0.0056940 |
| High-Quality Academic Supports | 38 | 88.21656 | 0.2451923 | 0.4748088 |
| School Climate | 23 | 94.58599 | 0.4351323 | 0.4436913 |
| Nonexclusionary Discipline Practices | 12 | 98.40764 | 0.7308879 | 0.7288428 |
| Nonacademic Supports for Student Success | 28 | 93.63057 | 0.4655774 | 0.5742266 |
| Academic Readiness | NA | 100.00000 | NA | NA |
| Self-Regulation and Attention Skills | 1 | 99.68153 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 62 | 86.30573 | 0.4896625 | 0.5515271 |
| Performance in Coursework | 21 | 93.63057 | 0.1811188 | 0.2022358 |
| Performance on Tests | 60 | 92.99363 | 0.7795507 | 0.8524543 |
| On-Time Graduation | 61 | 90.12739 | 0.6636489 | 0.7872374 |
| Postsecondary Readiness | 79 | 85.98726 | 0.5821437 | 0.7083194 |
write.csv(IRRstatistics_Washington_DC_ordered, "20251113_IRRstatistics_Washington_DC.csv", row.names = FALSE)
Filtering for only Kentucky datasets.
KY_rater_ab <- rater_ab %>%
filter(str_starts(id, "KY"))
str(KY_rater_ab)
## tibble [1,784 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1784] "KY1218" "KY1218" "KY1219" "KY1219" ...
## $ rater : chr [1:1784] "R562" "R625" "R562" "R625" ...
## $ validation_round: chr [1:1784] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:1784] NA NA "english_learners_attainment_2021.csv" "english_learners_attainment_2021.csv" ...
## $ indicator : chr [1:1784] "none" NA "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
## $ rating : num [1:1784] 0 0 1 1 4 4 4 4 4 4 ...
KY_duplicates_check <- KY_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(KY_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).
KY_rater_ab <- KY_rater_ab %>%
filter(!(id %in% c("KY1772", "KY1773")
))
str(KY_rater_ab)
## tibble [1,780 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1780] "KY1218" "KY1218" "KY1219" "KY1219" ...
## $ rater : chr [1:1780] "R562" "R625" "R562" "R625" ...
## $ validation_round: chr [1:1780] "A" "B" "A" "B" ...
## $ dataset_name : chr [1:1780] NA NA "english_learners_attainment_2021.csv" "english_learners_attainment_2021.csv" ...
## $ indicator : chr [1:1780] "none" NA "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
## $ rating : num [1:1780] 0 0 1 1 4 4 4 4 4 4 ...
Making sure each id, indicator, validator combination is present.
KY_expanded_grid <- KY_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(KY_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 KY1218 <NA> EA_On-time_graduation A
## 2 KY1218 <NA> EA_On-time_graduation B
## 3 KY1218 <NA> EA_Postsecondary_readiness A
## 4 KY1218 <NA> EA_Postsecondary_readiness B
## 5 KY1218 <NA> K-12_Access_to_effective_teaching A
## 6 KY1218 <NA> K-12_Access_to_effective_teaching B
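tidyr::crossing() builds the full Cartesian product of its inputs, so every dataset receives a row for each of the 16 indicators in both validation rounds. A minimal sketch with hypothetical values:

```r
library(tidyr)
crossing(
  data.frame(id = c("KY1", "KY2")),
  indicator = c("A", "B"),
  validation_round = c("A", "B")
)
# 2 ids x 2 indicators x 2 rounds = 8 rows
```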
Making sure that id/indicator combinations that were not rated are represented as zero.
KY_ratings_with_zeros <- KY_expanded_grid %>%
left_join(KY_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
KY_ab_long <- KY_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(KY_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 KY1218 <NA> EA_On-time_graduation 0 0 1
## 2 KY1218 <NA> EA_Postsecondary_readiness 0 0 1
## 3 KY1218 <NA> K-12_Access_to_effective_teac… 0 0 1
## 4 KY1218 <NA> K-12_Access_to_high-quality_a… 0 0 1
## 5 KY1218 <NA> K-12_Access_to_rigorous_cours… 0 0 1
## 6 KY1218 <NA> K-12_Curricular_breadth 0 0 1
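pivot_wider() turns the pair of rating rows per (id, indicator) into side-by-side rating columns, one per validation round. A minimal sketch with hypothetical values:

```r
library(tidyr)
long <- data.frame(id = "KY1", indicator = "X",
                   validation_round = c("A", "B"), rating = c(4, 3))
pivot_wider(long, names_from = validation_round,
            values_from = rating, names_prefix = "rating_")
# one row: id = KY1, indicator = X, rating_A = 4, rating_B = 3
```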
KY_dataset_agreement <- KY_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
KY_dataset_agreement <- KY_dataset_agreement %>%
ungroup()
n_distinct(KY_dataset_agreement$id)
## [1] 576
head(KY_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 KY1218 100
## 2 KY1219 100
## 3 KY1220 81.2
## 4 KY1221 100
## 5 KY1222 87.5
## 6 KY1223 87.5
write.csv(KY_dataset_agreement, "20251113_percent_agreement_by_id_Kentucky.csv", row.names = FALSE)
KY_indicators_mapped_per_id <- KY_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(KY_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
KY_dataset_agreement_count <- KY_dataset_agreement %>%
left_join(KY_indicators_mapped_per_id, by = "id")
head(KY_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 KY1218 100 0
## 2 KY1219 100 1
## 3 KY1220 81.2 4
## 4 KY1221 100 1
## 5 KY1222 87.5 2
## 6 KY1223 87.5 2
KY_dataset_agreement_count <- KY_dataset_agreement_count %>%
arrange(desc(percent_agreement), indicators_mapped_per_id)
KY_dataset_agreement_count$id <- factor(KY_dataset_agreement_count$id, levels = unique(KY_dataset_agreement_count$id))
KY_dataset_agreement_count$indicators_mapped_per_id <-
factor(KY_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
KY_percentagreement_barplot <- ggplot(KY_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Kentucky - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
KY_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Kentucky.svg", plot = KY_percentagreement_barplot, width = 6.5, height = 6.5)
Printing results for chart in findings section.
KY_percent_agreement <- KY_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
KY_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 92.0
## 2 EA_Postsecondary_readiness 88.2
## 3 K-12_Access_to_effective_teaching 95.1
## 4 K-12_Access_to_high-quality_academic_supports 97.6
## 5 K-12_Access_to_rigorous_coursework 95.7
## 6 K-12_Curricular_breadth 93.1
## 7 K-12_Engagement_in_schooling 93.9
## 8 K-12_Nonacademic_supports_for_student_success 97.2
## 9 K-12_Nonexclusionary_discipline_practices 96.5
## 10 K-12_Performance_in_coursework 91.1
## 11 K-12_Performance_on_tests 91.8
## 12 K-12_School_climate 92.9
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 89.9
## 14 Pre-K_Academic_readiness 99.0
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 96.4
## 16 Pre-K_Self-regulation_and_attention_skills 100
KY_ab_long <- KY_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
KY_contingency_table <- data.frame(KY_ab_long$rating_a, KY_ab_long$rating_b)
KY_kappa_result <- kappa2(KY_contingency_table, weight = "unweighted")
print(KY_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 9216
## Raters = 2
## Kappa = 0.682
##
## z = 98
## p-value = 0
KY_kappa_by_indicator <- KY_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
KY_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
KY_kappa_result <- kappa2(KY_contingency_table, weight = "unweighted")
KY_kappa_result$value
},
.groups = "drop"
)
print(KY_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.700
## 2 EA_Postsecondary_readiness 0.693
## 3 K-12_Access_to_effective_teaching 0.744
## 4 K-12_Access_to_high-quality_academic_supports 0.688
## 5 K-12_Access_to_rigorous_coursework 0.636
## 6 K-12_Curricular_breadth 0.498
## 7 K-12_Engagement_in_schooling 0.679
## 8 K-12_Nonacademic_supports_for_student_success 0.489
## 9 K-12_Nonexclusionary_discipline_practices 0.363
## 10 K-12_Performance_in_coursework 0.545
## 11 K-12_Performance_on_tests 0.789
## 12 K-12_School_climate 0.700
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.620
## 14 Pre-K_Academic_readiness 0.745
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 0.521
## 16 Pre-K_Self-regulation_and_attention_skills 1
Counting the number of datasets mapped to each indicator to create a plot afterwards.
KY_mapped_counts <- KY_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
KY_mapped_counts
## # A tibble: 16 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 97
## 2 EA_Postsecondary_readiness 159
## 3 K-12_Access_to_effective_teaching 67
## 4 K-12_Access_to_high-quality_academic_supports 30
## 5 K-12_Access_to_rigorous_coursework 47
## 6 K-12_Curricular_breadth 56
## 7 K-12_Engagement_in_schooling 72
## 8 K-12_Nonacademic_supports_for_student_success 22
## 9 K-12_Nonexclusionary_discipline_practices 26
## 10 K-12_Performance_in_coursework 82
## 11 K-12_Performance_on_tests 144
## 12 K-12_School_climate 92
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 106
## 14 Pre-K_Academic_readiness 15
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 33
## 16 Pre-K_Self-regulation_and_attention_skills 11
KY_indicator_kappa_count_merge <- KY_kappa_by_indicator %>%
left_join(KY_mapped_counts, by = "indicator")
head(KY_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.700 97
## 2 EA_Postsecondary_readiness 0.693 159
## 3 K-12_Access_to_effective_teaching 0.744 67
## 4 K-12_Access_to_high-quality_academic_supports 0.688 30
## 5 K-12_Access_to_rigorous_coursework 0.636 47
## 6 K-12_Curricular_breadth 0.498 56
clean_labels <- sub("^[^_]+_", "", KY_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
KY_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(KY_indicator_kappa_count_merge$kappa_stat)]
)
KY_balloonplot <- ggballoonplot(KY_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Kentucky Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.682",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
) +
coord_flip()
KY_balloonplot
ggsave("20251113_IRR_kappa_balloonplot_Kentucky.svg", plot = KY_balloonplot, width = 6.5, height = 6.5)
KY_heatmap_long <- KY_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
clean_labels <- sub("^[^_]+_", "", KY_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
KY_heatmap_long$indicator_cleaned <- wrapped_labels
KY_heatmap_long_selected <- KY_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(KY_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 KY1218 "On-time graduation" 0
## 2 KY1218 "Postsecondary readiness" 0
## 3 KY1218 "Access to effective teaching" 0
## 4 KY1218 "Access to high-quality academic\nsupports" 0
## 5 KY1218 "Access to rigorous coursework" 0
## 6 KY1218 "Curricular breadth" 0
Dataset disagreement with marginal plots.
# 1. Set factor levels based on heatmap ordering
KY_indicator_levels <- levels(with(KY_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
KY_dataset_levels <- levels(with(KY_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
KY_agreement_heatmap <- ggplot(KY_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Kentucky Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
KY_legend <- cowplot::get_legend(KY_agreement_heatmap)
KY_legend_plot <- wrap_elements(KY_legend)
# 4. Remove legend from main plot
KY_main_plot <- KY_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
KY_top_plot <- KY_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = KY_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
KY_right_plot <- KY_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = KY_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
KY_final_plot <- KY_top_plot + KY_legend + KY_main_plot + KY_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
KY_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Kentucky.svg", plot = KY_final_plot, width = 8, height = 12)
Creating the PNG version of the figure.
# 1. Set factor levels based on heatmap ordering
KY_indicator_levels <- levels(with(KY_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
KY_dataset_levels <- levels(with(KY_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
KY_agreement_heatmap <- ggplot(KY_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Kentucky Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
KY_legend <- cowplot::get_legend(KY_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
KY_main_plot <- KY_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
KY_top_plot <- KY_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = KY_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
KY_right_data <- KY_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = KY_dataset_levels))
KY_right_plot <- ggplot(KY_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
KY_final_plot3 <- KY_top_plot + KY_legend + KY_main_plot + KY_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
KY_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Kentucky.png", plot = KY_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, 1 represents an indirect rating (originally 1 or 2), and 2 represents a direct rating (originally 3 or 4).
KY_ab_directindirect <- KY_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
KY_ratings_kappaweighted <- KY_ab_directindirect %>%
select(rating_a, rating_b)
KY_kappa_weighted <- kappa2(KY_ratings_kappaweighted, weight = "equal")
print(KY_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 9216
## Raters = 2
## Kappa = 0.738
##
## z = 77.6
## p-value = 0
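As a reference for what `kappa2(weight = "equal")` reports, linear-weighted kappa can also be computed from first principles. The sketch below uses toy vectors (illustrative, not study data) and assumes the categories are equally spaced integers, as with the 0/1/2 recoding used here:

```r
# Linear-weighted Cohen's kappa from first principles:
# kappa_w = 1 - D_obs / D_exp, where disagreements are weighted by the
# normalized category distance |i - j| / (k - 1).
weighted_kappa_linear <- function(a, b) {
  lev <- sort(union(a, b))
  k <- length(lev)
  d_obs <- mean(abs(a - b)) / (k - 1)                 # observed weighted disagreement
  pa <- table(factor(a, levels = lev)) / length(a)    # marginal distribution, rater A
  pb <- table(factor(b, levels = lev)) / length(b)    # marginal distribution, rater B
  d_exp <- sum(outer(as.numeric(pa), as.numeric(pb)) *
               abs(outer(lev, lev, "-")) / (k - 1))   # chance-expected disagreement
  1 - d_obs / d_exp
}
weighted_kappa_linear(c(0, 0, 1, 1, 2, 2),
                      c(0, 0, 1, 2, 2, 2))            # 14/17, about 0.824
```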
KY_weighted_kappa_by_indicator <- KY_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(as.matrix(pick(rating_a, rating_b)), weight = "equal")$value)
KY_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.771
## 2 EA_Postsecondary_readiness 0.756
## 3 K-12_Access_to_effective_teaching 0.893
## 4 K-12_Access_to_high-quality_academic_supports 0.586
## 5 K-12_Access_to_rigorous_coursework 0.645
## 6 K-12_Curricular_breadth 0.603
## 7 K-12_Engagement_in_schooling 0.799
## 8 K-12_Nonacademic_supports_for_student_success 0.522
## 9 K-12_Nonexclusionary_discipline_practices 0.388
## 10 K-12_Performance_in_coursework 0.617
## 11 K-12_Performance_on_tests 0.849
## 12 K-12_School_climate 0.734
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.646
## 14 Pre-K_Academic_readiness 0.745
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 0.402
## 16 Pre-K_Self-regulation_and_attention_skills 1
IRRstatistics_Kentucky <- KY_indicator_kappa_count_merge %>%
left_join(KY_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(KY_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Kentucky_ordered <- IRRstatistics_Kentucky %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Kentucky_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 106 | 89.93056 | 0.6197717 | 0.6459107 |
| High-Quality Pre-K Programs | 33 | 96.35417 | 0.5213675 | 0.4020385 |
| Effective Teaching | 67 | 95.13889 | 0.7437926 | 0.8931468 |
| Rigorous Coursework | 47 | 95.65972 | 0.6364187 | 0.6451993 |
| Curricular Breadth | 56 | 93.05556 | 0.4983234 | 0.6027915 |
| High-Quality Academic Supports | 30 | 97.56944 | 0.6879981 | 0.5862281 |
| School Climate | 92 | 92.88194 | 0.6998322 | 0.7344902 |
| Nonexclusionary Discipline Practices | 26 | 96.52778 | 0.3628319 | 0.3875236 |
| Nonacademic Supports for Student Success | 22 | 97.22222 | 0.4889086 | 0.5222749 |
| Academic Readiness | 15 | 98.95833 | 0.7448317 | 0.7448317 |
| Self-Regulation and Attention Skills | 11 | 100.00000 | 1.0000000 | 1.0000000 |
| Engagement in Schooling | 72 | 93.92361 | 0.6794097 | 0.7991544 |
| Performance in Coursework | 82 | 91.14583 | 0.5452913 | 0.6165257 |
| Performance on Tests | 144 | 91.84028 | 0.7894409 | 0.8491343 |
| On-Time Graduation | 97 | 92.01389 | 0.7000136 | 0.7705538 |
| Postsecondary Readiness | 159 | 88.19444 | 0.6933388 | 0.7559226 |
write.csv(IRRstatistics_Kentucky_ordered, "20251113_IRRstatistics_Kentucky.csv", row.names = FALSE)
Filtering for only Maryland datasets.
MD_rater_ab <- rater_ab %>%
filter(str_starts(id, "MD"))
str(MD_rater_ab)
## tibble [1,719 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1719] "MD1218" "MD1218" "MD1218" "MD1218" ...
## $ rater : chr [1:1719] "R514" "R625" "R514" "R514" ...
## $ validation_round: chr [1:1719] "A" "B" "A" "A" ...
## $ dataset_name : chr [1:1719] "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" ...
## $ indicator : chr [1:1719] "EA_On-time_graduation" "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" ...
## $ rating : num [1:1719] 3 3 3 3 3 3 3 3 3 3 ...
MD_duplicates_check <- MD_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(MD_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).
MD_rater_ab <- MD_rater_ab %>%
filter(!(id %in% c("MD1597", "MD1600", "MD1603", "MD1606", "MD1609", "MD1612", "MD1615", "MD1618", "MD1621", "MD1625", "MD1629", "MD1633", "MD1637", "MD1638", "MD1639", "MD1640", "MD1644", "MD1645", "MD1646", "MD1647", "MD1651", "MD1652", "MD1653", "MD1654", "MD1660", "MD1663", "MD1664", "MD1665", "MD1668", "MD1677", "MD1678", "MD1679", "MD1688", "MD1689", "MD1690", "MD1701", "MD1702", "MD1703", "MD1704", "MD1705", "MD1706")
))
str(MD_rater_ab)
## tibble [1,636 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1636] "MD1218" "MD1218" "MD1218" "MD1218" ...
## $ rater : chr [1:1636] "R514" "R625" "R514" "R514" ...
## $ validation_round: chr [1:1636] "A" "B" "A" "A" ...
## $ dataset_name : chr [1:1636] "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" ...
## $ indicator : chr [1:1636] "EA_On-time_graduation" "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" ...
## $ rating : num [1:1636] 3 3 3 3 3 3 3 3 3 3 ...
Making sure each id, indicator, and validation round combination is present.
MD_expanded_grid <- MD_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(MD_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 MD1218 2019 Accountability Detail.csv EA_On-time_graduation A
## 2 MD1218 2019 Accountability Detail.csv EA_On-time_graduation B
## 3 MD1218 2019 Accountability Detail.csv EA_Postsecondary_readi… A
## 4 MD1218 2019 Accountability Detail.csv EA_Postsecondary_readi… B
## 5 MD1218 2019 Accountability Detail.csv K-12_Access_to_effecti… A
## 6 MD1218 2019 Accountability Detail.csv K-12_Access_to_effecti… B
Making sure that id/indicator combinations that were not rated are represented as zero.
MD_ratings_with_zeros <- MD_expanded_grid %>%
left_join(MD_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
MD_ab_long <- MD_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(MD_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 MD1218 2019 Accountability Detail.csv EA_On-time_… 3 3 1
## 2 MD1218 2019 Accountability Detail.csv EA_Postseco… 3 0 0
## 3 MD1218 2019 Accountability Detail.csv K-12_Access… 0 0 1
## 4 MD1218 2019 Accountability Detail.csv K-12_Access… 0 0 1
## 5 MD1218 2019 Accountability Detail.csv K-12_Access… 3 0 0
## 6 MD1218 2019 Accountability Detail.csv K-12_Curric… 0 3 0
MD_dataset_agreement <- MD_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
MD_dataset_agreement <- MD_dataset_agreement %>%
ungroup()
n_distinct(MD_dataset_agreement$id)
## [1] 484
head(MD_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 MD1218 75
## 2 MD1219 100
## 3 MD1220 87.5
## 4 MD1221 100
## 5 MD1222 100
## 6 MD1223 100
write.csv(MD_dataset_agreement, "20251113_percent_agreement_by_id_Maryland.csv", row.names = FALSE)
MD_indicators_mapped_per_id <- MD_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(MD_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
MD_dataset_agreement_count <- MD_dataset_agreement %>%
left_join(MD_indicators_mapped_per_id, by = "id")
head(MD_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 MD1218 75 7
## 2 MD1219 100 2
## 3 MD1220 87.5 3
## 4 MD1221 100 3
## 5 MD1222 100 1
## 6 MD1223 100 1
MD_dataset_agreement_count <- MD_dataset_agreement_count %>%
arrange(desc(percent_agreement), indicators_mapped_per_id)
MD_dataset_agreement_count$id <- factor(MD_dataset_agreement_count$id, levels = unique(MD_dataset_agreement_count$id))
MD_dataset_agreement_count$indicators_mapped_per_id <-
factor(MD_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
MD_percentagreement_barplot <- ggplot(MD_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Maryland - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
MD_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Maryland.svg", plot = MD_percentagreement_barplot, width = 6.5, height = 6.5)
Printing results for the chart in the findings section.
MD_percent_agreement <- MD_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
MD_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 91.9
## 2 EA_Postsecondary_readiness 88.2
## 3 K-12_Access_to_effective_teaching 95.2
## 4 K-12_Access_to_high-quality_academic_supports 97.7
## 5 K-12_Access_to_rigorous_coursework 90.7
## 6 K-12_Curricular_breadth 91.1
## 7 K-12_Engagement_in_schooling 75.6
## 8 K-12_Nonacademic_supports_for_student_success 98.1
## 9 K-12_Nonexclusionary_discipline_practices 100
## 10 K-12_Performance_in_coursework 94.6
## 11 K-12_Performance_on_tests 81.4
## 12 K-12_School_climate 100
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 91.9
## 14 Pre-K_Academic_readiness 100
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 100
## 16 Pre-K_Self-regulation_and_attention_skills 99.4
MD_ab_long <- MD_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
MD_contingency_table <- data.frame(MD_ab_long$rating_a, MD_ab_long$rating_b)
MD_kappa_result <- kappa2(MD_contingency_table, weight = "unweighted")
print(MD_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 7744
## Raters = 2
## Kappa = 0.665
##
## z = 79.8
## p-value = 0
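The unweighted statistic follows the familiar (Po − Pe) / (1 − Pe) form, with chance agreement Pe taken from the two raters' marginal distributions. A by-hand check on toy vectors (illustrative, not study data):

```r
# Unweighted Cohen's kappa by hand: kappa = (Po - Pe) / (1 - Pe)
a <- c(0, 0, 1, 1, 2, 2)
b <- c(0, 0, 1, 2, 2, 2)
lev <- sort(union(a, b))
po <- mean(a == b)                                        # observed agreement, 5/6
pe <- sum((table(factor(a, levels = lev)) / length(a)) *
          (table(factor(b, levels = lev)) / length(b)))   # chance agreement, 1/3
(po - pe) / (1 - pe)                                      # 0.75
```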
MD_kappa_by_indicator <- MD_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
MD_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
MD_kappa_result <- kappa2(MD_contingency_table, weight = "unweighted")
MD_kappa_result$value
},
.groups = "drop"
)
print(MD_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.819
## 2 EA_Postsecondary_readiness 0.742
## 3 K-12_Access_to_effective_teaching 0.637
## 4 K-12_Access_to_high-quality_academic_supports -0.00339
## 5 K-12_Access_to_rigorous_coursework 0.113
## 6 K-12_Curricular_breadth 0.0913
## 7 K-12_Engagement_in_schooling 0.504
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices NaN
## 10 K-12_Performance_in_coursework 0.758
## 11 K-12_Performance_on_tests 0.574
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.562
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 1
## 16 Pre-K_Self-regulation_and_attention_skills 0
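Several of the values above illustrate kappa's sensitivity to the marginal distributions: when both raters assign a single category throughout, as happens here when an indicator is never mapped in a state, observed and chance agreement both equal 1 and kappa is undefined, which is why Nonexclusionary discipline practices shows NaN despite 100 percent agreement. A minimal sketch of that edge case (toy data, illustrative):

```r
library(irr)
# Both raters give the same single category for every subject: observed and
# chance agreement are both 1, so kappa = (Po - Pe) / (1 - Pe) is 0/0 = NaN.
constant_ratings <- data.frame(rater_a = rep(0, 10), rater_b = rep(0, 10))
kappa2(constant_ratings, weight = "unweighted")$value   # NaN
```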
Counting the number of datasets mapped to each indicator to create a plot afterwards.
MD_mapped_counts <- MD_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
MD_mapped_counts
## # A tibble: 15 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 145
## 2 EA_Postsecondary_readiness 173
## 3 K-12_Access_to_effective_teaching 33
## 4 K-12_Access_to_high-quality_academic_supports 11
## 5 K-12_Access_to_rigorous_coursework 45
## 6 K-12_Curricular_breadth 43
## 7 K-12_Engagement_in_schooling 207
## 8 K-12_Nonacademic_supports_for_student_success 9
## 9 K-12_Performance_in_coursework 68
## 10 K-12_Performance_on_tests 164
## 11 K-12_School_climate 8
## 12 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 67
## 13 Pre-K_Academic_readiness 3
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 23
## 15 Pre-K_Self-regulation_and_attention_skills 3
MD_indicator_kappa_count_merge <- MD_kappa_by_indicator %>%
left_join(MD_mapped_counts, by = "indicator")
head(MD_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.819 145
## 2 EA_Postsecondary_readiness 0.742 173
## 3 K-12_Access_to_effective_teaching 0.637 33
## 4 K-12_Access_to_high-quality_academic_supports -0.00339 11
## 5 K-12_Access_to_rigorous_coursework 0.113 45
## 6 K-12_Curricular_breadth 0.0913 43
clean_labels <- sub("^[^_]+_", "", MD_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
MD_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(MD_indicator_kappa_count_merge$kappa_stat)]
)
MD_balloonplot <- ggballoonplot(MD_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size = "mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Maryland Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.665",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6),
legend.title = element_text(size = 8),
legend.text = element_text(size = 6),
legend.key.size = unit(0.5, "cm")
) +
coord_flip()
MD_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
ggsave("20251113_IRR_kappa_balloonplot_Maryland.svg", plot = MD_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
MD_heatmap_long <- MD_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
clean_labels <- sub("^[^_]+_", "", MD_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
MD_heatmap_long$indicator_cleaned <- wrapped_labels
MD_heatmap_long_selected <- MD_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(MD_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 MD1218 "On-time graduation" 1
## 2 MD1218 "Postsecondary readiness" 3
## 3 MD1218 "Access to effective teaching" 0
## 4 MD1218 "Access to high-quality academic\nsupports" 0
## 5 MD1218 "Access to rigorous coursework" 3
## 6 MD1218 "Curricular breadth" 3
Dataset disagreement with marginal plots.
# 1. Set factor levels based on heatmap ordering
MD_indicator_levels <- levels(with(MD_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
MD_dataset_levels <- levels(with(MD_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
MD_agreement_heatmap <- ggplot(MD_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Maryland Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
MD_legend <- cowplot::get_legend(MD_agreement_heatmap)
MD_legend_plot <- wrap_elements(MD_legend)
# 4. Remove legend from main plot
MD_main_plot <- MD_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
MD_top_plot <- MD_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = MD_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
MD_right_plot <- MD_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = MD_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
MD_final_plot <- MD_top_plot + MD_legend + MD_main_plot + MD_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
MD_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Maryland.svg", plot = MD_final_plot, width = 8, height = 12)
Creating the PNG version of the figure.
# 1. Set factor levels based on heatmap ordering
MD_indicator_levels <- levels(with(MD_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
MD_dataset_levels <- levels(with(MD_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
MD_agreement_heatmap <- ggplot(MD_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Maryland Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
MD_legend <- cowplot::get_legend(MD_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
MD_main_plot <- MD_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
MD_top_plot <- MD_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = MD_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
MD_right_data <- MD_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = MD_dataset_levels))
MD_right_plot <- ggplot(MD_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
MD_final_plot3 <- MD_top_plot + MD_legend + MD_main_plot + MD_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
MD_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Maryland.png", plot = MD_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, 1 represents an indirect rating (originally 1 or 2), and 2 represents a direct rating (originally 3 or 4).
MD_ab_directindirect <- MD_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
MD_ratings_kappaweighted <- MD_ab_directindirect %>%
select(rating_a, rating_b)
MD_kappa_weighted <- kappa2(MD_ratings_kappaweighted, weight = "equal")
print(MD_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 7744
## Raters = 2
## Kappa = 0.719
##
## z = 70.4
## p-value = 0
MD_weighted_kappa_by_indicator <- MD_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(as.matrix(pick(rating_a, rating_b)), weight = "equal")$value)
MD_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.856
## 2 EA_Postsecondary_readiness 0.768
## 3 K-12_Access_to_effective_teaching 0.777
## 4 K-12_Access_to_high-quality_academic_supports -0.00373
## 5 K-12_Access_to_rigorous_coursework 0.123
## 6 K-12_Curricular_breadth 0.136
## 7 K-12_Engagement_in_schooling 0.680
## 8 K-12_Nonacademic_supports_for_student_success 0
## 9 K-12_Nonexclusionary_discipline_practices NA
## 10 K-12_Performance_in_coursework 0.734
## 11 K-12_Performance_on_tests 0.613
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.604
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 1
## 16 Pre-K_Self-regulation_and_attention_skills 0
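The NA and near-zero values above are properties of kappa itself rather than of rater behavior: kappa is undefined (0/0) when both raters use a single identical category, and it can sit near zero despite very high percent agreement when one category dominates (the so-called kappa paradox). A base-R sketch (the `unweighted_kappa` helper is illustrative, not the pipeline code):

```r
# Unweighted Cohen's kappa computed by hand.
unweighted_kappa <- function(a, b, levels) {
  p  <- table(factor(a, levels = levels), factor(b, levels = levels)) / length(a)
  po <- sum(diag(p))                       # observed agreement
  pe <- sum(rowSums(p) * colSums(p))       # chance agreement
  (po - pe) / (1 - pe)
}
# Both raters always say "not rated": po = pe = 1, so kappa is 0/0 (NaN),
# which appears as NA in the indicator table above.
unweighted_kappa(rep(0, 10), rep(0, 10), levels = 0)
# 98% raw agreement with a rare category: chance agreement is already
# ~0.98, so kappa lands near zero (about -0.01 here).
unweighted_kappa(c(rep(0, 98), 2, 0), c(rep(0, 98), 0, 2), levels = c(0, 2))
```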
IRRstatistics_Maryland <- MD_indicator_kappa_count_merge %>%
left_join(MD_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(MD_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Maryland_ordered <- IRRstatistics_Maryland %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Maryland_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 67 | 91.94215 | 0.5624478 | 0.6038152 |
| High-Quality Pre-K Programs | 23 | 100.00000 | 1.0000000 | 1.0000000 |
| Effective Teaching | 33 | 95.24793 | 0.6372760 | 0.7769943 |
| Rigorous Coursework | 45 | 90.70248 | 0.1133366 | 0.1230334 |
| Curricular Breadth | 43 | 91.11570 | 0.0912584 | 0.1357675 |
| High-Quality Academic Supports | 11 | 97.72727 | -0.0033924 | -0.0037329 |
| School Climate | 8 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonexclusionary Discipline Practices | NA | 100.00000 | NA | NA |
| Nonacademic Supports for Student Success | 9 | 98.14050 | 0.0000000 | 0.0000000 |
| Academic Readiness | 3 | 100.00000 | 1.0000000 | 1.0000000 |
| Self-Regulation and Attention Skills | 3 | 99.38017 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 207 | 75.61983 | 0.5042232 | 0.6798713 |
| Performance in Coursework | 68 | 94.62810 | 0.7580837 | 0.7344241 |
| Performance on Tests | 164 | 81.40496 | 0.5742310 | 0.6132684 |
| On-Time Graduation | 145 | 91.94215 | 0.8188362 | 0.8561597 |
| Postsecondary Readiness | 173 | 88.22314 | 0.7419608 | 0.7675692 |
write.csv(IRRstatistics_Maryland_ordered, "20251113_IRRstatistics_Maryland.csv", row.names = FALSE)
Filtering for only Oregon datasets.
OR_rater_ab <- rater_ab %>%
filter(str_starts(id, "OR"))
str(OR_rater_ab)
## tibble [2,068 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:2068] "OR1218" "OR1218" "OR1218" "OR1219" ...
## $ rater : chr [1:2068] "R625" "R625" "R514" "R625" ...
## $ validation_round: chr [1:2068] "A" "A" "B" "A" ...
## $ dataset_name : chr [1:2068] "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1718.xlsx" ...
## $ indicator : chr [1:2068] "K-12_Access_to_high-quality_academic_supports" "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Access_to_high-quality_academic_supports" ...
## $ rating : num [1:2068] 2 2 2 2 2 2 2 2 2 2 ...
OR_duplicates_check <- OR_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(OR_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to the NASEM framework (e.g., not downloadable, not at the school level).
OR_rater_ab <- OR_rater_ab %>%
filter(!(id %in% c("OR1450", "OR1451", "OR1452", "OR1453", "OR1454", "OR1463", "OR1464", "OR1465", "OR1466", "OR1467", "OR1468", "OR1469", "OR1470", "OR1471", "OR1472", "OR1473", "OR1474", "OR1475", "OR1476", "OR1477", "OR1478", "OR1479", "OR1480", "OR1481", "OR1482", "OR1483", "OR1484", "OR1485", "OR1486", "OR1487", "OR1488", "OR1489", "OR1490", "OR1491", "OR1492", "OR1493", "OR1494", "OR1495", "OR1496", "OR1497", "OR1498", "OR1499", "OR1500", "OR1501", "OR1502", "OR1503", "OR1504", "OR1505", "OR1506", "OR1507", "OR1508", "OR1509", "OR1510", "OR1511", "OR1512", "OR1513", "OR1514", "OR1515", "OR1516", "OR1517", "OR1518", "OR1519", "OR1520", "OR1521", "OR1522", "OR1523", "OR1524", "OR1525", "OR1526", "OR1527", "OR1528", "OR1529", "OR1530", "OR1531", "OR1532", "OR1533", "OR1534", "OR1535", "OR1536", "OR1537", "OR1538", "OR1539", "OR1540", "OR1541", "OR1542", "OR1543", "OR1544", "OR1545", "OR1546", "OR1547", "OR1548", "OR1549", "OR1551", "OR1552", "OR1553", "OR1554", "OR1558", "OR1576", "OR1577", "OR1580", "OR1597", "OR1598", "OR1599", "OR1601", "OR1606", "OR1609", "OR1612", "OR1614", "OR1617", "OR1766", "OR1767")
))
str(OR_rater_ab)
## tibble [1,833 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1833] "OR1218" "OR1218" "OR1218" "OR1219" ...
## $ rater : chr [1:1833] "R625" "R625" "R514" "R625" ...
## $ validation_round: chr [1:1833] "A" "A" "B" "A" ...
## $ dataset_name : chr [1:1833] "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1718.xlsx" ...
## $ indicator : chr [1:1833] "K-12_Access_to_high-quality_academic_supports" "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Access_to_high-quality_academic_supports" ...
## $ rating : num [1:1833] 2 2 2 2 2 2 2 2 2 2 ...
Making sure each id, indicator, and validation round combination is present.
OR_expanded_grid <- OR_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(OR_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduation A
## 2 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduation B
## 3 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_readiness A
## 4 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_readiness B
## 5 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effective_teach… A
## 6 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effective_teach… B
Making sure that id/indicator combinations that were not rated are represented as zero.
OR_ratings_with_zeros <- OR_expanded_grid %>%
left_join(OR_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
OR_ab_long <- OR_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(OR_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduati… 0 0 1
## 2 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_re… 0 0 1
## 3 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effe… 0 0 1
## 4 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_high… 2 0 0
## 5 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_rigo… 0 0 1
## 6 OR1218 OnTracktoELP_1617.xlsx K-12_Curricular_bre… 0 0 1
OR_dataset_agreement <- OR_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
OR_dataset_agreement <- OR_dataset_agreement %>%
ungroup()
n_distinct(OR_dataset_agreement$id)
## [1] 466
head(OR_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 OR1218 93.8
## 2 OR1219 93.8
## 3 OR1220 93.8
## 4 OR1221 93.8
## 5 OR1222 93.8
## 6 OR1223 100
write.csv(OR_dataset_agreement, "20251113_percent_agreement_by_id_Oregon.csv", row.names = FALSE)
OR_indicators_mapped_per_id <- OR_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(OR_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
OR_dataset_agreement_count <- OR_dataset_agreement %>%
left_join(OR_indicators_mapped_per_id, by = "id")
head(OR_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 OR1218 93.8 2
## 2 OR1219 93.8 2
## 3 OR1220 93.8 2
## 4 OR1221 93.8 2
## 5 OR1222 93.8 2
## 6 OR1223 100 1
OR_dataset_agreement_count <- OR_dataset_agreement_count %>%
arrange(desc(percent_agreement), indicators_mapped_per_id)
OR_dataset_agreement_count$id <- factor(OR_dataset_agreement_count$id, levels = unique(OR_dataset_agreement_count$id))
OR_dataset_agreement_count$indicators_mapped_per_id <-
factor(OR_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
OR_percentagreement_barplot <- ggplot(OR_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "Oregon - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
OR_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_Oregon.svg", plot = OR_percentagreement_barplot, width = 6.5, height = 6.5)
Printing results for the chart in the findings section.
OR_percent_agreement <- OR_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
OR_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 94.2
## 2 EA_Postsecondary_readiness 89.7
## 3 K-12_Access_to_effective_teaching 96.6
## 4 K-12_Access_to_high-quality_academic_supports 91.2
## 5 K-12_Access_to_rigorous_coursework 100
## 6 K-12_Curricular_breadth 100
## 7 K-12_Engagement_in_schooling 79.4
## 8 K-12_Nonacademic_supports_for_student_success 97.9
## 9 K-12_Nonexclusionary_discipline_practices 97.0
## 10 K-12_Performance_in_coursework 94.4
## 11 K-12_Performance_on_tests 95.9
## 12 K-12_School_climate 100
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 84.1
## 14 Pre-K_Academic_readiness 100
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 97.9
## 16 Pre-K_Self-regulation_and_attention_skills 100
OR_ab_long <- OR_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
OR_contingency_table <- data.frame(OR_ab_long$rating_a, OR_ab_long$rating_b)
OR_kappa_result <- kappa2(OR_contingency_table, weight = "unweighted")
print(OR_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 7456
## Raters = 2
## Kappa = 0.771
##
## z = 95.2
## p-value = 0
OR_kappa_by_indicator <- OR_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
OR_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
OR_kappa_result <- kappa2(OR_contingency_table, weight = "unweighted")
OR_kappa_result$value
},
.groups = "drop"
)
print(OR_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.856
## 2 EA_Postsecondary_readiness 0.661
## 3 K-12_Access_to_effective_teaching 0.824
## 4 K-12_Access_to_high-quality_academic_supports 0.174
## 5 K-12_Access_to_rigorous_coursework 1
## 6 K-12_Curricular_breadth 1
## 7 K-12_Engagement_in_schooling 0.664
## 8 K-12_Nonacademic_supports_for_student_success 0.405
## 9 K-12_Nonexclusionary_discipline_practices 0.487
## 10 K-12_Performance_in_coursework 0.644
## 11 K-12_Performance_on_tests 0.936
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.544
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 0
## 16 Pre-K_Self-regulation_and_attention_skills 1
Counting the number of datasets mapped to each indicator to create a plot afterwards.
OR_mapped_counts <- OR_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
OR_mapped_counts
## # A tibble: 16 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 119
## 2 EA_Postsecondary_readiness 92
## 3 K-12_Access_to_effective_teaching 54
## 4 K-12_Access_to_high-quality_academic_supports 46
## 5 K-12_Access_to_rigorous_coursework 7
## 6 K-12_Curricular_breadth 4
## 7 K-12_Engagement_in_schooling 230
## 8 K-12_Nonacademic_supports_for_student_success 13
## 9 K-12_Nonexclusionary_discipline_practices 21
## 10 K-12_Performance_in_coursework 52
## 11 K-12_Performance_on_tests 241
## 12 K-12_School_climate 3
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 121
## 14 Pre-K_Academic_readiness 13
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 10
## 16 Pre-K_Self-regulation_and_attention_skills 13
OR_indicator_kappa_count_merge <- OR_kappa_by_indicator %>%
left_join(OR_mapped_counts, by = "indicator")
head(OR_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.856 119
## 2 EA_Postsecondary_readiness 0.661 92
## 3 K-12_Access_to_effective_teaching 0.824 54
## 4 K-12_Access_to_high-quality_academic_supports 0.174 46
## 5 K-12_Access_to_rigorous_coursework 1 7
## 6 K-12_Curricular_breadth 1 4
clean_labels <- sub("^[^_]+_", "", OR_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
OR_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(OR_indicator_kappa_count_merge$kappa_stat)]
)
OR_balloonplot <- ggballoonplot(OR_indicator_kappa_count_merge,
x = "indicator_cleaned",
y = "kappa_stat", size = "mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "Oregon Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.771",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
) +
coord_flip()
OR_balloonplot
ggsave("20251113_IRR_kappa_balloonplot_Oregon.svg", plot = OR_balloonplot, width = 6.5, height = 6.5)
OR_heatmap_long <- OR_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
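The case_when above can be spot-checked by collapsing each rating into not-rated/indirect/direct groups and comparing the two raters' groups. A base-R sketch that mirrors the same logic (the `classify_pair` helper is illustrative, not part of the pipeline):

```r
# Classify a pair of ratings: 0 = not rated by either, 1 = agreement,
# 2 = moderate disagreement, 3 = serious disagreement.
classify_pair <- function(a, b) {
  # Collapse each rating: 0 = not rated, 1 = indirect (1-2), 2 = direct (3-4)
  grp <- function(x) ifelse(x == 0, 0, ifelse(x %in% 1:2, 1, 2))
  ga <- grp(a); gb <- grp(b)
  ifelse(ga == 0 & gb == 0, 0,                 # neither rated
  ifelse(ga == gb, 1,                          # same group: agreement
  ifelse(ga == 2 | gb == 2,                    # a direct rating is involved...
         ifelse(ga == 0 | gb == 0, 3, 2),      # ...vs unrated = serious, vs indirect = moderate
         2)))                                  # unrated vs indirect = moderate
}
classify_pair(c(0, 2, 0, 1, 4), c(0, 3, 4, 2, 3))  # 0 2 3 1 1
```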
clean_labels <- sub("^[^_]+_", "", OR_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
OR_heatmap_long$indicator_cleaned <- wrapped_labels
OR_heatmap_long_selected <- OR_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(OR_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 OR1218 "On-time graduation" 0
## 2 OR1218 "Postsecondary readiness" 0
## 3 OR1218 "Access to effective teaching" 0
## 4 OR1218 "Access to high-quality academic\nsupports" 2
## 5 OR1218 "Access to rigorous coursework" 0
## 6 OR1218 "Curricular breadth" 0
Dataset disagreement with marginal plots.
# 1. Set factor levels based on heatmap ordering
OR_indicator_levels <- levels(with(OR_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OR_dataset_levels <- levels(with(OR_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
OR_agreement_heatmap <- ggplot(OR_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Oregon Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
OR_legend <- cowplot::get_legend(OR_agreement_heatmap)
OR_legend_plot <- wrap_elements(OR_legend)
# 4. Remove legend from main plot
OR_main_plot <- OR_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
OR_top_plot <- OR_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = OR_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
OR_right_plot <- OR_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = OR_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
OR_final_plot <- OR_top_plot + OR_legend + OR_main_plot + OR_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
OR_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_Oregon.svg", plot = OR_final_plot, width = 8, height = 12)
Creating the PNG.
# 1. Set factor levels based on heatmap ordering
OR_indicator_levels <- levels(with(OR_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OR_dataset_levels <- levels(with(OR_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
OR_agreement_heatmap <- ggplot(OR_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "Oregon Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
OR_legend <- cowplot::get_legend(OR_agreement_heatmap)
# 4. Remove legend from main plot and adjust bar width
# hjust must be set inside element_text(); passing it directly to theme()
# triggers a "`hjust` theme element is not defined" warning.
OR_main_plot <- OR_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
plot.title = element_text(size = 25, hjust = 0.5),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
OR_top_plot <- OR_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = OR_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
OR_right_data <- OR_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = OR_dataset_levels))
OR_right_plot <- ggplot(OR_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
OR_final_plot3 <- OR_top_plot + OR_legend + OR_main_plot + OR_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
OR_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_Oregon.png", plot = OR_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. The ratings are recoded so that zero still represents not rated, while 1 now represents an indirect rating and 2 a direct rating.
OR_ab_directindirect <- OR_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
OR_ratings_kappaweighted <- OR_ab_directindirect %>%
select(rating_a, rating_b)
OR_kappa_weighted <- kappa2(OR_ratings_kappaweighted, weight = "equal")
print(OR_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 7456
## Raters = 2
## Kappa = 0.831
##
## z = 77.2
## p-value = 0
OR_weighted_kappa_by_indicator <- OR_ab_directindirect %>%
group_by(indicator) %>%
summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
OR_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.896
## 2 EA_Postsecondary_readiness 0.735
## 3 K-12_Access_to_effective_teaching 0.854
## 4 K-12_Access_to_high-quality_academic_supports 0.281
## 5 K-12_Access_to_rigorous_coursework 1
## 6 K-12_Curricular_breadth 1
## 7 K-12_Engagement_in_schooling 0.754
## 8 K-12_Nonacademic_supports_for_student_success 0.493
## 9 K-12_Nonexclusionary_discipline_practices 0.485
## 10 K-12_Performance_in_coursework 0.640
## 11 K-12_Performance_on_tests 0.959
## 12 K-12_School_climate 1
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.687
## 14 Pre-K_Academic_readiness 1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 0
## 16 Pre-K_Self-regulation_and_attention_skills 1
IRRstatistics_Oregon <- OR_indicator_kappa_count_merge %>%
left_join(OR_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(OR_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_Oregon_ordered <- IRRstatistics_Oregon %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_Oregon_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 121 | 84.12017 | 0.5443687 | 0.6872714 |
| High-Quality Pre-K Programs | 10 | 97.85408 | 0.0000000 | 0.0000000 |
| Effective Teaching | 54 | 96.56652 | 0.8241841 | 0.8542766 |
| Rigorous Coursework | 7 | 100.00000 | 1.0000000 | 1.0000000 |
| Curricular Breadth | 4 | 100.00000 | 1.0000000 | 1.0000000 |
| High-Quality Academic Supports | 46 | 91.20172 | 0.1741874 | 0.2808149 |
| School Climate | 3 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonexclusionary Discipline Practices | 21 | 96.99571 | 0.4872682 | 0.4845947 |
| Nonacademic Supports for Student Success | 13 | 97.85408 | 0.4048531 | 0.4934783 |
| Academic Readiness | 13 | 100.00000 | 1.0000000 | 1.0000000 |
| Self-Regulation and Attention Skills | 13 | 100.00000 | 1.0000000 | 1.0000000 |
| Engagement in Schooling | 230 | 79.39914 | 0.6644993 | 0.7542482 |
| Performance in Coursework | 52 | 94.42060 | 0.6442748 | 0.6398764 |
| Performance on Tests | 241 | 95.92275 | 0.9364681 | 0.9591458 |
| On-Time Graduation | 119 | 94.20601 | 0.8557606 | 0.8959030 |
| Postsecondary Readiness | 92 | 89.69957 | 0.6614449 | 0.7354039 |
write.csv(IRRstatistics_Oregon_ordered, "20251113_IRRstatistics_Oregon.csv", row.names = FALSE)
Filtering for only California datasets.
CA_rater_ab <- rater_ab %>%
filter(str_starts(id, "CA"))
str(CA_rater_ab)
## tibble [1,163 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1163] "CA1218" "CA1218" "CA1218" "CA1219" ...
## $ rater : chr [1:1163] "R514" "R562" "R562" "R514" ...
## $ validation_round: chr [1:1163] "A" "B" "B" "A" ...
## $ dataset_name : chr [1:1163] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
## $ indicator : chr [1:1163] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
## $ rating : num [1:1163] 4 4 2 4 4 2 4 4 2 4 ...
CA_duplicates_check <- CA_rater_ab %>%
group_by(id, indicator, validation_round) %>%
filter(n() > 1) %>%
ungroup()
# View the duplicates
print(CA_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## # dataset_name <chr>, indicator <chr>, rating <dbl>
Removing datasets that were excluded during the screening phase for reasons other than not being mappable to the NASEM framework (e.g., not downloadable, not at the school level).
CA_rater_ab <- CA_rater_ab %>%
filter(!(id %in% c("CA1412")))
str(CA_rater_ab)
## tibble [1,163 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:1163] "CA1218" "CA1218" "CA1218" "CA1219" ...
## $ rater : chr [1:1163] "R514" "R562" "R562" "R514" ...
## $ validation_round: chr [1:1163] "A" "B" "B" "A" ...
## $ dataset_name : chr [1:1163] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
## $ indicator : chr [1:1163] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
## $ rating : num [1:1163] 4 4 2 4 4 2 4 4 2 4 ...
Making sure each id, indicator, and validation round combination is present.
CA_expanded_grid <- CA_rater_ab %>%
distinct(id, dataset_name) %>%
crossing(indicators_df, validation_round = c("A", "B"))
head(CA_expanded_grid)
## # A tibble: 6 × 4
## id dataset_name indicator validation_round
## <chr> <chr> <chr> <chr>
## 1 CA1218 expulsion12.txt EA_On-time_graduation A
## 2 CA1218 expulsion12.txt EA_On-time_graduation B
## 3 CA1218 expulsion12.txt EA_Postsecondary_readiness A
## 4 CA1218 expulsion12.txt EA_Postsecondary_readiness B
## 5 CA1218 expulsion12.txt K-12_Access_to_effective_teaching A
## 6 CA1218 expulsion12.txt K-12_Access_to_effective_teaching B
Making sure that id/indicator combinations that were not rated are represented with a rating of zero.
CA_ratings_with_zeros <- CA_expanded_grid %>%
left_join(CA_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating),
by = c("id", "dataset_name", "indicator", "validation_round")) %>%
mutate(rating = replace_na(rating, 0))
CA_ab_long <- CA_ratings_with_zeros %>%
pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
rename(rating_a = rating_A, rating_b = rating_B) %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
head(CA_ab_long)
## # A tibble: 6 × 6
## id dataset_name indicator rating_a rating_b agreement
## <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 CA1218 expulsion12.txt EA_On-time_graduation 0 0 1
## 2 CA1218 expulsion12.txt EA_Postsecondary_readiness 0 0 1
## 3 CA1218 expulsion12.txt K-12_Access_to_effective_t… 0 0 1
## 4 CA1218 expulsion12.txt K-12_Access_to_high-qualit… 0 0 1
## 5 CA1218 expulsion12.txt K-12_Access_to_rigorous_co… 0 0 1
## 6 CA1218 expulsion12.txt K-12_Curricular_breadth 0 0 1
CA_dataset_agreement <- CA_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
CA_dataset_agreement <- CA_dataset_agreement %>%
ungroup()
n_distinct(CA_dataset_agreement$id)
## [1] 313
head(CA_dataset_agreement)
## # A tibble: 6 × 2
## id percent_agreement
## <chr> <dbl>
## 1 CA1218 93.8
## 2 CA1219 93.8
## 3 CA1220 93.8
## 4 CA1221 93.8
## 5 CA1222 93.8
## 6 CA1223 93.8
write.csv(CA_dataset_agreement, "20251113_percent_agreement_by_id_California.csv", row.names = FALSE)
CA_indicators_mapped_per_id <- CA_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(CA_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
CA_dataset_agreement_count <- CA_dataset_agreement %>%
left_join(CA_indicators_mapped_per_id, by = "id")
head(CA_dataset_agreement_count)
## # A tibble: 6 × 3
## id percent_agreement indicators_mapped_per_id
## <chr> <dbl> <int>
## 1 CA1218 93.8 2
## 2 CA1219 93.8 2
## 3 CA1220 93.8 2
## 4 CA1221 93.8 2
## 5 CA1222 93.8 2
## 6 CA1223 93.8 2
CA_dataset_agreement_count <- CA_dataset_agreement_count %>%
  arrange(desc(percent_agreement), indicators_mapped_per_id)
CA_dataset_agreement_count$id <- factor(CA_dataset_agreement_count$id, levels = unique(CA_dataset_agreement_count$id))
CA_dataset_agreement_count$indicators_mapped_per_id <-
factor(CA_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
CA_percentagreement_barplot <- ggplot(CA_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
geom_bar(stat = "identity") +
theme_minimal() +
scale_fill_viridis_d(direction = -1) +
scale_y_continuous(expand = c(0, 0)) +
labs(
title = "California - Percentage Agreement by Dataset",
x = "Dataset",
y = "Agreement (%)",
fill = "Number of Indicators"
) +
theme(
axis.text.y = element_text(size = 1)
) +
coord_flip()
CA_percentagreement_barplot
ggsave("20251113_percentagreement_barplot_California.svg", plot = CA_percentagreement_barplot, width = 6.5, height = 6.5)
Printing the results used for the chart in the findings section.
CA_percent_agreement <- CA_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
group_by(indicator) %>%
summarise(percent_agreement = mean(agreement) * 100)
CA_percent_agreement
## # A tibble: 16 × 2
## indicator percent_agreement
## <chr> <dbl>
## 1 EA_On-time_graduation 91.4
## 2 EA_Postsecondary_readiness 92.3
## 3 K-12_Access_to_effective_teaching 95.2
## 4 K-12_Access_to_high-quality_academic_supports 93.0
## 5 K-12_Access_to_rigorous_coursework 100
## 6 K-12_Curricular_breadth 92.7
## 7 K-12_Engagement_in_schooling 82.7
## 8 K-12_Nonacademic_supports_for_student_success 93.3
## 9 K-12_Nonexclusionary_discipline_practices 97.8
## 10 K-12_Performance_in_coursework 95.8
## 11 K-12_Performance_on_tests 93.6
## 12 K-12_School_climate 88.8
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre… 75.4
## 14 Pre-K_Academic_readiness 100
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr… 100
## 16 Pre-K_Self-regulation_and_attention_skills 100
CA_ab_long <- CA_ab_long %>%
mutate(agreement = ifelse(rating_a == rating_b, 1, 0))
Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
CA_contingency_table <- data.frame(CA_ab_long$rating_a, CA_ab_long$rating_b)
CA_kappa_result <- kappa2(CA_contingency_table, weight = "unweighted")
print(CA_kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 5008
## Raters = 2
## Kappa = 0.683
##
## z = 64.9
## p-value = 0
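For reference, the unweighted Cohen's kappa reported by `kappa2()` is the chance-corrected agreement:

```latex
\kappa \;=\; \frac{p_o - p_e}{1 - p_e},
\qquad
p_e \;=\; \sum_{k} p_{A,k}\, p_{B,k}
```

where p_o is the observed proportion of identical ratings across the subject-indicator pairs and p_e is the agreement expected by chance from each rater's marginal rating proportions.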
CA_kappa_by_indicator <- CA_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
CA_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
CA_kappa_result <- kappa2(CA_contingency_table, weight = "unweighted")
CA_kappa_result$value
},
.groups = "drop"
)
print(CA_kappa_by_indicator)
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.746
## 2 EA_Postsecondary_readiness 0.773
## 3 K-12_Access_to_effective_teaching 0.739
## 4 K-12_Access_to_high-quality_academic_supports 0.269
## 5 K-12_Access_to_rigorous_coursework 1
## 6 K-12_Curricular_breadth 0.436
## 7 K-12_Engagement_in_schooling 0.548
## 8 K-12_Nonacademic_supports_for_student_success 0.572
## 9 K-12_Nonexclusionary_discipline_practices 0.893
## 10 K-12_Performance_in_coursework 0.742
## 11 K-12_Performance_on_tests 0.771
## 12 K-12_School_climate 0.199
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.481
## 14 Pre-K_Academic_readiness NaN
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 1
## 16 Pre-K_Self-regulation_and_attention_skills NaN
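The `NaN` values for Pre-K academic readiness and self-regulation arise because both validation rounds assigned a single rating category (here, all zeros) on those indicators, so observed and chance agreement both equal 1 and the kappa denominator 1 - p_e is zero. A minimal sketch with hypothetical constant ratings:

```r
library(irr)

# Hypothetical example: both raters assign the same single category
# to every subject, so p_o = 1 and p_e = 1, and kappa = 0/0 = NaN
constant_ratings <- data.frame(rater_a = rep(1, 10), rater_b = rep(1, 10))
kappa2(constant_ratings, weight = "unweighted")$value
```

Percent agreement is still 100% in these cases, which is why the summary table reports agreement but no kappa for those two indicators.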
Counting the number of datasets mapped to each indicator to create a plot afterwards.
CA_mapped_counts <- CA_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(mapped_counts = n_distinct(id), .groups = "drop")
CA_mapped_counts
## # A tibble: 14 × 2
## indicator mapped_counts
## <chr> <int>
## 1 EA_On-time_graduation 73
## 2 EA_Postsecondary_readiness 72
## 3 K-12_Access_to_effective_teaching 38
## 4 K-12_Access_to_high-quality_academic_supports 23
## 5 K-12_Access_to_rigorous_coursework 28
## 6 K-12_Curricular_breadth 32
## 7 K-12_Engagement_in_schooling 96
## 8 K-12_Nonacademic_supports_for_student_success 33
## 9 K-12_Nonexclusionary_discipline_practices 40
## 10 K-12_Performance_in_coursework 34
## 11 K-12_Performance_on_tests 57
## 12 K-12_School_climate 40
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati… 154
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra… 9
CA_indicator_kappa_count_merge <- CA_kappa_by_indicator %>%
left_join(CA_mapped_counts, by = "indicator")
head(CA_indicator_kappa_count_merge)
## # A tibble: 6 × 3
## indicator kappa_stat mapped_counts
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.746 73
## 2 EA_Postsecondary_readiness 0.773 72
## 3 K-12_Access_to_effective_teaching 0.739 38
## 4 K-12_Access_to_high-quality_academic_supports 0.269 23
## 5 K-12_Access_to_rigorous_coursework 1 28
## 6 K-12_Curricular_breadth 0.436 32
clean_labels <- sub("^[^_]+_", "", CA_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
CA_indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(CA_indicator_kappa_count_merge$kappa_stat)]
)
CA_balloonplot <- ggballoonplot(CA_indicator_kappa_count_merge,
x = "indicator_cleaned",
                               y = "kappa_stat", size = "mapped_counts", fill = "kappa_stat") +
scale_fill_viridis_c(option = "C") +
theme_minimal() +
ylim(-1,1) +
labs(
title = "California Interrater Reliability by Indicator",
subtitle = "Overall Kappa is 0.683",
x = "Indicator",
y = "Cohen's Kappa",
fill = "Kappa",
size = "Number of Datasets"
) +
theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
) +
coord_flip()
CA_balloonplot
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggsave("20251113_IRR_kappa_balloonplot_California.svg", plot = CA_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
CA_heatmap_long <- CA_ab_long %>%
mutate(weight = case_when(
rating_a == 0 & rating_b == 0 ~ 0, # Both rated zero = Not Rated
(rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2, # One rated indirect, other rated direct = Moderate Disagreement
(rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2, # One rated zero, other rated indirect = Moderate Disagreement
(rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3, # One rated zero, other rated direct = Serious Disagreement
(rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1, # Both rated indirect = Agreement
(rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1, # Both rated direct = Agreement
TRUE ~ NA_real_ # Handle any other cases (e.g., missing or invalid ratings)
))
clean_labels <- sub("^[^_]+_", "", CA_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
CA_heatmap_long$indicator_cleaned <- wrapped_labels
CA_heatmap_long_selected <- CA_heatmap_long %>%
select(id, indicator_cleaned, weight)
head(CA_heatmap_long_selected)
## # A tibble: 6 × 3
## id indicator_cleaned weight
## <chr> <chr> <dbl>
## 1 CA1218 "On-time graduation" 0
## 2 CA1218 "Postsecondary readiness" 0
## 3 CA1218 "Access to effective teaching" 0
## 4 CA1218 "Access to high-quality academic\nsupports" 0
## 5 CA1218 "Access to rigorous coursework" 0
## 6 CA1218 "Curricular breadth" 0
Plotting dataset disagreement as a heatmap with marginal plots.
# 1. Set factor levels based on heatmap ordering
CA_indicator_levels <- levels(with(CA_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
CA_dataset_levels <- levels(with(CA_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
CA_agreement_heatmap <- ggplot(CA_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "California Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
axis.text.y = element_text(size = 1),
legend.text = element_text(size = 6),
plot.title = element_text(hjust = 0.5)
)
# 3. Extract the legend
CA_legend <- cowplot::get_legend(CA_agreement_heatmap)
CA_legend_plot <- wrap_elements(CA_legend)
# 4. Remove legend from main plot
CA_main_plot <- CA_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))
# 5. Create top marginal plot (stacked bar chart by indicator)
CA_top_plot <- CA_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = CA_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
scale_x_discrete(drop = FALSE) +
theme_minimal() +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
legend.position = "none"
)
# 6. Create right marginal plot (stacked bar chart by dataset)
CA_right_plot <- CA_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
mutate(id = factor(id, levels = CA_dataset_levels)) %>%
ggplot(aes(x = id, fill = factor(weight))) +
geom_bar(position = "stack") +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
theme_minimal() +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
legend.position = "none"
)
CA_final_plot <- CA_top_plot + CA_legend + CA_main_plot + CA_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))
CA_final_plot
ggsave("20251113_agreement_heatmap_with_marginals_California.svg", plot = CA_final_plot, width = 8, height = 12)
Creating the PNG version of the figure.
# 1. Set factor levels based on heatmap ordering
CA_indicator_levels <- levels(with(CA_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
CA_dataset_levels <- levels(with(CA_heatmap_long_selected, reorder(id, -weight)))
# 2. Create the main heatmap
CA_agreement_heatmap <- ggplot(CA_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
y = reorder(id, -weight),
fill = factor(weight))) +
geom_tile() +
scale_fill_manual(
values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
) +
labs(
title = "California Disagreement Heatmap",
x = "Indicator",
y = "Dataset Name",
fill = "Agreement Type"
) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
axis.text.y = element_text(size = 30),
axis.title.y = element_text(size = 20),
axis.title.x = element_text(size = 20),
legend.text = element_text(size = 20),
legend.title = element_text(size = 20),
plot.title = element_text(size = 25, hjust = 0.5)
)
# 3. Extract the legend
CA_legend <- cowplot::get_legend(CA_agreement_heatmap)
# 4. Remove legend from main plot and adjust text sizes
CA_main_plot <- CA_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))
# 5. Create top marginal plot
CA_top_plot <- CA_heatmap_long_selected %>%
filter(weight != 0) %>%
mutate(indicator_cleaned = factor(indicator_cleaned, levels = CA_indicator_levels)) %>%
ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
geom_bar(position = "stack", width = 1) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_text(size = 20),
legend.position = "none")
# 6. Create right marginal plot
CA_right_data <- CA_heatmap_long_selected %>%
filter(weight %in% c(1, 2, 3)) %>%
count(id, weight) %>%
mutate(id = factor(id, levels = CA_dataset_levels))
CA_right_plot <- ggplot(CA_right_data, aes(x = id, y = n, fill = factor(weight))) +
geom_col(position = "stack", width = 1, colour = NA) +
coord_flip() +
scale_y_continuous(breaks = seq(0, 14, by = 2)) +
scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
theme_minimal() +
labs(y = "Count") +
theme(
panel.background = element_blank(),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank(),
axis.title.y = element_blank(),
axis.title.x = element_text(size = 20),
legend.position = "none",
panel.spacing = unit(0, "lines")
)
CA_final_plot3 <- CA_top_plot + CA_legend + CA_main_plot + CA_right_plot +
plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
CA_final_plot3
ggsave("20251113_agreement_heatmap_with_marginals_California.png", plot = CA_final_plot3, width = 16, height = 24, dpi = 300)
Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, while 1 now represents an indirect rating and 2 a direct rating.
CA_ab_directindirect <- CA_ab_long %>%
select(id, dataset_name, indicator, rating_a, rating_b) %>%
mutate(across(c(rating_a, rating_b), ~ case_when(
. == 2 ~ 1,
. %in% c(3,4) ~ 2,
TRUE ~ .
)))
CA_ratings_kappaweighted <- CA_ab_directindirect %>%
select(rating_a, rating_b)
CA_kappa_weighted <- kappa2(CA_ratings_kappaweighted, weight = "equal")
print(CA_kappa_weighted)
## Cohen's Kappa for 2 Raters (Weights: equal)
##
## Subjects = 5008
## Raters = 2
## Kappa = 0.722
##
## z = 56
## p-value = 0
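With `weight = "equal"`, `kappa2()` applies linear (equal-spacing) weights, so disagreements between adjacent categories (e.g., not rated vs. indirect) are penalized less than disagreements between distant ones (not rated vs. direct). In disagreement-weight form:

```latex
\kappa_w \;=\; 1 \;-\; \frac{\sum_{i,j} w_{ij}\, p_{ij}}{\sum_{i,j} w_{ij}\, p_{i\cdot}\, p_{\cdot j}},
\qquad
w_{ij} \;=\; \frac{|i - j|}{k - 1}
```

where p_ij is the observed proportion of subjects rated i in round A and j in round B, and k = 3 is the number of categories (0, 1, 2) after recoding.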
CA_weighted_kappa_by_indicator <- CA_ab_directindirect %>%
group_by(indicator) %>%
    summarise(weighted_kappa = kappa2(pick(rating_a, rating_b), weight = "equal")$value)
CA_weighted_kappa_by_indicator
## # A tibble: 16 × 2
## indicator weighted_kappa
## <chr> <dbl>
## 1 EA_On-time_graduation 0.812
## 2 EA_Postsecondary_readiness 0.872
## 3 K-12_Access_to_effective_teaching 0.816
## 4 K-12_Access_to_high-quality_academic_supports 0.406
## 5 K-12_Access_to_rigorous_coursework 1
## 6 K-12_Curricular_breadth 0.453
## 7 K-12_Engagement_in_schooling 0.560
## 8 K-12_Nonacademic_supports_for_student_success 0.585
## 9 K-12_Nonexclusionary_discipline_practices 0.936
## 10 K-12_Performance_in_coursework 0.742
## 11 K-12_Performance_on_tests 0.756
## 12 K-12_School_climate 0.199
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat… 0.472
## 14 Pre-K_Academic_readiness NA
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr… 1
## 16 Pre-K_Self-regulation_and_attention_skills NA
IRRstatistics_California <- CA_indicator_kappa_count_merge %>%
left_join(CA_weighted_kappa_by_indicator, by = "indicator") %>%
left_join(CA_percent_agreement, by = "indicator") %>%
select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
indicator_order <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
"K-12_Access_to_effective_teaching",
"K-12_Access_to_rigorous_coursework",
"K-12_Curricular_breadth",
"K-12_Access_to_high-quality_academic_supports",
"K-12_School_climate",
"K-12_Nonexclusionary_discipline_practices",
"K-12_Nonacademic_supports_for_student_success",
"Pre-K_Academic_readiness",
"Pre-K_Self-regulation_and_attention_skills",
"K-12_Engagement_in_schooling",
"K-12_Performance_in_coursework",
"K-12_Performance_on_tests",
"EA_On-time_graduation",
"EA_Postsecondary_readiness"
)
indicator_name_mapping <- c(
"K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
"Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
"K-12_Access_to_effective_teaching" = "Effective Teaching",
"K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
"K-12_Curricular_breadth" = "Curricular Breadth",
"K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
"K-12_School_climate" = "School Climate",
"K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
"K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
"Pre-K_Academic_readiness" = "Academic Readiness",
"Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
"K-12_Engagement_in_schooling" = "Engagement in Schooling",
"K-12_Performance_in_coursework" = "Performance in Coursework",
"K-12_Performance_on_tests" = "Performance on Tests",
"EA_On-time_graduation" = "On-Time Graduation",
"EA_Postsecondary_readiness" = "Postsecondary Readiness"
)
IRRstatistics_California_ordered <- IRRstatistics_California %>%
arrange(match(indicator, indicator_order)) %>%
mutate(indicator_name = indicator_name_mapping[indicator]) %>%
select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
rename(
"Indicator" = indicator_name,
"Mapped Datasets" = mapped_counts,
"Percent Agreement" = percent_agreement,
"Cohen's Kappa" = kappa_stat,
"Cohen's Weighted Kappa" = weighted_kappa) %>%
mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))
kable(IRRstatistics_California_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---|---|---|---|
| Racial, Ethnic, and Economic Segregation | 154 | 75.39936 | 0.4810293 | 0.4721985 |
| High-Quality Pre-K Programs | 9 | 100.00000 | 1.0000000 | 1.0000000 |
| Effective Teaching | 38 | 95.20767 | 0.7388330 | 0.8157523 |
| Rigorous Coursework | 28 | 100.00000 | 1.0000000 | 1.0000000 |
| Curricular Breadth | 32 | 92.65176 | 0.4361244 | 0.4529338 |
| High-Quality Academic Supports | 23 | 92.97125 | 0.2690021 | 0.4059782 |
| School Climate | 40 | 88.81789 | 0.1994885 | 0.1994885 |
| Nonexclusionary Discipline Practices | 40 | 97.76358 | 0.8926033 | 0.9356695 |
| Nonacademic Supports for Student Success | 33 | 93.29073 | 0.5717636 | 0.5846511 |
| Academic Readiness | NA | 100.00000 | NA | NA |
| Self-Regulation and Attention Skills | NA | 100.00000 | NA | NA |
| Engagement in Schooling | 96 | 82.74760 | 0.5483888 | 0.5596703 |
| Performance in Coursework | 34 | 95.84665 | 0.7422563 | 0.7422563 |
| Performance on Tests | 57 | 93.61022 | 0.7710147 | 0.7558992 |
| On-Time Graduation | 73 | 91.37380 | 0.7462543 | 0.8117440 |
| Postsecondary Readiness | 72 | 92.33227 | 0.7725842 | 0.8722232 |
write.csv(IRRstatistics_California_ordered, "20251113_IRRstatistics_California.csv", row.names = FALSE)
allstates_ab_long <- bind_rows(
OH_ab_long,
TX_ab_long,
NC_ab_long,
DC_ab_long,
KY_ab_long,
MD_ab_long,
OR_ab_long,
CA_ab_long
)
str(allstates_ab_long)
## tibble [59,184 × 6] (S3: tbl_df/tbl/data.frame)
## $ id : chr [1:59184] "OH1218" "OH1218" "OH1218" "OH1218" ...
## $ dataset_name: chr [1:59184] "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" ...
## $ indicator : chr [1:59184] "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_effective_teaching" "K-12_Access_to_high-quality_academic_supports" ...
## $ rating_a : num [1:59184] 0 0 0 0 0 0 0 0 0 0 ...
## $ rating_b : num [1:59184] 0 0 0 0 0 0 0 0 0 0 ...
## $ agreement : num [1:59184] 1 1 1 1 1 1 1 1 1 1 ...
dataset_agreement <- allstates_ab_long %>%
group_by(id) %>%
summarise(percent_agreement = mean(agreement) * 100)
dataset_agreement <- dataset_agreement %>%
ungroup()
summary(dataset_agreement)
## id percent_agreement
## Length:3699 Min. : 25.00
## Class :character 1st Qu.: 87.50
## Mode :character Median : 93.75
## Mean : 93.55
## 3rd Qu.:100.00
## Max. :100.00
write.csv(dataset_agreement, "20251113_percent_agreement_by_id_8state.csv", row.names = FALSE)
indicators_mapped_per_id <- allstates_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(id) %>%
summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
right_join(allstates_ab_long %>% distinct(id), by = "id") %>%
mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))
summary(indicators_mapped_per_id)
## id indicators_mapped_per_id
## Length:3699 Min. : 0.000
## Class :character 1st Qu.: 1.000
## Mode :character Median : 2.000
## Mean : 2.169
## 3rd Qu.: 3.000
## Max. :13.000
contingency_table <- data.frame(allstates_ab_long$rating_a, allstates_ab_long$rating_b)
kappa_result <- kappa2(contingency_table, weight = "unweighted")
print(kappa_result)
## Cohen's Kappa for 2 Raters (Weights: unweighted)
##
## Subjects = 59184
## Raters = 2
## Kappa = 0.685
##
## z = 240
## p-value = 0
kappa_by_indicator <- allstates_ab_long %>%
group_by(indicator) %>%
summarise(
kappa_stat = {
contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
kappa_result <- kappa2(contingency_table, weight = "unweighted")
kappa_result$value
},
.groups = "drop"
)
kappa_by_indicator
## # A tibble: 16 × 2
## indicator kappa_stat
## <chr> <dbl>
## 1 EA_On-time_graduation 0.745
## 2 EA_Postsecondary_readiness 0.670
## 3 K-12_Access_to_effective_teaching 0.771
## 4 K-12_Access_to_high-quality_academic_supports 0.312
## 5 K-12_Access_to_rigorous_coursework 0.665
## 6 K-12_Curricular_breadth 0.456
## 7 K-12_Engagement_in_schooling 0.612
## 8 K-12_Nonacademic_supports_for_student_success 0.473
## 9 K-12_Nonexclusionary_discipline_practices 0.768
## 10 K-12_Performance_in_coursework 0.482
## 11 K-12_Performance_on_tests 0.758
## 12 K-12_School_climate 0.598
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation 0.598
## 14 Pre-K_Academic_readiness 0.862
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs 0.560
## 16 Pre-K_Self-regulation_and_attention_skills 0.674
mapped_counts <- allstates_ab_long %>%
filter(rating_a != 0 | rating_b != 0) %>%
group_by(indicator) %>%
summarise(n_mapped = n_distinct(id), .groups = "drop")
summary(mapped_counts)
## indicator n_mapped
## Length:16 Min. : 47.0
## Class :character 1st Qu.: 157.5
## Mode :character Median : 257.0
## Mean : 501.4
## 3rd Qu.: 861.8
## Max. :1514.0
kappa_with_ci <- kappa_by_indicator %>%
left_join(mapped_counts, by = "indicator") %>%
mutate(standard_error = sqrt((1 - kappa_stat^2) / n_mapped),
ci_lower = kappa_stat - 1.96 * standard_error,
ci_upper = kappa_stat + 1.96 * standard_error
)
kappa_with_ci
## # A tibble: 16 × 6
## indicator kappa_stat n_mapped standard_error ci_lower ci_upper
## <chr> <dbl> <int> <dbl> <dbl> <dbl>
## 1 EA_On-time_graduation 0.745 1094 0.0202 0.705 0.784
## 2 EA_Postsecondary_readin… 0.670 993 0.0236 0.623 0.716
## 3 K-12_Access_to_effectiv… 0.771 264 0.0392 0.695 0.848
## 4 K-12_Access_to_high-qua… 0.312 250 0.0601 0.194 0.430
## 5 K-12_Access_to_rigorous… 0.665 387 0.0380 0.591 0.740
## 6 K-12_Curricular_breadth 0.456 212 0.0611 0.336 0.576
## 7 K-12_Engagement_in_scho… 0.612 1205 0.0228 0.567 0.657
## 8 K-12_Nonacademic_suppor… 0.473 111 0.0836 0.309 0.637
## 9 K-12_Nonexclusionary_di… 0.768 163 0.0502 0.669 0.866
## 10 K-12_Performance_in_cou… 0.482 500 0.0392 0.405 0.559
## 11 K-12_Performance_on_tes… 0.758 1514 0.0168 0.725 0.791
## 12 K-12_School_climate 0.598 241 0.0516 0.497 0.699
## 13 K-12_Students_exposure_… 0.598 818 0.0280 0.544 0.653
## 14 Pre-K_Academic_readiness 0.862 83 0.0556 0.753 0.971
## 15 Pre-K_Access_to_and_par… 0.560 141 0.0697 0.424 0.697
## 16 Pre-K_Self-regulation_a… 0.674 47 0.108 0.462 0.885
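The confidence intervals above use the simplified large-sample standard error implemented in the code, with n the number of mapped datasets per indicator; this is an approximation rather than the exact Fleiss-Cohen variance of kappa:

```latex
\mathrm{SE}(\kappa) \;\approx\; \sqrt{\frac{1 - \kappa^2}{n}},
\qquad
\text{95\% CI} \;=\; \kappa \;\pm\; 1.96\,\mathrm{SE}(\kappa)
```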
indicator_kappa_count_merge <- kappa_by_indicator %>%
left_join(mapped_counts, by = "indicator")
indicator_kappa_count_merge
## # A tibble: 16 × 3
## indicator kappa_stat n_mapped
## <chr> <dbl> <int>
## 1 EA_On-time_graduation 0.745 1094
## 2 EA_Postsecondary_readiness 0.670 993
## 3 K-12_Access_to_effective_teaching 0.771 264
## 4 K-12_Access_to_high-quality_academic_supports 0.312 250
## 5 K-12_Access_to_rigorous_coursework 0.665 387
## 6 K-12_Curricular_breadth 0.456 212
## 7 K-12_Engagement_in_schooling 0.612 1205
## 8 K-12_Nonacademic_supports_for_student_success 0.473 111
## 9 K-12_Nonexclusionary_discipline_practices 0.768 163
## 10 K-12_Performance_in_coursework 0.482 500
## 11 K-12_Performance_on_tests 0.758 1514
## 12 K-12_School_climate 0.598 241
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_seg… 0.598 818
## 14 Pre-K_Academic_readiness 0.862 83
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_… 0.560 141
## 16 Pre-K_Self-regulation_and_attention_skills 0.674 47
clean_labels <- sub("^[^_]+_", "", indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
indicator_kappa_count_merge$indicator_cleaned <- factor(
wrapped_labels,
levels = wrapped_labels[order(indicator_kappa_count_merge$kappa_stat)]
)
kappa_with_ci$indicator_cleaned <- factor(wrapped_labels,
levels = wrapped_labels[order(indicator_kappa_count_merge$kappa_stat)]
)
kappaforestplot_8state <- ggplot(kappa_with_ci, aes(x = kappa_stat, y = indicator_cleaned)) +
geom_point(aes(size = n_mapped), color = "lightblue") +
geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.2) +
scale_x_continuous(limits = c(-1, 1), breaks = seq(-1, 1, by = 0.2), expand = c(0,0)) +
scale_size_continuous(breaks = c(250, 500, 750, 1000, 1250, 1500)) +
theme_minimal() +
theme(
axis.text.y = element_text(size = 8),
panel.grid.minor = element_blank(),
axis.title.y = element_text(margin = margin(r = 10)),
axis.title.x = element_text(margin = margin(t = 10)),
axis.line.x = element_line(color = "black"),
axis.line.y = element_line(color = "black"),
panel.border = element_blank(),
) +
labs(
title = "Interrater Reliability Across All States",
subtitle = "Cohen's Kappa with 95% Confidence Interval",
x = "Cohen's Kappa",
y = "Indicator",
size = "Number of \nDatasets"
)
## Warning: `geom_errorbarh()` was deprecated in ggplot2 4.0.0.
## ℹ Please use the `orientation` argument of `geom_errorbar()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
kappaforestplot_8state
## `height` was translated to `width`.
ggsave("20251113_kappaforestplot_8state.svg", plot = kappaforestplot_8state, width = 6.5, height = 6.5)
## `height` was translated to `width`.
allstates_ab_long <- allstates_ab_long %>%
mutate(state = str_sub(id, 1, 2))
kappa_by_state <- allstates_ab_long %>%
group_by(state) %>%
summarise(
kappa_stat = {
contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
kappa_result <- kappa2(contingency_table, weight = "unweighted")
kappa_result$value
},
.groups = "drop"
)
kappa_by_state
## # A tibble: 8 × 2
## state kappa_stat
## <chr> <dbl>
## 1 CA 0.683
## 2 DC 0.547
## 3 KY 0.682
## 4 MD 0.665
## 5 NC 0.657
## 6 OH 0.654
## 7 OR 0.771
## 8 TX 0.707
mapped_counts_state <- allstates_ab_long %>%
filter(rating_a != 0 | rating_b !=0) %>%
group_by(state) %>%
summarise(n_mapped = n_distinct(id), .groups = "drop")
summary(mapped_counts_state)
## state n_mapped
## Length:8 Min. :267.0
## Class :character 1st Qu.:306.0
## Mode :character Median :454.0
## Mean :452.4
## 3rd Qu.:500.0
## Max. :804.0
kappa_with_ci_state <- kappa_by_state %>%
left_join(mapped_counts_state, by = "state") %>%
mutate(standard_error = sqrt((1 - kappa_stat^2) / n_mapped),
ci_lower = kappa_stat - 1.96 * standard_error,
ci_upper = kappa_stat + 1.96 * standard_error
)
kappa_with_ci_state
## # A tibble: 8 × 6
## state kappa_stat n_mapped standard_error ci_lower ci_upper
## <chr> <dbl> <int> <dbl> <dbl> <dbl>
## 1 CA 0.683 313 0.0413 0.602 0.764
## 2 DC 0.547 285 0.0496 0.449 0.644
## 3 KY 0.682 563 0.0308 0.622 0.743
## 4 MD 0.665 479 0.0341 0.598 0.732
## 5 NC 0.657 267 0.0461 0.567 0.748
## 6 OH 0.654 444