Alex J. Bowers

Teachers College, Columbia University

Maeghan Sill

Teachers College, Columbia University

November 1, 2025


Online Supplement 12 Appendix for Bowers et al. (2025) Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles: Interrater Reliability Visualizations and Code

Alex J. Bowers, Maeghan Sill
Teachers College, Columbia University

This document is the online R Markdown supplement to the report titled “Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles,” available at https://doi.org/10.7916/c0jk-5e64

This document contains the R code and visualizations for each of the Interrater Reliability (IRR) figures in the report.

Note: For more information on each of the visualizations, please refer to the full report published online.

This research was made possible by support from The Wallace Foundation. The opinions described herein are solely those of the authors.

To cite this online supplement appendix: Bowers, A.J., Sill, M. (2025). Online Supplement 12 Appendix for Bowers et al. (2025) Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles: Interrater Reliability Visualizations and Code. Teachers College, Columbia University. New York, NY. https://doi.org/10.7916/3vny-6b48

Copyright © The Authors, 2025.

Text, images, and data in this work are licensed under Creative Commons BY 4.0 (CC BY 4.0)
https://creativecommons.org/licenses/by/4.0/

Code, markdowns, and scripts are released under the MIT License
https://opensource.org/licenses/MIT


Interrater Reliability Visualizations and Code

This Markdown

This Markdown document serves as a comprehensive online supplement providing full transparency and reproducibility for the interrater reliability (IRR) analysis conducted in the FAIR Data Principles study of K-12 state education indicator data. The document contains all R code, statistical computations, and data visualizations used to assess agreement between raters who evaluated the extent to which state education datasets could be mapped to the National Academies of Sciences, Engineering, and Medicine (NASEM) framework indicators. By including detailed code for Cohen’s kappa calculations, percentage agreement metrics, weighted kappa statistics, and visualizations (including heatmaps with marginal plots and balloon plots), this supplement enables other researchers to fully understand the methodology, reproduce the findings, and adapt the approach for similar data-mapping reliability studies in educational research.
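As a minimal illustration of the agreement statistics used throughout this supplement, the sketch below computes percentage agreement plus unweighted and weighted Cohen’s kappa with `irr::kappa2()` on a small set of hypothetical ratings (toy data, not the study data):

```r
library(irr)  # provides kappa2()

# Hypothetical ratings from two raters; not the study data.
toy <- data.frame(
  rater_a = c(1, 1, 2, 2, 1, 2, 1, 2),
  rater_b = c(1, 1, 2, 2, 1, 1, 2, 2)
)

# Percentage agreement: share of identical ratings.
percent_agreement <- mean(toy$rater_a == toy$rater_b) * 100  # 75

# Unweighted kappa: chance-corrected agreement on exact matches.
kappa_unweighted <- kappa2(toy, weight = "unweighted")$value  # 0.5

# Weighted (squared) kappa: penalizes larger disagreements more
# heavily, useful for ordinal rating scales.
kappa_weighted <- kappa2(toy, weight = "squared")$value
```

The same pattern, applied to the full A/B rating columns, produces the overall and per-indicator kappa values reported in the state sections below.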

The file begins with loading the necessary packages and cleaning the data. Please see the sections below for more information about each of these. Next, the state-level analysis and data visualizations are presented in the following order: Ohio, Texas, North Carolina, Washington, D.C., Kentucky, Maryland, Oregon, and California. The final section presents the cross-state analysis and data visualizations.

Data

This analysis uses the Interrater Reliability Tracker (IRRT) dataset, which contains ratings from multiple validators who assessed how well state education datasets could be mapped to the National Academies of Sciences, Engineering, and Medicine (NASEM) framework indicators. Two independent raters (identified as “A” and “B”) evaluated each identified and screened state education dataset from Ohio, Texas, North Carolina, Washington D.C., Kentucky, Maryland, Oregon, and California according to the coding schema described in the full report. The raters then conferenced about disagreements, recording their reasoning and how they reached each final decision. These final decisions are what is coded in the “nasem” columns in each state’s Metadata Megatable (MDMT). The MDMT/IRRT datasets are available through Columbia University’s Academic Commons.

The actual file name for the Interrater Reliability Tracker is “20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx”. This data can be downloaded from: Online Supplement 10 Appendix for Bowers et al. (2025) Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles: Interrater Reliability Tracker Data. https://doi.org/10.7916/0bv2-zm35

The file has four tabs: ABY_Merge, Conference_Notes, Final_Decisions, and Data_Dictionary.

- ABY_Merge includes the two raters’ ratings for each dataset assessed for eligibility in the Screening phase of the project. Each row is a unique rater and indicator combination and includes the following variables: id, validation_round, rating, quote, quote_page, reasoning, year, supplemental_links, dataset_name, and rater.
- Conference_Notes includes one row for each dataset screened for eligibility, with notes about whether the raters conferenced about disagreements in their ratings. The validator_conference, state_captain_conference, and pi_conference columns are binary: 0 indicates that level of conferencing did not take place and 1 indicates that it did. The variables for this tab are: id, conference_notes, validator_conference, state_captain_conference, pi_conference, year, dataset_name, rater_a, and rater_b.
- Final_Decisions includes the results that the two raters agreed upon after any necessary conferencing, with one row for each dataset and indicator combination. It includes the following variables: id, indicator, rating, quote, quote_page, reasoning, year, supplemental_links, and dataset_name.
- Data_Dictionary includes information about the other tabs in the file. It includes the following variables: sheet, variable_name, variable_label, variable_type, data_values, assigned_missing_value, and notes.

For more information about the screening process and its results, please refer to the full “Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles” report.



Packages

We used R version 4.5.1 [@base] and the following R packages for this analysis: tidyverse v. 2.0.0 [@tidyverse] for comprehensive data manipulation, visualization, and the overall analytical workflow; irr v. 0.84.1 [@irr] for computing Cohen’s kappa and other interrater reliability statistics; ggpubr v. 0.6.2 [@ggpubr] for creating publication-ready statistical graphics including balloon plots; viridis v. 0.6.5 [@viridis] for perceptually uniform color scales in visualizations; ComplexHeatmap v. 2.25.2 [@ComplexHeatmap2016; @ComplexHeatmap2022] along with circlize v. 0.4.16 [@circlize] and BiocManager v. 1.30.26 [@BiocManager] for creating sophisticated heatmap visualizations with clustering capabilities; hopach v. 2.70.0 [@hopach] for hierarchical clustering algorithms; cowplot v. 1.2.0 [@cowplot] and patchwork v. 1.3.2 [@patchwork] for combining and arranging multiple plots into complex figure layouts; scales v. 1.4.0 [@scales] for axis scaling and formatting; svglite v. 2.2.2 [@svglite] and rsvg v. 2.7.0 [@rsvg] for high-quality vector graphics output; visdat v. 0.6.0 [@visdat] for data structure visualization; devtools v. 2.4.6 [@devtools] for package development utilities; Kendall v. 2.2.1 [@Kendall] for non-parametric correlation analysis; and rmarkdown v. 2.30 [@rmarkdown2018; @rmarkdown2020; @rmarkdown2025] for creating this reproducible research document. These packages can be found on the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/ or through Bioconductor at https://www.bioconductor.org/ for specialized bioinformatics packages.
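As a one-time setup sketch (assuming a current R installation and the default repositories), the CRAN packages listed above can be installed with `install.packages()` and the Bioconductor packages with `BiocManager::install()`; readxl, knitr, and grateful, which are loaded in the chunk below, are included as well:

```r
# Setup sketch; package names match those loaded in this markdown.
cran_pkgs <- c("tidyverse", "irr", "ggpubr", "viridis", "circlize",
               "BiocManager", "cowplot", "patchwork", "scales", "svglite",
               "rsvg", "visdat", "devtools", "Kendall", "rmarkdown",
               "readxl", "knitr", "grateful")

# Install only the packages missing from the local library.
install.packages(setdiff(cran_pkgs, rownames(installed.packages())))

# ComplexHeatmap and hopach are distributed through Bioconductor.
BiocManager::install(c("ComplexHeatmap", "hopach"), update = FALSE)
```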




Preparation


Loading the packages.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.2
## ✔ ggplot2   4.0.0     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(irr)
## Warning: package 'irr' was built under R version 4.5.2
## Loading required package: lpSolve
## Warning: package 'lpSolve' was built under R version 4.5.2
library(ggplot2)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.5.2
library(readxl)
library(Kendall)
## Warning: package 'Kendall' was built under R version 4.5.2
library(dplyr)
library(viridis)
## Loading required package: viridisLite
library(ComplexHeatmap)
## Loading required package: grid
## ========================================
## ComplexHeatmap version 2.25.2
## Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
## Github page: https://github.com/jokergoo/ComplexHeatmap
## Documentation: http://jokergoo.github.io/ComplexHeatmap-reference
## 
## If you use it in published research, please cite either one:
## - Gu, Z. Complex Heatmap Visualization. iMeta 2022.
## - Gu, Z. Complex heatmaps reveal patterns and correlations in multidimensional 
##     genomic data. Bioinformatics 2016.
## 
## 
## The new InteractiveComplexHeatmap package can directly export static 
## complex heatmaps into an interactive Shiny app with zero effort. Have a try!
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(ComplexHeatmap))
## ========================================
library(BiocManager)
## Warning: package 'BiocManager' was built under R version 4.5.2
library(hopach)
## Loading required package: cluster
## Loading required package: Biobase
## Loading required package: BiocGenerics
## Loading required package: generics
## 
## Attaching package: 'generics'
## 
## The following object is masked from 'package:lubridate':
## 
##     as.difftime
## 
## The following object is masked from 'package:dplyr':
## 
##     explain
## 
## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff,
##     setequal, union
## 
## 
## Attaching package: 'BiocGenerics'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
## 
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## 
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
##     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
##     get, grep, grepl, is.unsorted, lapply, Map, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
##     rbind, Reduce, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min
## 
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.
## 
## 
## Attaching package: 'hopach'
## 
## The following object is masked from 'package:generics':
## 
##     prune
library(circlize)
## Warning: package 'circlize' was built under R version 4.5.2
## ========================================
## circlize version 0.4.16
## CRAN page: https://cran.r-project.org/package=circlize
## Github page: https://github.com/jokergoo/circlize
## Documentation: https://jokergoo.github.io/circlize_book/book/
## 
## If you use it in published research, please cite:
## Gu, Z. circlize implements and enhances circular visualization
##   in R. Bioinformatics 2014.
## 
## This message can be suppressed by:
##   suppressPackageStartupMessages(library(circlize))
## ========================================
library(tibble)
library(devtools)
## Warning: package 'devtools' was built under R version 4.5.2
## Loading required package: usethis
## Warning: package 'usethis' was built under R version 4.5.2
## 
## Attaching package: 'devtools'
## 
## The following object is masked from 'package:BiocManager':
## 
##     install
library(svglite)
## Warning: package 'svglite' was built under R version 4.5.2
library(stringr)
library(scales)
## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:viridis':
## 
##     viridis_pal
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor
library(cowplot)
## 
## Attaching package: 'cowplot'
## 
## The following object is masked from 'package:ggpubr':
## 
##     get_legend
## 
## The following object is masked from 'package:lubridate':
## 
##     stamp
library(patchwork)
## 
## Attaching package: 'patchwork'
## 
## The following object is masked from 'package:cowplot':
## 
##     align_plots
library(visdat)
## Warning: package 'visdat' was built under R version 4.5.2
library(knitr)
library(grateful)

Loading in the ABY_Merge dataset (the first sheet of the Excel file “20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx”).
This dataset can be downloaded from: Online Supplement 10 Appendix for Bowers et al. (2025) Mapping Public Open Access K-12 State Education Indicator Data Across 7 States and Washington D.C. Using the FAIR Data Principles: Interrater Reliability Tracker Data. https://doi.org/10.7916/0bv2-zm35.
To use, download the file and update the file path in the read_excel() call below to match its location on your machine (for example, the same folder as this RStudio project).

abymerge <- read_excel("C:/Users/Public/Downloads/20250501_CALL-ECL-MEI_IRRT_Combined_8_States.xlsx",
  sheet = 1,
  col_types = "text"  
) %>%
  select(id, rater, validation_round, dataset_name, indicator, rating) %>%
  mutate(rating = as.numeric(rating))

head(abymerge)
## # A tibble: 6 × 6
##   id     rater validation_round dataset_name    indicator                 rating
##   <chr>  <chr> <chr>            <chr>           <chr>                      <dbl>
## 1 CA1218 R514  A                expulsion12.txt K-12_Nonexclusionary_dis…      4
## 2 CA1218 R562  B                expulsion12.txt K-12_Nonexclusionary_dis…      4
## 3 CA1218 R562  B                expulsion12.txt K-12_School_climate            2
## 4 CA1219 R514  A                expulsion13.txt K-12_Nonexclusionary_dis…      4
## 5 CA1219 R562  B                expulsion13.txt K-12_Nonexclusionary_dis…      4
## 6 CA1219 R562  B                expulsion13.txt K-12_School_climate            2
str(abymerge)
## tibble [16,470 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:16470] "CA1218" "CA1218" "CA1218" "CA1219" ...
##  $ rater           : chr [1:16470] "R514" "R562" "R562" "R514" ...
##  $ validation_round: chr [1:16470] "A" "B" "B" "A" ...
##  $ dataset_name    : chr [1:16470] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
##  $ indicator       : chr [1:16470] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
##  $ rating          : num [1:16470] 4 4 2 4 4 2 4 4 2 4 ...

Filter only raters A and B.

rater_ab <- abymerge %>%
  filter(validation_round %in% c("A", "B")) %>%
  mutate(id = as.character(id),
         indicator = trimws(indicator))

str(rater_ab)
## tibble [13,736 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:13736] "CA1218" "CA1218" "CA1218" "CA1219" ...
##  $ rater           : chr [1:13736] "R514" "R562" "R562" "R514" ...
##  $ validation_round: chr [1:13736] "A" "B" "B" "A" ...
##  $ dataset_name    : chr [1:13736] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
##  $ indicator       : chr [1:13736] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
##  $ rating          : num [1:13736] 4 4 2 4 4 2 4 4 2 4 ...

This step creates a universal list of indicator names and saves them as a dataframe for use later.

indicator_names <- c(
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicators_df <- data.frame(indicator = indicator_names)

Ohio

Filtering for only Ohio datasets.

OH_rater_ab <- rater_ab %>%
  filter(str_starts(id, "OH"))

str(OH_rater_ab)
## tibble [2,158 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:2158] "OH1218" "OH1218" "OH1219" "OH1219" ...
##  $ rater           : chr [1:2158] "R562" "R294" "R294" "R562" ...
##  $ validation_round: chr [1:2158] "A" "B" "B" "A" ...
##  $ dataset_name    : chr [1:2158] "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "20-21_Achievement_Building.xlsx" "20-21_Achievement_Building.xlsx" ...
##  $ indicator       : chr [1:2158] "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:2158] 3 1 3 3 2 3 3 2 3 3 ...
OH_duplicates_check <- OH_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(OH_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Making sure each id, indicator, validator combination is present.

OH_expanded_grid <- OH_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(OH_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name                    indicator              validation_round
##   <chr>  <chr>                           <chr>                  <chr>           
## 1 OH1218 21-22_Achievement_Building.xlsx EA_On-time_graduation  A               
## 2 OH1218 21-22_Achievement_Building.xlsx EA_On-time_graduation  B               
## 3 OH1218 21-22_Achievement_Building.xlsx EA_Postsecondary_read… A               
## 4 OH1218 21-22_Achievement_Building.xlsx EA_Postsecondary_read… B               
## 5 OH1218 21-22_Achievement_Building.xlsx K-12_Access_to_effect… A               
## 6 OH1218 21-22_Achievement_Building.xlsx K-12_Access_to_effect… B

Making sure that id/indicator combinations that were not rated are represented as zero.

OH_ratings_with_zeros <- OH_expanded_grid %>%
  left_join(OH_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
OH_ab_long <- OH_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(OH_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name                    indicator   rating_a rating_b agreement
##   <chr>  <chr>                           <chr>          <dbl>    <dbl>     <dbl>
## 1 OH1218 21-22_Achievement_Building.xlsx EA_On-time…        0        0         1
## 2 OH1218 21-22_Achievement_Building.xlsx EA_Postsec…        0        0         1
## 3 OH1218 21-22_Achievement_Building.xlsx K-12_Acces…        0        0         1
## 4 OH1218 21-22_Achievement_Building.xlsx K-12_Acces…        0        0         1
## 5 OH1218 21-22_Achievement_Building.xlsx K-12_Acces…        0        0         1
## 6 OH1218 21-22_Achievement_Building.xlsx K-12_Curri…        0        0         1
OH_dataset_agreement <- OH_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

OH_dataset_agreement <- OH_dataset_agreement %>%
  ungroup()

n_distinct(OH_dataset_agreement$id)
## [1] 446
head(OH_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 OH1218              93.8
## 2 OH1219              87.5
## 3 OH1220              87.5
## 4 OH1221              87.5
## 5 OH1222              93.8
## 6 OH1223              93.8
write.csv(OH_dataset_agreement, "20251113_percent_agreement_by_id_Ohio.csv", row.names = FALSE)
OH_indicators_mapped_per_id <- OH_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(OH_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

OH_dataset_agreement_count <- OH_dataset_agreement %>%
  left_join(OH_indicators_mapped_per_id, by = "id")

head(OH_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 OH1218              93.8                        1
## 2 OH1219              87.5                        2
## 3 OH1220              87.5                        2
## 4 OH1221              87.5                        2
## 5 OH1222              93.8                        1
## 6 OH1223              93.8                        1
OH_dataset_agreement_count <- OH_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

OH_dataset_agreement_count$id <- factor(OH_dataset_agreement_count$id, levels = unique(OH_dataset_agreement_count$id))

OH_dataset_agreement_count$indicators_mapped_per_id <- 
   factor(OH_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
OH_percentagreement_barplot <- ggplot(OH_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Ohio - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

OH_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Ohio.svg", plot = OH_percentagreement_barplot, width = 6.5, height = 6.5)

Printing the percentage agreement by indicator for the chart in the findings section.

OH_percent_agreement <- OH_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

OH_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    74.4
##  2 EA_Postsecondary_readiness                                               97.8
##  3 K-12_Access_to_effective_teaching                                        98.4
##  4 K-12_Access_to_high-quality_academic_supports                            85.2
##  5 K-12_Access_to_rigorous_coursework                                       90.1
##  6 K-12_Curricular_breadth                                                  93.0
##  7 K-12_Engagement_in_schooling                                             71.1
##  8 K-12_Nonacademic_supports_for_student_success                           100  
##  9 K-12_Nonexclusionary_discipline_practices                               100  
## 10 K-12_Performance_in_coursework                                           83.9
## 11 K-12_Performance_on_tests                                                66.4
## 12 K-12_School_climate                                                     100  
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              95.7
## 14 Pre-K_Academic_readiness                                                 97.8
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…             100  
## 16 Pre-K_Self-regulation_and_attention_skills                               97.3
OH_ab_long <- OH_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Now calculating Cohen’s kappa, first for the whole dataset, then by indicator.

OH_contingency_table <- data.frame(OH_ab_long$rating_a, OH_ab_long$rating_b)
OH_kappa_result <- kappa2(OH_contingency_table, weight = "unweighted")
print(OH_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 7136 
##    Raters = 2 
##     Kappa = 0.654 
## 
##         z = 76 
##   p-value = 0
OH_kappa_by_indicator <- OH_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      OH_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      OH_kappa_result <- kappa2(OH_contingency_table, weight = "unweighted")
      OH_kappa_result$value  
    },
    .groups = "drop"
  )
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(OH_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                0.605 
##  2 EA_Postsecondary_readiness                                           0.819 
##  3 K-12_Access_to_effective_teaching                                    0.879 
##  4 K-12_Access_to_high-quality_academic_supports                       -0.0335
##  5 K-12_Access_to_rigorous_coursework                                   0.445 
##  6 K-12_Curricular_breadth                                              0.546 
##  7 K-12_Engagement_in_schooling                                         0.553 
##  8 K-12_Nonacademic_supports_for_student_success                        1     
##  9 K-12_Nonexclusionary_discipline_practices                            1     
## 10 K-12_Performance_in_coursework                                      -0.0527
## 11 K-12_Performance_on_tests                                            0.515 
## 12 K-12_School_climate                                                  1     
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation     0.758 
## 14 Pre-K_Academic_readiness                                             0.854 
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs   NaN     
## 16 Pre-K_Self-regulation_and_attention_skills                           0

Counting the number of datasets mapped to each indicator to create a plot afterwards.

OH_mapped_counts <- OH_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

OH_mapped_counts
## # A tibble: 15 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                     278
##  2 EA_Postsecondary_readiness                                                 34
##  3 K-12_Access_to_effective_teaching                                          34
##  4 K-12_Access_to_high-quality_academic_supports                              66
##  5 K-12_Access_to_rigorous_coursework                                         64
##  6 K-12_Curricular_breadth                                                    49
##  7 K-12_Engagement_in_schooling                                              268
##  8 K-12_Nonacademic_supports_for_student_success                               1
##  9 K-12_Nonexclusionary_discipline_practices                                   6
## 10 K-12_Performance_in_coursework                                             72
## 11 K-12_Performance_on_tests                                                 338
## 12 K-12_School_climate                                                         6
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…            53
## 14 Pre-K_Academic_readiness                                                   42
## 15 Pre-K_Self-regulation_and_attention_skills                                 12
OH_indicator_kappa_count_merge <- OH_kappa_by_indicator %>%
  left_join(OH_mapped_counts, by = "indicator")

head(OH_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                             0.605            278
## 2 EA_Postsecondary_readiness                        0.819             34
## 3 K-12_Access_to_effective_teaching                 0.879             34
## 4 K-12_Access_to_high-quality_academic_supports    -0.0335            66
## 5 K-12_Access_to_rigorous_coursework                0.445             64
## 6 K-12_Curricular_breadth                           0.546             49
clean_labels <- sub("^[^_]+_", "", OH_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


OH_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(OH_indicator_kappa_count_merge$kappa_stat)]
)
OH_balloonplot <- ggballoonplot(OH_indicator_kappa_count_merge, 
x = "indicator_cleaned", 
y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Ohio Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.654",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit (0.5, "cm"),
  ) +
  coord_flip()

OH_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("20251113_IRR_kappa_balloonplot_Ohio.svg", plot = OH_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
OH_heatmap_long <- OH_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))
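As a sanity check, the same case_when() scheme can be run on a few hypothetical rating pairs (made-up values, not study data) to confirm that each pair lands in the intended agreement category:

```r
library(dplyr)

# Hypothetical rating pairs: 0 = not rated, 1-2 = indirect, 3-4 = direct
toy <- tibble(rating_a = c(0, 1, 2, 0, 0, 4),
              rating_b = c(0, 4, 1, 2, 3, 3))

toy_weights <- toy %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,               # Not Rated
    (rating_a %in% 1:2 & rating_b %in% 3:4) |
      (rating_b %in% 1:2 & rating_a %in% 3:4) ~ 2,   # indirect vs direct
    (rating_a == 0 & rating_b %in% 1:2) |
      (rating_b == 0 & rating_a %in% 1:2) ~ 2,       # zero vs indirect
    (rating_a == 0 & rating_b %in% 3:4) |
      (rating_b == 0 & rating_a %in% 3:4) ~ 3,       # zero vs direct
    rating_a %in% 1:2 & rating_b %in% 1:2 ~ 1,       # both indirect
    rating_a %in% 3:4 & rating_b %in% 3:4 ~ 1,       # both direct
    TRUE ~ NA_real_
  )) %>%
  pull(weight)

toy_weights
# 0 2 1 2 3 1
```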

clean_labels <- sub("^[^_]+_", "", OH_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

OH_heatmap_long$indicator_cleaned <- wrapped_labels

OH_heatmap_long_selected <- OH_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(OH_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 OH1218 "On-time graduation"                             0
## 2 OH1218 "Postsecondary readiness"                        0
## 3 OH1218 "Access to effective teaching"                   0
## 4 OH1218 "Access to high-quality academic\nsupports"      0
## 5 OH1218 "Access to rigorous coursework"                  0
## 6 OH1218 "Curricular breadth"                             0
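The label-cleaning steps above can be traced on a single raw indicator name; any NASEM indicator string follows the same path:

```r
library(stringr)

lbl <- "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation"
lbl <- sub("^[^_]+_", "", lbl)   # drop the "EA_"/"K-12_"/"Pre-K_" prefix
lbl <- gsub("_", " ", lbl)       # underscores to spaces
lbl <- str_to_sentence(lbl)      # sentence case
lbl
# "Students exposure to racial ethnic and economic segregation"
# str_wrap(lbl, width = 35) then inserts line breaks for the plot axis
```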

Dataset disagreement with marginal plots.

# 1. Set factor levels based on heatmap ordering
OH_indicator_levels <- levels(with(OH_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OH_dataset_levels <- levels(with(OH_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
OH_agreement_heatmap <- ggplot(OH_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                         y = reorder(id, -weight),
                                                         fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Ohio Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
OH_legend <- cowplot::get_legend(OH_agreement_heatmap)
OH_legend_plot <- wrap_elements(OH_legend)

# 4. Remove legend from main plot
OH_main_plot <- OH_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
OH_top_plot <- OH_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = OH_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
OH_right_plot <- OH_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = OH_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

OH_final_plot <- OH_top_plot + OH_legend + OH_main_plot + OH_right_plot + 
              plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

OH_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Ohio.svg", plot = OH_final_plot, width = 8, height = 12)
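Step 1's reorder() call drives both axis orderings. On a toy vector (illustrative values only), reorder() sorts factor levels by the group mean of -weight, so the rows and columns with the heaviest disagreement come first:

```r
x <- c("a", "b", "c", "a")
w <- c(3, 1, 2, 3)

# Group means of -w: a = -3, c = -2, b = -1, sorted ascending
levels(reorder(x, -w))
# "a" "c" "b"
```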

Creating the PNG version.

# 1. Set factor levels based on heatmap ordering
OH_indicator_levels <- levels(with(OH_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OH_dataset_levels <- levels(with(OH_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
OH_agreement_heatmap <- ggplot(OH_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                         y = reorder(id, -weight),
                                                         fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Ohio Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  # Center the title
  )

# 3. Extract the legend
OH_legend <- cowplot::get_legend(OH_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
OH_main_plot <- OH_agreement_heatmap + theme(
    axis.text.y = element_text(size = 3),
    legend.position = "none",
    plot.title = element_text(size = 25, hjust = 0.5),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
OH_top_plot <- OH_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = OH_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
OH_right_data <- OH_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = OH_dataset_levels))

OH_right_plot <- ggplot(OH_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  # <-- horizontal axis after flip
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

OH_final_plot3 <- OH_top_plot + OH_legend + OH_main_plot + OH_right_plot + 
              plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

OH_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_Ohio.png", plot = OH_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. The ratings are recoded so that zero still represents not rated, while 1 now represents an indirect rating and 2 a direct rating.

OH_ab_directindirect <- OH_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
    )))
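A quick check of the recode on the full 0-4 scale (illustrative vector, not study data):

```r
ratings <- c(0, 1, 2, 3, 4)
collapsed <- dplyr::case_when(
  ratings == 2 ~ 1,           # indirect (2) joins indirect (1)
  ratings %in% c(3, 4) ~ 2,   # direct ratings collapse to 2
  TRUE ~ ratings              # 0 and 1 unchanged
)
collapsed
# 0 1 1 2 2
```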

OH_ratings_kappaweighted <- OH_ab_directindirect %>%
  select(rating_a, rating_b)
OH_kappa_weighted <- kappa2(OH_ratings_kappaweighted, weight = "equal")
print(OH_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 7136 
##    Raters = 2 
##     Kappa = 0.716 
## 
##         z = 62.2 
##   p-value = 0
OH_weighted_kappa_by_indicator <- OH_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(as.matrix(pick(rating_a, rating_b)), weight = "equal")$value)
OH_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                  0.633 
##  2 EA_Postsecondary_readiness                                             0.789 
##  3 K-12_Access_to_effective_teaching                                      0.877 
##  4 K-12_Access_to_high-quality_academic_supports                         -0.0426
##  5 K-12_Access_to_rigorous_coursework                                     0.459 
##  6 K-12_Curricular_breadth                                                0.647 
##  7 K-12_Engagement_in_schooling                                           0.603 
##  8 K-12_Nonacademic_supports_for_student_success                          1     
##  9 K-12_Nonexclusionary_discipline_practices                              1     
## 10 K-12_Performance_in_coursework                                        -0.0711
## 11 K-12_Performance_on_tests                                              0.547 
## 12 K-12_School_climate                                                    1     
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…         0.758 
## 14 Pre-K_Academic_readiness                                               0.893 
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…        NA     
## 16 Pre-K_Self-regulation_and_attention_skills                             0
IRRstatistics_Ohio <- OH_indicator_kappa_count_merge %>%
  left_join(OH_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(OH_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

# Create a mapping of old names to new names
indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Ohio_ordered <- IRRstatistics_Ohio %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Ohio_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 53 | 95.73991 | 0.7580378 | 0.7580378 |
| High-Quality Pre-K Programs | NA | 100.00000 | NA | NA |
| Effective Teaching | 34 | 98.43049 | 0.8786630 | 0.8769413 |
| Rigorous Coursework | 64 | 90.13453 | 0.4451795 | 0.4592629 |
| Curricular Breadth | 49 | 93.04933 | 0.5463911 | 0.6468111 |
| High-Quality Academic Supports | 66 | 85.20179 | -0.0334948 | -0.0425627 |
| School Climate | 6 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonexclusionary Discipline Practices | 6 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonacademic Supports for Student Success | 1 | 100.00000 | 1.0000000 | 1.0000000 |
| Academic Readiness | 42 | 97.75785 | 0.8538088 | 0.8929722 |
| Self-Regulation and Attention Skills | 12 | 97.30942 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 268 | 71.07623 | 0.5527206 | 0.6031111 |
| Performance in Coursework | 72 | 83.85650 | -0.0527144 | -0.0710500 |
| Performance on Tests | 338 | 66.36771 | 0.5153473 | 0.5474093 |
| On-Time Graduation | 278 | 74.43946 | 0.6045300 | 0.6331089 |
| Postsecondary Readiness | 34 | 97.75785 | 0.8190816 | 0.7887146 |
write.csv(IRRstatistics_Ohio_ordered, "20251113_IRRstatistics_Ohio.csv", row.names = FALSE)

TEXAS

Filtering for only Texas datasets.

TX_rater_ab <- rater_ab %>%
  filter(str_starts(id, "TX"))

str(TX_rater_ab)
## tibble [2,920 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:2920] "TX1218" "TX1218" "TX1218" "TX1218" ...
##  $ rater           : chr [1:2920] "R294" "R398" "R294" "R398" ...
##  $ validation_round: chr [1:2920] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:2920] "CREF.dat" "CREF.dat" "CREF.dat" "CREF.dat" ...
##  $ indicator       : chr [1:2920] "EA_Postsecondary_readiness" "EA_Postsecondary_readiness" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:2920] 1 1 1 3 4 4 4 4 4 4 ...
TX_duplicates_check <- TX_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(TX_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable or not at the school level).

TX_rater_ab <- TX_rater_ab %>%
  filter(!(id %in% c("TX1252")))

Making sure each id, indicator, and validation round combination is present.

TX_expanded_grid <- TX_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(TX_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name indicator                         validation_round
##   <chr>  <chr>        <chr>                             <chr>           
## 1 TX1218 CREF.dat     EA_On-time_graduation             A               
## 2 TX1218 CREF.dat     EA_On-time_graduation             B               
## 3 TX1218 CREF.dat     EA_Postsecondary_readiness        A               
## 4 TX1218 CREF.dat     EA_Postsecondary_readiness        B               
## 5 TX1218 CREF.dat     K-12_Access_to_effective_teaching A               
## 6 TX1218 CREF.dat     K-12_Access_to_effective_teaching B

Making sure that id/indicator combinations that were not rated are represented as zero.

TX_ratings_with_zeros <- TX_expanded_grid %>%
  left_join(TX_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
TX_ab_long <- TX_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(TX_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name indicator                      rating_a rating_b agreement
##   <chr>  <chr>        <chr>                             <dbl>    <dbl>     <dbl>
## 1 TX1218 CREF.dat     EA_On-time_graduation                 0        0         1
## 2 TX1218 CREF.dat     EA_Postsecondary_readiness            1        1         1
## 3 TX1218 CREF.dat     K-12_Access_to_effective_teac…        0        0         1
## 4 TX1218 CREF.dat     K-12_Access_to_high-quality_a…        0        0         1
## 5 TX1218 CREF.dat     K-12_Access_to_rigorous_cours…        0        0         1
## 6 TX1218 CREF.dat     K-12_Curricular_breadth               0        0         1
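The crossing/left-join/replace_na pattern used above can be seen on a minimal two-dataset toy grid (made-up ids): every grid row with no recorded rating becomes a zero.

```r
library(dplyr)
library(tidyr)

# Full grid of id x validation round; only one rating was actually recorded
grid <- crossing(id = c("X1", "X2"), validation_round = c("A", "B"))
rated <- tibble(id = "X1", validation_round = "A", rating = 3)

filled <- grid %>%
  left_join(rated, by = c("id", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))

filled$rating
# 3 0 0 0
```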
TX_dataset_agreement <- TX_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

TX_dataset_agreement <- TX_dataset_agreement %>%
  ungroup()

n_distinct(TX_dataset_agreement$id)
## [1] 825
head(TX_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 TX1218              93.8
## 2 TX1219              81.2
## 3 TX1220              81.2
## 4 TX1221              87.5
## 5 TX1222              93.8
## 6 TX1223              81.2
write.csv(TX_dataset_agreement, "20251113_percent_agreement_by_id_Texas.csv", row.names = FALSE)
TX_indicators_mapped_per_id <- TX_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(TX_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

TX_dataset_agreement_count <- TX_dataset_agreement %>%
  left_join(TX_indicators_mapped_per_id, by = "id")

head(TX_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 TX1218              93.8                        2
## 2 TX1219              81.2                        3
## 3 TX1220              81.2                        3
## 4 TX1221              87.5                        3
## 5 TX1222              93.8                        1
## 6 TX1223              81.2                        3
TX_dataset_agreement_count <- TX_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

TX_dataset_agreement_count$id <- factor(TX_dataset_agreement_count$id, levels = unique(TX_dataset_agreement_count$id))

TX_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(TX_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
TX_percentagreement_barplot <- ggplot(TX_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Texas - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

TX_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Texas.svg", plot = TX_percentagreement_barplot, width = 6.5, height = 6.5)

Printing results for the chart in the findings section.

TX_percent_agreement <- TX_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

TX_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    88.4
##  2 EA_Postsecondary_readiness                                               81.7
##  3 K-12_Access_to_effective_teaching                                       100  
##  4 K-12_Access_to_high-quality_academic_supports                            98.5
##  5 K-12_Access_to_rigorous_coursework                                       94.8
##  6 K-12_Curricular_breadth                                                  99.5
##  7 K-12_Engagement_in_schooling                                             90.2
##  8 K-12_Nonacademic_supports_for_student_success                            99.5
##  9 K-12_Nonexclusionary_discipline_practices                                98.4
## 10 K-12_Performance_in_coursework                                           88.1
## 11 K-12_Performance_on_tests                                                87.3
## 12 K-12_School_climate                                                      98.4
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              87.0
## 14 Pre-K_Academic_readiness                                                 99.6
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…              96.6
## 16 Pre-K_Self-regulation_and_attention_skills                               99.3
TX_ab_long <- TX_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

TX_contingency_table <- data.frame(TX_ab_long$rating_a, TX_ab_long$rating_b)
TX_kappa_result <- kappa2(TX_contingency_table, weight = "unweighted")
print(TX_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 13200 
##    Raters = 2 
##     Kappa = 0.707 
## 
##         z = 96.6 
##   p-value = 0
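For intuition, unweighted Cohen's kappa can also be hand-computed, mirroring what irr::kappa2() does internally: observed agreement po minus chance agreement pe, scaled by 1 - pe. A small hypothetical pair of rating vectors (not study data):

```r
# Hypothetical ratings from two raters on a 0/1/2 scale
a <- c(0, 0, 1, 1, 2, 2, 0, 1)
b <- c(0, 0, 1, 2, 2, 2, 1, 1)

po <- mean(a == b)                             # observed agreement = 0.75
pe <- sum(prop.table(table(factor(a, 0:2))) *
          prop.table(table(factor(b, 0:2))))   # chance agreement = 0.328125
kappa <- (po - pe) / (1 - pe)
kappa
# 0.627907 (approximately)
```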
TX_kappa_by_indicator <- TX_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      TX_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      TX_kappa_result <- kappa2(TX_contingency_table, weight = "unweighted")
      TX_kappa_result$value  
    },
    .groups = "drop"
  )

print(TX_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                               0.705  
##  2 EA_Postsecondary_readiness                                          0.514  
##  3 K-12_Access_to_effective_teaching                                   1      
##  4 K-12_Access_to_high-quality_academic_supports                       0.744  
##  5 K-12_Access_to_rigorous_coursework                                  0.809  
##  6 K-12_Curricular_breadth                                             0.748  
##  7 K-12_Engagement_in_schooling                                        0.662  
##  8 K-12_Nonacademic_supports_for_student_success                       0      
##  9 K-12_Nonexclusionary_discipline_practices                           0.734  
## 10 K-12_Performance_in_coursework                                      0.372  
## 11 K-12_Performance_on_tests                                           0.769  
## 12 K-12_School_climate                                                -0.00393
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation    0.614  
## 14 Pre-K_Academic_readiness                                            0.798  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs   -0.00746
## 16 Pre-K_Self-regulation_and_attention_skills                          0

Counting the number of datasets mapped to each indicator to create a plot afterwards.

TX_mapped_counts <- TX_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

TX_mapped_counts
## # A tibble: 16 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                     217
##  2 EA_Postsecondary_readiness                                                254
##  3 K-12_Access_to_effective_teaching                                          20
##  4 K-12_Access_to_high-quality_academic_supports                              30
##  5 K-12_Access_to_rigorous_coursework                                        148
##  6 K-12_Curricular_breadth                                                    10
##  7 K-12_Engagement_in_schooling                                              171
##  8 K-12_Nonacademic_supports_for_student_success                               4
##  9 K-12_Nonexclusionary_discipline_practices                                  28
## 10 K-12_Performance_in_coursework                                            126
## 11 K-12_Performance_on_tests                                                 440
## 12 K-12_School_climate                                                        13
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…           230
## 14 Pre-K_Academic_readiness                                                    9
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…            28
## 16 Pre-K_Self-regulation_and_attention_skills                                  6
TX_indicator_kappa_count_merge <- TX_kappa_by_indicator %>%
  left_join(TX_mapped_counts, by = "indicator")

head(TX_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                              0.705           217
## 2 EA_Postsecondary_readiness                         0.514           254
## 3 K-12_Access_to_effective_teaching                  1                20
## 4 K-12_Access_to_high-quality_academic_supports      0.744            30
## 5 K-12_Access_to_rigorous_coursework                 0.809           148
## 6 K-12_Curricular_breadth                            0.748            10
clean_labels <- sub("^[^_]+_", "", TX_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


TX_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(TX_indicator_kappa_count_merge$kappa_stat)]
)
TX_balloonplot <- ggballoonplot(TX_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Texas Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.707",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

TX_balloonplot

ggsave("20251113_IRR_kappa_balloonplot_Texas.svg", plot = TX_balloonplot, width = 6.5, height = 6.5)
TX_heatmap_long <- TX_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))

clean_labels <- sub("^[^_]+_", "", TX_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

TX_heatmap_long$indicator_cleaned <- wrapped_labels

TX_heatmap_long_selected <- TX_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(TX_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 TX1218 "On-time graduation"                             0
## 2 TX1218 "Postsecondary readiness"                        1
## 3 TX1218 "Access to effective teaching"                   0
## 4 TX1218 "Access to high-quality academic\nsupports"      0
## 5 TX1218 "Access to rigorous coursework"                  0
## 6 TX1218 "Curricular breadth"                             0

Dataset disagreement with marginal plots.

# 1. Set factor levels based on heatmap ordering
TX_indicator_levels <- levels(with(TX_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
TX_dataset_levels <- levels(with(TX_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
TX_agreement_heatmap <- ggplot(TX_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Texas Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
TX_legend <- cowplot::get_legend(TX_agreement_heatmap)
TX_legend_plot <- wrap_elements(TX_legend)

# 4. Remove legend from main plot
TX_main_plot <- TX_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
TX_top_plot <- TX_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = TX_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
TX_right_plot <- TX_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = TX_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

TX_final_plot <- TX_top_plot + TX_legend + TX_main_plot + TX_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

TX_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Texas.svg", plot = TX_final_plot, width = 8, height = 12)

Creating the PNG version.

# 1. Set factor levels based on heatmap ordering
TX_indicator_levels <- levels(with(TX_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
TX_dataset_levels <- levels(with(TX_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
TX_agreement_heatmap <- ggplot(TX_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Texas Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
TX_legend <- cowplot::get_legend(TX_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
TX_main_plot <- TX_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
TX_top_plot <- TX_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = TX_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
TX_right_data <- TX_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = TX_dataset_levels))

TX_right_plot <- ggplot(TX_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

TX_final_plot3 <- TX_top_plot + TX_legend + TX_main_plot + TX_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

TX_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_Texas.png", plot = TX_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, 1 now represents an indirect rating, and 2 represents a direct rating.

TX_ab_directindirect <- TX_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))
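
As a quick sanity check on this recoding, the same `case_when` rule applied to a hypothetical vector covering the full original rating scale (0 through 4, not data from the study) collapses the indirect ratings (1, 2) to 1 and the direct ratings (3, 4) to 2:

```r
# Hypothetical check of the recoding rule on all possible rating values.
x <- c(0, 1, 2, 3, 4)
recoded <- dplyr::case_when(
  x == 2 ~ 1,          # indirect ratings collapse to 1
  x %in% c(3, 4) ~ 2,  # direct ratings collapse to 2
  TRUE ~ x             # 0 (not rated) and 1 pass through unchanged
)
recoded
# 0 1 1 2 2
```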

TX_ratings_kappaweighted <- TX_ab_directindirect %>%
  select(rating_a, rating_b)
TX_kappa_weighted <- kappa2(TX_ratings_kappaweighted, weight = "equal")
print(TX_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 13200 
##    Raters = 2 
##     Kappa = 0.741 
## 
##         z = 89 
##   p-value = 0
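
For context, with `weight = "equal"` the `irr::kappa2` function applies linear disagreement weights, so a 0-vs-1 disagreement is penalized half as much as a 0-vs-2 disagreement. A minimal by-hand sketch of the computation, using a small hypothetical 3x3 confusion matrix (illustration only, not data from the study):

```r
# Hypothetical confusion matrix on the recoded 0/1/2 scale (rows: rater A, cols: rater B).
O <- matrix(c(50,  5,  1,
               4, 30,  6,
               0,  5, 20), nrow = 3, byrow = TRUE)
n <- sum(O)
k <- nrow(O)
W <- 1 - abs(outer(1:k, 1:k, "-")) / (k - 1)  # linear agreement weights: 1, 0.5, 0
E <- outer(rowSums(O), colSums(O)) / n        # expected counts under independence
p_o <- sum(W * O) / n                         # observed weighted agreement
p_e <- sum(W * E) / n                         # expected weighted agreement
kappa_w <- (p_o - p_e) / (1 - p_e)
round(kappa_w, 3)
# approximately 0.782
```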
TX_weighted_kappa_by_indicator <- TX_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)

TX_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                 0.815  
##  2 EA_Postsecondary_readiness                                            0.535  
##  3 K-12_Access_to_effective_teaching                                     1      
##  4 K-12_Access_to_high-quality_academic_supports                         0.743  
##  5 K-12_Access_to_rigorous_coursework                                    0.820  
##  6 K-12_Curricular_breadth                                               0.748  
##  7 K-12_Engagement_in_schooling                                          0.679  
##  8 K-12_Nonacademic_supports_for_student_success                         0      
##  9 K-12_Nonexclusionary_discipline_practices                             0.790  
## 10 K-12_Performance_in_coursework                                        0.355  
## 11 K-12_Performance_on_tests                                             0.793  
## 12 K-12_School_climate                                                  -0.00789
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…        0.614  
## 14 Pre-K_Academic_readiness                                              0.798  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…       -0.0107 
## 16 Pre-K_Self-regulation_and_attention_skills                            0
IRRstatistics_Texas <- TX_indicator_kappa_count_merge %>%
  left_join(TX_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(TX_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Texas_ordered <- IRRstatistics_Texas %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Texas_ordered)
|Indicator | Mapped Datasets| Percent Agreement| Cohen’s Kappa| Cohen’s Weighted Kappa|
|:---------|---------------:|-----------------:|-------------:|----------------------:|
|Racial, Ethnic, and Economic Segregation | 230| 87.03030| 0.6144270| 0.6144270|
|High-Quality Pre-K Programs | 28| 96.60606| -0.0074578| -0.0107436|
|Effective Teaching | 20| 100.00000| 1.0000000| 1.0000000|
|Rigorous Coursework | 148| 94.78788| 0.8094207| 0.8199913|
|Curricular Breadth | 10| 99.51515| 0.7477064| 0.7477064|
|High-Quality Academic Supports | 30| 98.54545| 0.7441860| 0.7429907|
|School Climate | 13| 98.42424| -0.0039315| -0.0078940|
|Nonexclusionary Discipline Practices | 28| 98.42424| 0.7342534| 0.7897920|
|Nonacademic Supports for Student Success | 4| 99.51515| 0.0000000| 0.0000000|
|Academic Readiness | 9| 99.63636| 0.7982392| 0.7982392|
|Self-Regulation and Attention Skills | 6| 99.27273| 0.0000000| 0.0000000|
|Engagement in Schooling | 171| 90.18182| 0.6623073| 0.6789389|
|Performance in Coursework | 126| 88.12121| 0.3718974| 0.3547443|
|Performance on Tests | 440| 87.27273| 0.7692080| 0.7928960|
|On-Time Graduation | 217| 88.36364| 0.7046297| 0.8150208|
|Postsecondary Readiness | 254| 81.69697| 0.5142195| 0.5345932|
write.csv(IRRstatistics_Texas_ordered, "20251113_IRRstatistics_Texas.csv", row.names = FALSE)

NORTH CAROLINA

Filtering for only North Carolina datasets.

NC_rater_ab <- rater_ab %>%
  filter(str_starts(id, "NC"))

str(NC_rater_ab)
## tibble [995 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:995] "NC1218" "NC1218" "NC1218" "NC1218" ...
##  $ rater           : chr [1:995] "R562" "R917" "R562" "R917" ...
##  $ validation_round: chr [1:995] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:995] "rcd_161.xlsx" "rcd_161.xlsx" "rcd_161.xlsx" "rcd_161.xlsx" ...
##  $ indicator       : chr [1:995] "EA_Postsecondary_readiness" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" "K-12_Access_to_rigorous_coursework" ...
##  $ rating          : num [1:995] 2 2 4 4 2 4 4 4 4 4 ...
NC_duplicates_check <- NC_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(NC_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to the NASEM framework (e.g., not downloadable, not at the school level).

NC_rater_ab <- NC_rater_ab %>%
  filter(!(id %in% c("NC1318", "NC1319")))

Making sure each id, indicator, and validation round combination is present.

NC_expanded_grid <- NC_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(NC_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name indicator                         validation_round
##   <chr>  <chr>        <chr>                             <chr>           
## 1 NC1218 rcd_161.xlsx EA_On-time_graduation             A               
## 2 NC1218 rcd_161.xlsx EA_On-time_graduation             B               
## 3 NC1218 rcd_161.xlsx EA_Postsecondary_readiness        A               
## 4 NC1218 rcd_161.xlsx EA_Postsecondary_readiness        B               
## 5 NC1218 rcd_161.xlsx K-12_Access_to_effective_teaching A               
## 6 NC1218 rcd_161.xlsx K-12_Access_to_effective_teaching B

Making sure that id/indicator combinations that were not rated are represented as zero.

NC_ratings_with_zeros <- NC_expanded_grid %>%
  left_join(NC_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
NC_ab_long <- NC_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(NC_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name indicator                      rating_a rating_b agreement
##   <chr>  <chr>        <chr>                             <dbl>    <dbl>     <dbl>
## 1 NC1218 rcd_161.xlsx EA_On-time_graduation                 0        0         1
## 2 NC1218 rcd_161.xlsx EA_Postsecondary_readiness            2        2         1
## 3 NC1218 rcd_161.xlsx K-12_Access_to_effective_teac…        0        0         1
## 4 NC1218 rcd_161.xlsx K-12_Access_to_high-quality_a…        0        0         1
## 5 NC1218 rcd_161.xlsx K-12_Access_to_rigorous_cours…        4        4         1
## 6 NC1218 rcd_161.xlsx K-12_Curricular_breadth               0        0         1
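
The expand–join–pivot pattern above can be illustrated on a toy long table (a hypothetical id and indicators, not from the data): each id–indicator pair carries one rating per validation round, which `pivot_wider` spreads into side-by-side rater columns before the agreement flag is computed.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format ratings: one row per id, indicator, and round.
toy_long <- tibble(
  id               = "X1",
  indicator        = c("A", "A", "B", "B"),
  validation_round = c("A", "B", "A", "B"),
  rating           = c(2, 2, 0, 4)
)

toy_wide <- toy_long %>%
  pivot_wider(names_from = validation_round, values_from = rating,
              names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

toy_wide
# indicator A: raters agree (2 vs 2); indicator B: disagreement (0 vs 4)
```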
NC_dataset_agreement <- NC_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

NC_dataset_agreement <- NC_dataset_agreement %>%
  ungroup()

n_distinct(NC_dataset_agreement$id)
## [1] 275
head(NC_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 NC1218             100  
## 2 NC1219              93.8
## 3 NC1220             100  
## 4 NC1221             100  
## 5 NC1222             100  
## 6 NC1223              93.8
write.csv(NC_dataset_agreement, "20251113_percent_agreement_by_id_North_Carolina.csv", row.names = FALSE)
NC_indicators_mapped_per_id <- NC_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(NC_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

NC_dataset_agreement_count <- NC_dataset_agreement %>%
  left_join(NC_indicators_mapped_per_id, by = "id")

head(NC_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 NC1218             100                          2
## 2 NC1219              93.8                        1
## 3 NC1220             100                          1
## 4 NC1221             100                          1
## 5 NC1222             100                          1
## 6 NC1223              93.8                        2
NC_dataset_agreement_count <- NC_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

NC_dataset_agreement_count$id <- factor(NC_dataset_agreement_count$id, levels = unique(NC_dataset_agreement_count$id))

NC_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(NC_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
NC_percentagreement_barplot <- ggplot(NC_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "North Carolina - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

NC_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_North_Carolina.svg", plot = NC_percentagreement_barplot, width = 6.5, height = 6.5)

Printing the results for the chart in the findings section.

NC_percent_agreement <- NC_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

NC_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    87.6
##  2 EA_Postsecondary_readiness                                               79.3
##  3 K-12_Access_to_effective_teaching                                        97.5
##  4 K-12_Access_to_high-quality_academic_supports                            97.8
##  5 K-12_Access_to_rigorous_coursework                                       94.9
##  6 K-12_Curricular_breadth                                                  98.2
##  7 K-12_Engagement_in_schooling                                             76  
##  8 K-12_Nonacademic_supports_for_student_success                            99.6
##  9 K-12_Nonexclusionary_discipline_practices                                98.9
## 10 K-12_Performance_in_coursework                                           88.7
## 11 K-12_Performance_on_tests                                                89.1
## 12 K-12_School_climate                                                      88.4
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              92.4
## 14 Pre-K_Academic_readiness                                                 99.6
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…             100  
## 16 Pre-K_Self-regulation_and_attention_skills                               99.6
NC_ab_long <- NC_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

NC_contingency_table <- data.frame(NC_ab_long$rating_a, NC_ab_long$rating_b)
NC_kappa_result <- kappa2(NC_contingency_table, weight = "unweighted")
print(NC_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 4400 
##    Raters = 2 
##     Kappa = 0.657 
## 
##         z = 66.5 
##   p-value = 0
NC_kappa_by_indicator <- NC_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      NC_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      NC_kappa_result <- kappa2(NC_contingency_table, weight = "unweighted")
      NC_kappa_result$value  
    },
    .groups = "drop"
  )
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 4: `indicator = "K-12_Access_to_high-quality_academic_supports"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(NC_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                 0.759
##  2 EA_Postsecondary_readiness                                            0.644
##  3 K-12_Access_to_effective_teaching                                     0.525
##  4 K-12_Access_to_high-quality_academic_supports                         0    
##  5 K-12_Access_to_rigorous_coursework                                    0.639
##  6 K-12_Curricular_breadth                                               0.607
##  7 K-12_Engagement_in_schooling                                          0.408
##  8 K-12_Nonacademic_supports_for_student_success                         0    
##  9 K-12_Nonexclusionary_discipline_practices                             0.945
## 10 K-12_Performance_in_coursework                                        0.421
## 11 K-12_Performance_on_tests                                             0.708
## 12 K-12_School_climate                                                   0.566
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation      0.295
## 14 Pre-K_Academic_readiness                                              0.499
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs      1    
## 16 Pre-K_Self-regulation_and_attention_skills                            0

Counting the number of datasets mapped to each indicator to create a plot afterwards.

NC_mapped_counts <- NC_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

NC_mapped_counts
## # A tibble: 16 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                     104
##  2 EA_Postsecondary_readiness                                                130
##  3 K-12_Access_to_effective_teaching                                           9
##  4 K-12_Access_to_high-quality_academic_supports                               6
##  5 K-12_Access_to_rigorous_coursework                                         27
##  6 K-12_Curricular_breadth                                                     8
##  7 K-12_Engagement_in_schooling                                               99
##  8 K-12_Nonacademic_supports_for_student_success                               1
##  9 K-12_Nonexclusionary_discipline_practices                                  30
## 10 K-12_Performance_in_coursework                                             45
## 11 K-12_Performance_on_tests                                                  70
## 12 K-12_School_climate                                                        56
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…            26
## 14 Pre-K_Academic_readiness                                                    1
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…             1
## 16 Pre-K_Self-regulation_and_attention_skills                                  1
NC_indicator_kappa_count_merge <- NC_kappa_by_indicator %>%
  left_join(NC_mapped_counts, by = "indicator")

head(NC_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                              0.759           104
## 2 EA_Postsecondary_readiness                         0.644           130
## 3 K-12_Access_to_effective_teaching                  0.525             9
## 4 K-12_Access_to_high-quality_academic_supports      0                 6
## 5 K-12_Access_to_rigorous_coursework                 0.639            27
## 6 K-12_Curricular_breadth                            0.607             8
clean_labels <- sub("^[^_]+_", "", NC_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


NC_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(NC_indicator_kappa_count_merge$kappa_stat)]
)
NC_balloonplot <- ggballoonplot(NC_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "North Carolina Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.657",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

NC_balloonplot

ggsave("20251113_IRR_kappa_balloonplot_North_Carolina.svg", plot = NC_balloonplot, width = 6.5, height = 6.5)
NC_heatmap_long <- NC_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))
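
As an illustration of the weighting scheme (hypothetical rating pairs, not from the data), a compact equivalent treats ratings 1–2 as an indirect band and 3–4 as a direct band, with a zero-vs-direct mismatch as the most serious disagreement:

```r
# Hypothetical rating pairs and the weights the scheme above assigns them.
pairs <- data.frame(rating_a = c(0, 1, 2, 0),
                    rating_b = c(0, 2, 4, 3))
indirect <- function(r) r %in% c(1, 2)
direct   <- function(r) r %in% c(3, 4)
pairs$weight <- with(pairs, dplyr::case_when(
  rating_a == 0 & rating_b == 0 ~ 0,             # Not Rated
  indirect(rating_a) & indirect(rating_b) ~ 1,   # Agreement (both indirect)
  direct(rating_a) & direct(rating_b) ~ 1,       # Agreement (both direct)
  (rating_a == 0 & direct(rating_b)) |
    (rating_b == 0 & direct(rating_a)) ~ 3,      # Serious Disagreement
  TRUE ~ 2                                       # Moderate Disagreement
))
pairs$weight
# 0 1 2 3
```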

clean_labels <- sub("^[^_]+_", "", NC_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

NC_heatmap_long$indicator_cleaned <- wrapped_labels

NC_heatmap_long_selected <- NC_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(NC_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 NC1218 "On-time graduation"                             0
## 2 NC1218 "Postsecondary readiness"                        1
## 3 NC1218 "Access to effective teaching"                   0
## 4 NC1218 "Access to high-quality academic\nsupports"      0
## 5 NC1218 "Access to rigorous coursework"                  1
## 6 NC1218 "Curricular breadth"                             0

Dataset disagreement with marginal plots.

# 1. Set factor levels based on heatmap ordering
NC_indicator_levels <- levels(with(NC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
NC_dataset_levels <- levels(with(NC_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
NC_agreement_heatmap <- ggplot(NC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "North Carolina Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
NC_legend <- cowplot::get_legend(NC_agreement_heatmap)
NC_legend_plot <- wrap_elements(NC_legend)

# 4. Remove legend from main plot
NC_main_plot <- NC_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
NC_top_plot <- NC_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = NC_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
NC_right_plot <- NC_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = NC_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

NC_final_plot <- NC_top_plot + NC_legend + NC_main_plot + NC_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

NC_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_North_Carolina.svg", plot = NC_final_plot, width = 8, height = 12)

Creating the PNG version.

# 1. Set factor levels based on heatmap ordering
NC_indicator_levels <- levels(with(NC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
NC_dataset_levels <- levels(with(NC_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
NC_agreement_heatmap <- ggplot(NC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "North Carolina Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
NC_legend <- cowplot::get_legend(NC_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
NC_main_plot <- NC_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
NC_top_plot <- NC_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = NC_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
NC_right_data <- NC_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = NC_dataset_levels))

NC_right_plot <- ggplot(NC_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

NC_final_plot3 <- NC_top_plot + NC_legend + NC_main_plot + NC_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

NC_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_North_Carolina.png", plot = NC_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, while 1 now represents an indirect rating and 2 represents a direct rating.

NC_ab_directindirect <- NC_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))
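As a quick sanity check of the recode above (toy values only, not study data), the collapse maps the original 0–4 scale onto 0 (not rated), 1 (indirect), and 2 (direct):

```r
library(dplyr)

# Toy ratings covering the full original scale:
# 0 = not rated, 1-2 = indirect, 3-4 = direct
toy <- tibble(rating_a = c(0, 1, 2, 3, 4))

toy_recode <- toy %>%
  mutate(rating_a = case_when(
    rating_a == 2 ~ 1,
    rating_a %in% c(3, 4) ~ 2,
    TRUE ~ rating_a
  ))

toy_recode$rating_a  # 0 1 1 2 2
```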

NC_ratings_kappaweighted <- NC_ab_directindirect %>%
  select(rating_a, rating_b)
NC_kappa_weighted <- kappa2(NC_ratings_kappaweighted, weight = "equal")
print(NC_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 4400 
##    Raters = 2 
##     Kappa = 0.744 
## 
##         z = 54.5 
##   p-value = 0
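For reference, `kappa2()` with `weight = "equal"` computes the linearly weighted kappa. In standard notation (ours, not the report's), with k ordered categories and cells indexed i, j, the disagreement weights and statistic are:

```latex
w_{ij} = \frac{|i - j|}{k - 1}, \qquad
\kappa_w = 1 - \frac{\sum_{i,j} w_{ij}\, p_{o,ij}}{\sum_{i,j} w_{ij}\, p_{e,ij}}
```

where the p_{o,ij} are the observed cell proportions and the p_{e,ij} are the proportions expected under rater independence (products of the marginals).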
NC_weighted_kappa_by_indicator <- NC_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a,
##   rating_b)), weight = "equal")$value`.
## ℹ In group 4: `indicator = "K-12_Access_to_high-quality_academic_supports"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
NC_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                   0.802
##  2 EA_Postsecondary_readiness                                              0.768
##  3 K-12_Access_to_effective_teaching                                       0.764
##  4 K-12_Access_to_high-quality_academic_supports                           0    
##  5 K-12_Access_to_rigorous_coursework                                      0.661
##  6 K-12_Curricular_breadth                                                 0.659
##  7 K-12_Engagement_in_schooling                                            0.410
##  8 K-12_Nonacademic_supports_for_student_success                           0    
##  9 K-12_Nonexclusionary_discipline_practices                               0.991
## 10 K-12_Performance_in_coursework                                          0.423
## 11 K-12_Performance_on_tests                                               0.828
## 12 K-12_School_climate                                                     0.723
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…          0.315
## 14 Pre-K_Academic_readiness                                                1    
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…          1    
## 16 Pre-K_Self-regulation_and_attention_skills                              0
IRRstatistics_North_Carolina <- NC_indicator_kappa_count_merge %>%
  left_join(NC_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(NC_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_North_Carolina_ordered <- IRRstatistics_North_Carolina %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_North_Carolina_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen's Kappa | Cohen's Weighted Kappa |
|:---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 26 | 92.36364 | 0.2951300 | 0.3152298 |
| High-Quality Pre-K Programs | 1 | 100.00000 | 1.0000000 | 1.0000000 |
| Effective Teaching | 9 | 97.45455 | 0.5248087 | 0.7636780 |
| Rigorous Coursework | 27 | 94.90909 | 0.6388368 | 0.6605776 |
| Curricular Breadth | 8 | 98.18182 | 0.6074793 | 0.6592317 |
| High-Quality Academic Supports | 6 | 97.81818 | 0.0000000 | 0.0000000 |
| School Climate | 56 | 88.36364 | 0.5664811 | 0.7228346 |
| Nonexclusionary Discipline Practices | 30 | 98.90909 | 0.9454473 | 0.9905774 |
| Nonacademic Supports for Student Success | 1 | 99.63636 | 0.0000000 | 0.0000000 |
| Academic Readiness | 1 | 99.63636 | 0.4990893 | 1.0000000 |
| Self-Regulation and Attention Skills | 1 | 99.63636 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 99 | 76.00000 | 0.4084287 | 0.4103579 |
| Performance in Coursework | 45 | 88.72727 | 0.4212492 | 0.4225115 |
| Performance on Tests | 70 | 89.09091 | 0.7077474 | 0.8277325 |
| On-Time Graduation | 104 | 87.63636 | 0.7585601 | 0.8017844 |
| Postsecondary Readiness | 130 | 79.27273 | 0.6435313 | 0.7679129 |
write.csv(IRRstatistics_North_Carolina_ordered, "20251113_IRRstatistics_North_Carolina.csv", row.names = FALSE)

WASHINGTON, D.C.

Filtering for only Washington D.C. datasets.

DC_rater_ab <- rater_ab %>%
  filter(str_starts(id, "DC"))

str(DC_rater_ab)
## tibble [929 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:929] "DC1218" "DC1218" "DC1218" "DC1218" ...
##  $ rater           : chr [1:929] "R562" "R217" "R562" "R217" ...
##  $ validation_round: chr [1:929] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:929] "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" ...
##  $ indicator       : chr [1:929] "K-12_Engagement_in_schooling" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:929] 2 2 4 4 2 2 4 4 2 2 ...
DC_duplicates_check <- DC_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(DC_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).

DC_rater_ab <- DC_rater_ab %>%
  filter(!(id %in% c(
    "DC1268", "DC1270", "DC1360", "DC1446", "DC1447", "DC1452", "DC1462",
    "DC1463", "DC1472", "DC1485", "DC1494", "DC1496", "DC1516", "DC1523",
    "DC1525", "DC1535", "DC1536", "DC1544", "DC1546", "DC1553"
  )))

str(DC_rater_ab)
## tibble [889 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:889] "DC1218" "DC1218" "DC1218" "DC1218" ...
##  $ rater           : chr [1:889] "R562" "R217" "R562" "R217" ...
##  $ validation_round: chr [1:889] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:889] "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" "2023-24 School Level DCCAPE and MSAA Data 8.20.2024.xlsx" ...
##  $ indicator       : chr [1:889] "K-12_Engagement_in_schooling" "K-12_Engagement_in_schooling" "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:889] 2 2 4 4 2 2 4 4 2 2 ...

Making sure each id, indicator, and validation round combination is present.

DC_expanded_grid <- DC_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(DC_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name                                 indicator validation_round
##   <chr>  <chr>                                        <chr>     <chr>           
## 1 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_On-ti… A               
## 2 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_On-ti… B               
## 3 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_Posts… A               
## 4 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… EA_Posts… B               
## 5 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… K-12_Acc… A               
## 6 DC1218 2023-24 School Level DCCAPE and MSAA Data 8… K-12_Acc… B

Making sure that id/indicator combinations that were not rated are coded as zero.

DC_ratings_with_zeros <- DC_expanded_grid %>%
  left_join(DC_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
DC_ab_long <- DC_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(DC_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name                      indicator rating_a rating_b agreement
##   <chr>  <chr>                             <chr>        <dbl>    <dbl>     <dbl>
## 1 DC1218 2023-24 School Level DCCAPE and … EA_On-ti…        0        0         1
## 2 DC1218 2023-24 School Level DCCAPE and … EA_Posts…        0        0         1
## 3 DC1218 2023-24 School Level DCCAPE and … K-12_Acc…        0        0         1
## 4 DC1218 2023-24 School Level DCCAPE and … K-12_Acc…        0        0         1
## 5 DC1218 2023-24 School Level DCCAPE and … K-12_Acc…        0        0         1
## 6 DC1218 2023-24 School Level DCCAPE and … K-12_Cur…        0        0         1
DC_dataset_agreement <- DC_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

DC_dataset_agreement <- DC_dataset_agreement %>%
  ungroup()

n_distinct(DC_dataset_agreement$id)
## [1] 314
head(DC_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 DC1218               100
## 2 DC1219               100
## 3 DC1220               100
## 4 DC1221               100
## 5 DC1222               100
## 6 DC1223               100
write.csv(DC_dataset_agreement, "20251113_percent_agreement_by_id_Washington_DC.csv", row.names = FALSE)
DC_indicators_mapped_per_id <- DC_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(DC_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

DC_dataset_agreement_count <- DC_dataset_agreement %>%
  left_join(DC_indicators_mapped_per_id, by = "id")

head(DC_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 DC1218               100                        2
## 2 DC1219               100                        2
## 3 DC1220               100                        2
## 4 DC1221               100                        2
## 5 DC1222               100                        1
## 6 DC1223               100                        1
DC_dataset_agreement_count <- DC_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

DC_dataset_agreement_count$id <- factor(DC_dataset_agreement_count$id, levels = unique(DC_dataset_agreement_count$id))

DC_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(DC_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
DC_percentagreement_barplot <- ggplot(DC_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Washington D.C. - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

DC_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Washington_DC.svg", plot = DC_percentagreement_barplot, width = 6.5, height = 6.5)

Printing the results for the chart in the findings section.

DC_percent_agreement <- DC_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

DC_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    90.1
##  2 EA_Postsecondary_readiness                                               86.0
##  3 K-12_Access_to_effective_teaching                                        97.5
##  4 K-12_Access_to_high-quality_academic_supports                            88.2
##  5 K-12_Access_to_rigorous_coursework                                       93.9
##  6 K-12_Curricular_breadth                                                  96.8
##  7 K-12_Engagement_in_schooling                                             86.3
##  8 K-12_Nonacademic_supports_for_student_success                            93.6
##  9 K-12_Nonexclusionary_discipline_practices                                98.4
## 10 K-12_Performance_in_coursework                                           93.6
## 11 K-12_Performance_on_tests                                                93.0
## 12 K-12_School_climate                                                      94.6
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              88.5
## 14 Pre-K_Academic_readiness                                                100  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…              90.4
## 16 Pre-K_Self-regulation_and_attention_skills                               99.7
DC_ab_long <- DC_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

DC_contingency_table <- data.frame(DC_ab_long$rating_a, DC_ab_long$rating_b)
DC_kappa_result <- kappa2(DC_contingency_table, weight = "unweighted")
print(DC_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 5024 
##    Raters = 2 
##     Kappa = 0.547 
## 
##         z = 56.7 
##   p-value = 0
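For orientation, `kappa2()` from the irr package takes a two-column data frame of paired ratings. A minimal toy example (not study data) with perfect agreement returns a kappa of 1:

```r
library(irr)
library(dplyr)

# Toy example (not study data): two raters who agree on every item
toy_ratings <- tibble(
  rater1 = c(1, 1, 2, 2, 3),
  rater2 = c(1, 1, 2, 2, 3)
)

kappa2(toy_ratings, weight = "unweighted")$value  # 1
```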
DC_kappa_by_indicator <- DC_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      DC_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      DC_kappa_result <- kappa2(DC_contingency_table, weight = "unweighted")
      DC_kappa_result$value  
    },
    .groups = "drop"
  )
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `kappa_stat = { ... }`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
print(DC_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                               0.664  
##  2 EA_Postsecondary_readiness                                          0.582  
##  3 K-12_Access_to_effective_teaching                                   0.459  
##  4 K-12_Access_to_high-quality_academic_supports                       0.245  
##  5 K-12_Access_to_rigorous_coursework                                  0.220  
##  6 K-12_Curricular_breadth                                            -0.00512
##  7 K-12_Engagement_in_schooling                                        0.490  
##  8 K-12_Nonacademic_supports_for_student_success                       0.466  
##  9 K-12_Nonexclusionary_discipline_practices                           0.731  
## 10 K-12_Performance_in_coursework                                      0.181  
## 11 K-12_Performance_on_tests                                           0.780  
## 12 K-12_School_climate                                                 0.435  
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation    0.555  
## 14 Pre-K_Academic_readiness                                          NaN      
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs    0.444  
## 16 Pre-K_Self-regulation_and_attention_skills                          0

Counting the number of datasets mapped to each indicator, for use in the balloon plot below.

DC_mapped_counts <- DC_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

DC_mapped_counts
## # A tibble: 15 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                      61
##  2 EA_Postsecondary_readiness                                                 79
##  3 K-12_Access_to_effective_teaching                                           9
##  4 K-12_Access_to_high-quality_academic_supports                              38
##  5 K-12_Access_to_rigorous_coursework                                         21
##  6 K-12_Curricular_breadth                                                    10
##  7 K-12_Engagement_in_schooling                                               62
##  8 K-12_Nonacademic_supports_for_student_success                              28
##  9 K-12_Nonexclusionary_discipline_practices                                  12
## 10 K-12_Performance_in_coursework                                             21
## 11 K-12_Performance_on_tests                                                  60
## 12 K-12_School_climate                                                        23
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…            61
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…            37
## 15 Pre-K_Self-regulation_and_attention_skills                                  1
DC_indicator_kappa_count_merge <- DC_kappa_by_indicator %>%
  left_join(DC_mapped_counts, by = "indicator")

head(DC_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                            0.664              61
## 2 EA_Postsecondary_readiness                       0.582              79
## 3 K-12_Access_to_effective_teaching                0.459               9
## 4 K-12_Access_to_high-quality_academic_supports    0.245              38
## 5 K-12_Access_to_rigorous_coursework               0.220              21
## 6 K-12_Curricular_breadth                         -0.00512            10
clean_labels <- sub("^[^_]+_", "", DC_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)
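To see what these cleaning steps do, here is one indicator name from the study's naming scheme walked through the same pipeline (a single illustrative value; the prefix-stripping `sub()` is a no-op here because the first step already removed the prefix):

```r
library(stringr)

# One example indicator name, cleaned step by step
x <- "K-12_Access_to_rigorous_coursework"
x <- sub("^[^_]+_", "", x)      # drop everything up to the first underscore ("K-12_")
x <- gsub("_", " ", x)          # underscores to spaces
x <- str_to_sentence(x)         # sentence case
label <- str_wrap(x, width = 35)

label  # "Access to rigorous coursework"
```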


DC_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(DC_indicator_kappa_count_merge$kappa_stat)]
)
DC_balloonplot <- ggballoonplot(DC_indicator_kappa_count_merge,
                                x = "indicator_cleaned",
                                y = "kappa_stat",
                                size = "mapped_counts",
                                fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Washington D.C. Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.547",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

DC_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("20251113_IRR_kappa_balloonplot_Washington_DC.svg", plot = DC_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
DC_heatmap_long <- DC_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))
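As an illustrative check of the weighting scheme just defined (toy rating pairs, not study data), each category can be exercised once:

```r
library(dplyr)

# One toy rating pair per agreement category in the scheme above
toy_pairs <- tibble(
  rating_a = c(0, 1, 3, 0, 0, 2),
  rating_b = c(0, 2, 4, 1, 4, 3)
)

toy_weights <- toy_pairs %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,                 # Not Rated
    (rating_a %in% 1:2 & rating_b %in% 3:4) |
      (rating_b %in% 1:2 & rating_a %in% 3:4) ~ 2,     # indirect vs. direct
    (rating_a == 0 & rating_b %in% 1:2) |
      (rating_b == 0 & rating_a %in% 1:2) ~ 2,         # unrated vs. indirect
    (rating_a == 0 & rating_b %in% 3:4) |
      (rating_b == 0 & rating_a %in% 3:4) ~ 3,         # unrated vs. direct
    rating_a %in% 1:2 & rating_b %in% 1:2 ~ 1,         # both indirect
    rating_a %in% 3:4 & rating_b %in% 3:4 ~ 1,         # both direct
    TRUE ~ NA_real_
  ))

toy_weights$weight  # 0 1 1 2 3 2
```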

clean_labels <- sub("^[^_]+_", "", DC_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

DC_heatmap_long$indicator_cleaned <- wrapped_labels

DC_heatmap_long_selected <- DC_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(DC_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 DC1218 "On-time graduation"                             0
## 2 DC1218 "Postsecondary readiness"                        0
## 3 DC1218 "Access to effective teaching"                   0
## 4 DC1218 "Access to high-quality academic\nsupports"      0
## 5 DC1218 "Access to rigorous coursework"                  0
## 6 DC1218 "Curricular breadth"                             0

Creating the dataset disagreement heatmap with marginal plots.

# 1. Set factor levels based on heatmap ordering
DC_indicator_levels <- levels(with(DC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
DC_dataset_levels <- levels(with(DC_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
DC_agreement_heatmap <- ggplot(DC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Washington D.C. Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
DC_legend <- cowplot::get_legend(DC_agreement_heatmap)
DC_legend_plot <- wrap_elements(DC_legend)

# 4. Remove legend from main plot
DC_main_plot <- DC_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
DC_top_plot <- DC_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = DC_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
DC_right_plot <- DC_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = DC_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

DC_final_plot <- DC_top_plot + DC_legend + DC_main_plot + DC_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

DC_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Washington_DC.svg", plot = DC_final_plot, width = 8, height = 12)

Creating the PNG version of the figure.

# 1. Set factor levels based on heatmap ordering
DC_indicator_levels <- levels(with(DC_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
DC_dataset_levels <- levels(with(DC_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
DC_agreement_heatmap <- ggplot(DC_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Washington D.C. Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
DC_legend <- cowplot::get_legend(DC_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
DC_main_plot <- DC_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
DC_top_plot <- DC_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = DC_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
DC_right_data <- DC_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = DC_dataset_levels))

DC_right_plot <- ggplot(DC_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

DC_final_plot3 <- DC_top_plot + DC_legend + DC_main_plot + DC_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

DC_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_Washington_DC.png", plot = DC_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents Not Rated, 1 represents an indirect rating, and 2 represents a direct rating.

DC_ab_directindirect <- DC_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))

DC_ratings_kappaweighted <- DC_ab_directindirect %>%
  select(rating_a, rating_b)
DC_kappa_weighted <- kappa2(DC_ratings_kappaweighted, weight = "equal")
print(DC_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 5024 
##    Raters = 2 
##     Kappa = 0.638 
## 
##         z = 48.9 
##   p-value = 0
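To make the "equal" (linear) weighting concrete, here is a small by-hand sketch on toy ratings (hypothetical values, not study data) using the recoded 0/1/2 scale. With linear weights, a Not Rated vs. direct disagreement (0 vs. 2) counts twice as much as an adjacent disagreement (0 vs. 1 or 1 vs. 2), and the result should match `irr::kappa2(data.frame(a, b), weight = "equal")`:

```r
# Toy example (hypothetical ratings on the recoded scale:
# 0 = not rated, 1 = indirect, 2 = direct)
a <- c(0, 0, 1, 1, 2, 2, 0, 1, 2, 2)
b <- c(0, 1, 1, 2, 2, 2, 0, 1, 1, 0)

k <- 3                                          # number of categories
w <- 1 - abs(outer(0:2, 0:2, "-")) / (k - 1)    # agreement weights: 1, 0.5, 0

obs <- table(factor(a, levels = 0:2), factor(b, levels = 0:2)) / length(a)
expected <- outer(prop.table(table(factor(a, levels = 0:2))),
                  prop.table(table(factor(b, levels = 0:2))))

po_w <- sum(w * obs)                       # weighted observed agreement
pe_w <- sum(w * expected)                  # weighted chance agreement
kappa_w <- (po_w - pe_w) / (1 - pe_w)      # linear-weighted Cohen's kappa
kappa_w
```

Because the one 0-vs-2 disagreement gets zero credit while the three adjacent disagreements get half credit, the weighted kappa here exceeds the unweighted one, which is the same pattern seen in these state-level results.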
DC_weighted_kappa_by_indicator <- DC_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
## Warning: There was 1 warning in `summarise()`.
## ℹ In argument: `weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a,
##   rating_b)), weight = "equal")$value`.
## ℹ In group 16: `indicator = "Pre-K_Self-regulation_and_attention_skills"`.
## Caused by warning in `sqrt()`:
## ! NaNs produced
DC_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                 0.787  
##  2 EA_Postsecondary_readiness                                            0.708  
##  3 K-12_Access_to_effective_teaching                                     0.795  
##  4 K-12_Access_to_high-quality_academic_supports                         0.475  
##  5 K-12_Access_to_rigorous_coursework                                    0.239  
##  6 K-12_Curricular_breadth                                              -0.00569
##  7 K-12_Engagement_in_schooling                                          0.552  
##  8 K-12_Nonacademic_supports_for_student_success                         0.574  
##  9 K-12_Nonexclusionary_discipline_practices                             0.729  
## 10 K-12_Performance_in_coursework                                        0.202  
## 11 K-12_Performance_on_tests                                             0.852  
## 12 K-12_School_climate                                                   0.444  
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…        0.602  
## 14 Pre-K_Academic_readiness                                             NA      
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…        0.475  
## 16 Pre-K_Self-regulation_and_attention_skills                            0
IRRstatistics_Washington_DC <- DC_indicator_kappa_count_merge %>%
  left_join(DC_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(DC_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Washington_DC_ordered <- IRRstatistics_Washington_DC %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Washington_DC_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|:---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 61 | 88.53503 | 0.5546450 | 0.6016207 |
| High-Quality Pre-K Programs | 37 | 90.44586 | 0.4438541 | 0.4750064 |
| Effective Teaching | 9 | 97.45223 | 0.4589705 | 0.7953064 |
| Rigorous Coursework | 21 | 93.94904 | 0.2197227 | 0.2393844 |
| Curricular Breadth | 10 | 96.81529 | -0.0051216 | -0.0056940 |
| High-Quality Academic Supports | 38 | 88.21656 | 0.2451923 | 0.4748088 |
| School Climate | 23 | 94.58599 | 0.4351323 | 0.4436913 |
| Nonexclusionary Discipline Practices | 12 | 98.40764 | 0.7308879 | 0.7288428 |
| Nonacademic Supports for Student Success | 28 | 93.63057 | 0.4655774 | 0.5742266 |
| Academic Readiness | NA | 100.00000 | NA | NA |
| Self-Regulation and Attention Skills | 1 | 99.68153 | 0.0000000 | 0.0000000 |
| Engagement in Schooling | 62 | 86.30573 | 0.4896625 | 0.5515271 |
| Performance in Coursework | 21 | 93.63057 | 0.1811188 | 0.2022358 |
| Performance on Tests | 60 | 92.99363 | 0.7795507 | 0.8524543 |
| On-Time Graduation | 61 | 90.12739 | 0.6636489 | 0.7872374 |
| Postsecondary Readiness | 79 | 85.98726 | 0.5821437 | 0.7083194 |
write.csv(IRRstatistics_Washington_DC_ordered, "20251113_IRRstatistics_Washington_DC.csv", row.names = FALSE)

KENTUCKY

Filtering for only Kentucky datasets.

KY_rater_ab <- rater_ab %>%
  filter(str_starts(id, "KY"))

str(KY_rater_ab)
## tibble [1,784 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1784] "KY1218" "KY1218" "KY1219" "KY1219" ...
##  $ rater           : chr [1:1784] "R562" "R625" "R562" "R625" ...
##  $ validation_round: chr [1:1784] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:1784] NA NA "english_learners_attainment_2021.csv" "english_learners_attainment_2021.csv" ...
##  $ indicator       : chr [1:1784] "none" NA "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:1784] 0 0 1 1 4 4 4 4 4 4 ...
KY_duplicates_check <- KY_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(KY_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).

KY_rater_ab <- KY_rater_ab %>%
  filter(!(id %in% c("KY1772", "KY1773")
  ))

str(KY_rater_ab)
## tibble [1,780 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1780] "KY1218" "KY1218" "KY1219" "KY1219" ...
##  $ rater           : chr [1:1780] "R562" "R625" "R562" "R625" ...
##  $ validation_round: chr [1:1780] "A" "B" "A" "B" ...
##  $ dataset_name    : chr [1:1780] NA NA "english_learners_attainment_2021.csv" "english_learners_attainment_2021.csv" ...
##  $ indicator       : chr [1:1780] "none" NA "K-12_Performance_on_tests" "K-12_Performance_on_tests" ...
##  $ rating          : num [1:1780] 0 0 1 1 4 4 4 4 4 4 ...

Making sure each id, indicator, and validation round combination is present.

KY_expanded_grid <- KY_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(KY_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name indicator                         validation_round
##   <chr>  <chr>        <chr>                             <chr>           
## 1 KY1218 <NA>         EA_On-time_graduation             A               
## 2 KY1218 <NA>         EA_On-time_graduation             B               
## 3 KY1218 <NA>         EA_Postsecondary_readiness        A               
## 4 KY1218 <NA>         EA_Postsecondary_readiness        B               
## 5 KY1218 <NA>         K-12_Access_to_effective_teaching A               
## 6 KY1218 <NA>         K-12_Access_to_effective_teaching B

Making sure that id/indicator combinations that were not rated are represented as zero.

KY_ratings_with_zeros <- KY_expanded_grid %>%
  left_join(KY_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
KY_ab_long <- KY_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(KY_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name indicator                      rating_a rating_b agreement
##   <chr>  <chr>        <chr>                             <dbl>    <dbl>     <dbl>
## 1 KY1218 <NA>         EA_On-time_graduation                 0        0         1
## 2 KY1218 <NA>         EA_Postsecondary_readiness            0        0         1
## 3 KY1218 <NA>         K-12_Access_to_effective_teac…        0        0         1
## 4 KY1218 <NA>         K-12_Access_to_high-quality_a…        0        0         1
## 5 KY1218 <NA>         K-12_Access_to_rigorous_cours…        0        0         1
## 6 KY1218 <NA>         K-12_Curricular_breadth               0        0         1
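The reshaping step above can be sketched on a toy long table (hypothetical id, indicator names, and ratings): `pivot_wider()` turns the A/B validation rounds into side-by-side `rating_a`/`rating_b` columns, after which agreement is a simple equality check:

```r
library(dplyr)
library(tidyr)

# Toy long table: one row per id-indicator-round (hypothetical values)
toy_long <- tibble::tribble(
  ~id,     ~indicator, ~validation_round, ~rating,
  "XX001", "ind_1",    "A",               1,
  "XX001", "ind_1",    "B",               1,
  "XX001", "ind_2",    "A",               0,
  "XX001", "ind_2",    "B",               3
)

# Spread the two rounds into columns, then flag exact agreement
toy_wide <- toy_long %>%
  pivot_wider(names_from = validation_round, values_from = rating,
              names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

toy_wide
# ind_1: rating_a = 1, rating_b = 1, agreement = 1
# ind_2: rating_a = 0, rating_b = 3, agreement = 0
```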
KY_dataset_agreement <- KY_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

KY_dataset_agreement <- KY_dataset_agreement %>%
  ungroup()

n_distinct(KY_dataset_agreement$id)
## [1] 576
head(KY_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 KY1218             100  
## 2 KY1219             100  
## 3 KY1220              81.2
## 4 KY1221             100  
## 5 KY1222              87.5
## 6 KY1223              87.5
write.csv(KY_dataset_agreement, "20251113_percent_agreement_by_id_Kentucky.csv", row.names = FALSE)
KY_indicators_mapped_per_id <- KY_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(KY_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

KY_dataset_agreement_count <- KY_dataset_agreement %>%
  left_join(KY_indicators_mapped_per_id, by = "id")

head(KY_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 KY1218             100                          0
## 2 KY1219             100                          1
## 3 KY1220              81.2                        4
## 4 KY1221             100                          1
## 5 KY1222              87.5                        2
## 6 KY1223              87.5                        2
KY_dataset_agreement_count <- KY_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

KY_dataset_agreement_count$id <- factor(KY_dataset_agreement_count$id, levels = unique(KY_dataset_agreement_count$id))

KY_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(KY_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
KY_percentagreement_barplot <- ggplot(KY_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Kentucky - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

KY_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Kentucky.svg", plot = KY_percentagreement_barplot, width = 6.5, height = 6.5)

Printing results for the chart in the findings section.

KY_percent_agreement <- KY_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

KY_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    92.0
##  2 EA_Postsecondary_readiness                                               88.2
##  3 K-12_Access_to_effective_teaching                                        95.1
##  4 K-12_Access_to_high-quality_academic_supports                            97.6
##  5 K-12_Access_to_rigorous_coursework                                       95.7
##  6 K-12_Curricular_breadth                                                  93.1
##  7 K-12_Engagement_in_schooling                                             93.9
##  8 K-12_Nonacademic_supports_for_student_success                            97.2
##  9 K-12_Nonexclusionary_discipline_practices                                96.5
## 10 K-12_Performance_in_coursework                                           91.1
## 11 K-12_Performance_on_tests                                                91.8
## 12 K-12_School_climate                                                      92.9
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              89.9
## 14 Pre-K_Academic_readiness                                                 99.0
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…              96.4
## 16 Pre-K_Self-regulation_and_attention_skills                              100
KY_ab_long <- KY_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

KY_contingency_table <- data.frame(KY_ab_long$rating_a, KY_ab_long$rating_b)
KY_kappa_result <- kappa2(KY_contingency_table, weight = "unweighted")
print(KY_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 9216 
##    Raters = 2 
##     Kappa = 0.682 
## 
##         z = 98 
##   p-value = 0
KY_kappa_by_indicator <- KY_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      KY_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      KY_kappa_result <- kappa2(KY_contingency_table, weight = "unweighted")
      KY_kappa_result$value  
    },
    .groups = "drop"
  )

print(KY_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                 0.700
##  2 EA_Postsecondary_readiness                                            0.693
##  3 K-12_Access_to_effective_teaching                                     0.744
##  4 K-12_Access_to_high-quality_academic_supports                         0.688
##  5 K-12_Access_to_rigorous_coursework                                    0.636
##  6 K-12_Curricular_breadth                                               0.498
##  7 K-12_Engagement_in_schooling                                          0.679
##  8 K-12_Nonacademic_supports_for_student_success                         0.489
##  9 K-12_Nonexclusionary_discipline_practices                             0.363
## 10 K-12_Performance_in_coursework                                        0.545
## 11 K-12_Performance_on_tests                                             0.789
## 12 K-12_School_climate                                                   0.700
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation      0.620
## 14 Pre-K_Academic_readiness                                              0.745
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs      0.521
## 16 Pre-K_Self-regulation_and_attention_skills                            1
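As a sanity check on the statistic itself (toy binary ratings, not study data), unweighted Cohen's kappa is observed agreement corrected for chance, kappa = (p_o - p_e) / (1 - p_e), and the by-hand value should match `irr::kappa2()`:

```r
# Toy example: two raters, hypothetical binary ratings
a <- c(1, 1, 1, 0, 0, 0, 1, 0)
b <- c(1, 1, 0, 0, 0, 1, 1, 0)

p_o <- mean(a == b)                                       # observed agreement: 6/8 = 0.75
p_e <- sum(prop.table(table(a)) * prop.table(table(b)))   # chance agreement: 0.5
kappa_by_hand <- (p_o - p_e) / (1 - p_e)                  # (0.75 - 0.5) / 0.5 = 0.5
kappa_by_hand
# matches irr::kappa2(data.frame(a, b), weight = "unweighted")$value
```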

Counting the number of datasets mapped to each indicator, to be used in the balloon plot that follows.

KY_mapped_counts <- KY_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

KY_mapped_counts
## # A tibble: 16 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                      97
##  2 EA_Postsecondary_readiness                                                159
##  3 K-12_Access_to_effective_teaching                                          67
##  4 K-12_Access_to_high-quality_academic_supports                              30
##  5 K-12_Access_to_rigorous_coursework                                         47
##  6 K-12_Curricular_breadth                                                    56
##  7 K-12_Engagement_in_schooling                                               72
##  8 K-12_Nonacademic_supports_for_student_success                              22
##  9 K-12_Nonexclusionary_discipline_practices                                  26
## 10 K-12_Performance_in_coursework                                             82
## 11 K-12_Performance_on_tests                                                 144
## 12 K-12_School_climate                                                        92
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…           106
## 14 Pre-K_Academic_readiness                                                   15
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…            33
## 16 Pre-K_Self-regulation_and_attention_skills                                 11
KY_indicator_kappa_count_merge <- KY_kappa_by_indicator %>%
  left_join(KY_mapped_counts, by = "indicator")

head(KY_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                              0.700            97
## 2 EA_Postsecondary_readiness                         0.693           159
## 3 K-12_Access_to_effective_teaching                  0.744            67
## 4 K-12_Access_to_high-quality_academic_supports      0.688            30
## 5 K-12_Access_to_rigorous_coursework                 0.636            47
## 6 K-12_Curricular_breadth                            0.498            56
clean_labels <- sub("^[^_]+_", "", KY_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


KY_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(KY_indicator_kappa_count_merge$kappa_stat)]
)
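The label cleaning can be traced step by step on a single indicator name from the framework:

```r
# Step-by-step cleaning of one indicator label
x <- "K-12_Access_to_rigorous_coursework"
x <- sub("^[^_]+_", "", x)      # drop the level prefix -> "Access_to_rigorous_coursework"
x <- gsub("_", " ", x)          # underscores to spaces -> "Access to rigorous coursework"
x <- sub("^(K-12|Ea|Pre-k)\\s+", "", x, ignore.case = TRUE)  # no-op here; guards leftover prefixes
x <- stringr::str_to_sentence(x)
x
# "Access to rigorous coursework"
```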
KY_balloonplot <- ggballoonplot(KY_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Kentucky Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.682",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

KY_balloonplot

ggsave("20251113_IRR_kappa_balloonplot_Kentucky.svg", plot = KY_balloonplot, width = 6.5, height = 6.5)
KY_heatmap_long <- KY_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))

clean_labels <- sub("^[^_]+_", "", KY_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

KY_heatmap_long$indicator_cleaned <- wrapped_labels

KY_heatmap_long_selected <- KY_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(KY_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 KY1218 "On-time graduation"                             0
## 2 KY1218 "Postsecondary readiness"                        0
## 3 KY1218 "Access to effective teaching"                   0
## 4 KY1218 "Access to high-quality academic\nsupports"      0
## 5 KY1218 "Access to rigorous coursework"                  0
## 6 KY1218 "Curricular breadth"                             0
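The weight coding can be restated compactly with `%in%` and checked on toy rating pairs (hypothetical values) that exercise each branch, where ratings 1-2 are indirect and 3-4 are direct:

```r
library(dplyr)

# Toy pairs covering each disagreement type (hypothetical ratings)
toy <- tibble::tibble(
  rating_a = c(0, 1, 4, 0, 0),
  rating_b = c(0, 4, 3, 2, 4)
)

toy_weights <- toy %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,                 # both unrated: Not Rated
    (rating_a %in% 1:2 & rating_b %in% 3:4) |
      (rating_b %in% 1:2 & rating_a %in% 3:4) ~ 2,     # indirect vs direct: Moderate
    (rating_a == 0 & rating_b %in% 1:2) |
      (rating_b == 0 & rating_a %in% 1:2) ~ 2,         # unrated vs indirect: Moderate
    (rating_a == 0 & rating_b %in% 3:4) |
      (rating_b == 0 & rating_a %in% 3:4) ~ 3,         # unrated vs direct: Serious
    rating_a %in% 1:2 & rating_b %in% 1:2 ~ 1,         # both indirect: Agreement
    rating_a %in% 3:4 & rating_b %in% 3:4 ~ 1,         # both direct: Agreement
    TRUE ~ NA_real_
  ))

toy_weights$weight
# 0, 2, 1, 2, 3
```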

Plotting dataset disagreement as a heatmap with marginal plots.

# 1. Set factor levels based on heatmap ordering
KY_indicator_levels <- levels(with(KY_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
KY_dataset_levels <- levels(with(KY_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
KY_agreement_heatmap <- ggplot(KY_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Kentucky Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
KY_legend <- cowplot::get_legend(KY_agreement_heatmap)
KY_legend_plot <- wrap_elements(KY_legend)

# 4. Remove legend from main plot
KY_main_plot <- KY_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
KY_top_plot <- KY_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = KY_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
KY_right_plot <- KY_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = KY_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

KY_final_plot <- KY_top_plot + KY_legend + KY_main_plot + KY_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

KY_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Kentucky.svg", plot = KY_final_plot, width = 8, height = 12)

Creating the PNG.

    # 1. Set factor levels based on heatmap ordering
    KY_indicator_levels <- levels(with(KY_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
    KY_dataset_levels <- levels(with(KY_heatmap_long_selected, reorder(id, -weight)))
    
    # 2. Create the main heatmap
    KY_agreement_heatmap <- ggplot(KY_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                                 y = reorder(id, -weight),
                                                                 fill = factor(weight))) +
      geom_tile() +
      scale_fill_manual(
        values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
        labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
      ) +
      labs(
        title = "Kentucky Disagreement Heatmap",
        x = "Indicator",
        y = "Dataset Name",
        fill = "Agreement Type"
      ) +
      theme_minimal() +
      theme(
        axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
        axis.text.y = element_text(size = 30),
        axis.title.y = element_text(size = 20),
        axis.title.x = element_text(size = 20),
        legend.text = element_text(size = 20),
        legend.title = element_text(size = 20),
        plot.title = element_text(size = 25, hjust = 0.5)  
      )
    
    # 3. Extract the legend
    KY_legend <- cowplot::get_legend(KY_agreement_heatmap)
    
    # 4. Remove legend from main plot and adjust bar width
    KY_main_plot <- KY_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none", plot.title = element_text(size = 25),
                                                 axis.title.y = element_text(size = 20),
                                                 axis.title.x = element_text(size = 20), hjust = 0.5)
    
    # 5. Create top marginal plot 
    KY_top_plot <- KY_heatmap_long_selected %>%
      filter(weight != 0) %>%
      mutate(indicator_cleaned = factor(indicator_cleaned, levels = KY_indicator_levels)) %>%
      ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
      geom_bar(position = "stack", width = 1) +  
      scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
      theme_minimal() +
      labs(y = "Count") +
      theme(
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_text(size = 20),
        legend.position = "none")
    
    # 6. Create right marginal plot
    KY_right_data <- KY_heatmap_long_selected %>%
      filter(weight %in% c(1, 2, 3)) %>%
      count(id, weight) %>%
      mutate(id = factor(id, levels = KY_dataset_levels))
    
    KY_right_plot <- ggplot(KY_right_data, aes(x = id, y = n, fill = factor(weight))) +
      geom_col(position = "stack", width = 1, colour = NA) +
      coord_flip() +
      scale_y_continuous(breaks = seq(0, 14, by = 2)) +
      scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
      theme_minimal() +
      labs(y = "Count") +  
      theme(
        panel.background = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.title.y = element_blank(),  
        axis.title.x = element_text(size = 20), 
        legend.position = "none",
        panel.spacing = unit(0, "lines")
      )
    
    KY_final_plot3 <- KY_top_plot + KY_legend + KY_main_plot + KY_right_plot + 
      plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))
    
    KY_final_plot3

    ggsave("20251113_agreement_heatmap_with_marginals_Kentucky.png", plot = KY_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. The ratings are recoded so that 0 still represents not rated, while 1 now represents an indirect rating (original ratings 1-2) and 2 represents a direct rating (original ratings 3-4).

    KY_ab_directindirect <- KY_ab_long %>%
      select(id, dataset_name, indicator, rating_a, rating_b) %>%
      mutate(across(c(rating_a, rating_b), ~ case_when(
        . == 2 ~ 1,
        . %in% c(3,4) ~ 2,
        TRUE ~ .
      )))
    
    KY_ratings_kappaweighted <- KY_ab_directindirect %>%
      select(rating_a, rating_b)
    KY_kappa_weighted <- kappa2(KY_ratings_kappaweighted, weight = "equal")
    print(KY_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 9216 
##    Raters = 2 
##     Kappa = 0.738 
## 
##         z = 77.6 
##   p-value = 0
    KY_weighted_kappa_by_indicator <- KY_ab_directindirect %>%
      group_by(indicator) %>%
      summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)
    
    KY_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                   0.771
##  2 EA_Postsecondary_readiness                                              0.756
##  3 K-12_Access_to_effective_teaching                                       0.893
##  4 K-12_Access_to_high-quality_academic_supports                           0.586
##  5 K-12_Access_to_rigorous_coursework                                      0.645
##  6 K-12_Curricular_breadth                                                 0.603
##  7 K-12_Engagement_in_schooling                                            0.799
##  8 K-12_Nonacademic_supports_for_student_success                           0.522
##  9 K-12_Nonexclusionary_discipline_practices                               0.388
## 10 K-12_Performance_in_coursework                                          0.617
## 11 K-12_Performance_on_tests                                               0.849
## 12 K-12_School_climate                                                     0.734
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…          0.646
## 14 Pre-K_Academic_readiness                                                0.745
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…          0.402
## 16 Pre-K_Self-regulation_and_attention_skills                              1
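With linear (“equal”) weights, a not-rated vs. direct disagreement is penalized twice as heavily as a not-rated vs. indirect one. A minimal base-R sketch of the linearly weighted kappa formula (an illustration with made-up ratings, not the irr::kappa2 implementation) makes the computation concrete:

```r
# Linearly weighted kappa on the recoded 0/1/2 scale (illustrative sketch)
weighted_kappa <- function(a, b, levels = 0:2) {
  a <- factor(a, levels = levels); b <- factor(b, levels = levels)
  O <- prop.table(table(a, b))                # observed joint proportions
  E <- outer(rowSums(O), colSums(O))          # expected proportions under independence
  k <- length(levels)
  W <- abs(outer(seq_len(k), seq_len(k), "-")) / (k - 1)  # linear disagreement weights
  1 - sum(W * O) / sum(W * E)
}

# Made-up ratings for six dataset/indicator pairs, not study data
weighted_kappa(c(0, 0, 1, 2, 2, 2), c(0, 1, 1, 2, 2, 0))  # -> 8/17, about 0.47
```

The 0-vs-2 pair contributes weight 1 while each 0-vs-1 pair contributes only 0.5, which is why kappa2(…, weight = "equal") treats a not-rated/direct split as the more serious disagreement.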
IRRstatistics_Kentucky <- KY_indicator_kappa_count_merge %>%
  left_join(KY_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(KY_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)
    
indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Kentucky_ordered <- IRRstatistics_Kentucky %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Kentucky_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|:---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 106 | 89.93056 | 0.6197717 | 0.6459107 |
| High-Quality Pre-K Programs | 33 | 96.35417 | 0.5213675 | 0.4020385 |
| Effective Teaching | 67 | 95.13889 | 0.7437926 | 0.8931468 |
| Rigorous Coursework | 47 | 95.65972 | 0.6364187 | 0.6451993 |
| Curricular Breadth | 56 | 93.05556 | 0.4983234 | 0.6027915 |
| High-Quality Academic Supports | 30 | 97.56944 | 0.6879981 | 0.5862281 |
| School Climate | 92 | 92.88194 | 0.6998322 | 0.7344902 |
| Nonexclusionary Discipline Practices | 26 | 96.52778 | 0.3628319 | 0.3875236 |
| Nonacademic Supports for Student Success | 22 | 97.22222 | 0.4889086 | 0.5222749 |
| Academic Readiness | 15 | 98.95833 | 0.7448317 | 0.7448317 |
| Self-Regulation and Attention Skills | 11 | 100.00000 | 1.0000000 | 1.0000000 |
| Engagement in Schooling | 72 | 93.92361 | 0.6794097 | 0.7991544 |
| Performance in Coursework | 82 | 91.14583 | 0.5452913 | 0.6165257 |
| Performance on Tests | 144 | 91.84028 | 0.7894409 | 0.8491343 |
| On-Time Graduation | 97 | 92.01389 | 0.7000136 | 0.7705538 |
| Postsecondary Readiness | 159 | 88.19444 | 0.6933388 | 0.7559226 |
write.csv(IRRstatistics_Kentucky_ordered, "20251113_IRRstatistics_Kentucky.csv", row.names = FALSE)

MARYLAND

Filtering for only Maryland datasets.

MD_rater_ab <- rater_ab %>%
  filter(str_starts(id, "MD"))

str(MD_rater_ab)
## tibble [1,719 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1719] "MD1218" "MD1218" "MD1218" "MD1218" ...
##  $ rater           : chr [1:1719] "R514" "R625" "R514" "R514" ...
##  $ validation_round: chr [1:1719] "A" "B" "A" "A" ...
##  $ dataset_name    : chr [1:1719] "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" ...
##  $ indicator       : chr [1:1719] "EA_On-time_graduation" "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" ...
##  $ rating          : num [1:1719] 3 3 3 3 3 3 3 3 3 3 ...
MD_duplicates_check <- MD_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(MD_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not available at the school level).

MD_rater_ab <- MD_rater_ab %>%
  filter(!(id %in% c("MD1597", "MD1600", "MD1603", "MD1606", "MD1609", "MD1612", "MD1615", "MD1618", "MD1621", "MD1625", "MD1629", "MD1633", "MD1637", "MD1638", "MD1639", "MD1640", "MD1644", "MD1645", "MD1646", "MD1647", "MD1651", "MD1652", "MD1653", "MD1654", "MD1660", "MD1663", "MD1664", "MD1665", "MD1668", "MD1677", "MD1678", "MD1679", "MD1688", "MD1689", "MD1690", "MD1701", "MD1702", "MD1703", "MD1704", "MD1705", "MD1706")
  ))

str(MD_rater_ab)
## tibble [1,636 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1636] "MD1218" "MD1218" "MD1218" "MD1218" ...
##  $ rater           : chr [1:1636] "R514" "R625" "R514" "R514" ...
##  $ validation_round: chr [1:1636] "A" "B" "A" "A" ...
##  $ dataset_name    : chr [1:1636] "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" "2019 Accountability Detail.csv" ...
##  $ indicator       : chr [1:1636] "EA_On-time_graduation" "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_rigorous_coursework" ...
##  $ rating          : num [1:1636] 3 3 3 3 3 3 3 3 3 3 ...

Making sure each id, indicator, and validation round combination is present.

MD_expanded_grid <- MD_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(MD_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name                   indicator               validation_round
##   <chr>  <chr>                          <chr>                   <chr>           
## 1 MD1218 2019 Accountability Detail.csv EA_On-time_graduation   A               
## 2 MD1218 2019 Accountability Detail.csv EA_On-time_graduation   B               
## 3 MD1218 2019 Accountability Detail.csv EA_Postsecondary_readi… A               
## 4 MD1218 2019 Accountability Detail.csv EA_Postsecondary_readi… B               
## 5 MD1218 2019 Accountability Detail.csv K-12_Access_to_effecti… A               
## 6 MD1218 2019 Accountability Detail.csv K-12_Access_to_effecti… B
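The same full-grid idea can be sketched in base R with expand.grid; the ids and indicators below are illustrative placeholders, not the study’s actual values:

```r
# Base-R sketch of the Cartesian grid built by tidyr::crossing() above
# (illustrative ids and indicators, not the study's actual data)
grid <- expand.grid(
  id = c("MD1", "MD2"),
  indicator = c("EA_On-time_graduation", "K-12_School_climate"),
  validation_round = c("A", "B"),
  stringsAsFactors = FALSE
)
nrow(grid)  # -> 8: every id x indicator x round combination
```

Joining the observed ratings back onto this grid is what guarantees that unrated combinations appear as explicit rows rather than being silently absent.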

Making sure that id/indicator combinations that were not rated are represented as zero.

MD_ratings_with_zeros <- MD_expanded_grid %>%
  left_join(MD_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
MD_ab_long <- MD_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(MD_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name                   indicator    rating_a rating_b agreement
##   <chr>  <chr>                          <chr>           <dbl>    <dbl>     <dbl>
## 1 MD1218 2019 Accountability Detail.csv EA_On-time_…        3        3         1
## 2 MD1218 2019 Accountability Detail.csv EA_Postseco…        3        0         0
## 3 MD1218 2019 Accountability Detail.csv K-12_Access…        0        0         1
## 4 MD1218 2019 Accountability Detail.csv K-12_Access…        0        0         1
## 5 MD1218 2019 Accountability Detail.csv K-12_Access…        3        0         0
## 6 MD1218 2019 Accountability Detail.csv K-12_Curric…        0        3         0
MD_dataset_agreement <- MD_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

MD_dataset_agreement <- MD_dataset_agreement %>%
  ungroup()

n_distinct(MD_dataset_agreement$id)
## [1] 484
head(MD_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 MD1218              75  
## 2 MD1219             100  
## 3 MD1220              87.5
## 4 MD1221             100  
## 5 MD1222             100  
## 6 MD1223             100
write.csv(MD_dataset_agreement, "20251113_percent_agreement_by_id_Maryland.csv", row.names = FALSE)
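Percentage agreement here is simply the mean of the binary agreement flag scaled to 100. A toy base-R example (made-up flags, not study data):

```r
# Percent agreement = mean of 0/1 agreement flags * 100 (illustrative flags)
agree <- c(1, 1, 0, 1)   # e.g., raters agreed on 3 of 4 indicator ratings
mean(agree) * 100        # -> 75
```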
MD_indicators_mapped_per_id <- MD_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(MD_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

MD_dataset_agreement_count <- MD_dataset_agreement %>%
  left_join(MD_indicators_mapped_per_id, by = "id")

head(MD_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 MD1218              75                          7
## 2 MD1219             100                          2
## 3 MD1220              87.5                        3
## 4 MD1221             100                          3
## 5 MD1222             100                          1
## 6 MD1223             100                          1
MD_dataset_agreement_count <- MD_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

MD_dataset_agreement_count$id <- factor(MD_dataset_agreement_count$id, levels = unique(MD_dataset_agreement_count$id))

MD_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(MD_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
MD_percentagreement_barplot <- ggplot(MD_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Maryland - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

MD_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Maryland.svg", plot = MD_percentagreement_barplot, width = 6.5, height = 6.5)

Printing the results used for the chart in the findings section.

MD_percent_agreement <- MD_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

MD_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    91.9
##  2 EA_Postsecondary_readiness                                               88.2
##  3 K-12_Access_to_effective_teaching                                        95.2
##  4 K-12_Access_to_high-quality_academic_supports                            97.7
##  5 K-12_Access_to_rigorous_coursework                                       90.7
##  6 K-12_Curricular_breadth                                                  91.1
##  7 K-12_Engagement_in_schooling                                             75.6
##  8 K-12_Nonacademic_supports_for_student_success                            98.1
##  9 K-12_Nonexclusionary_discipline_practices                               100  
## 10 K-12_Performance_in_coursework                                           94.6
## 11 K-12_Performance_on_tests                                                81.4
## 12 K-12_School_climate                                                     100  
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              91.9
## 14 Pre-K_Academic_readiness                                                100  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…             100  
## 16 Pre-K_Self-regulation_and_attention_skills                               99.4
MD_ab_long <- MD_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

MD_contingency_table <- data.frame(MD_ab_long$rating_a, MD_ab_long$rating_b)
MD_kappa_result <- kappa2(MD_contingency_table, weight = "unweighted")
print(MD_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 7744 
##    Raters = 2 
##     Kappa = 0.665 
## 
##         z = 79.8 
##   p-value = 0
MD_kappa_by_indicator <- MD_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      MD_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      MD_kappa_result <- kappa2(MD_contingency_table, weight = "unweighted")
      MD_kappa_result$value  
    },
    .groups = "drop"
  )

print(MD_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                               0.819  
##  2 EA_Postsecondary_readiness                                          0.742  
##  3 K-12_Access_to_effective_teaching                                   0.637  
##  4 K-12_Access_to_high-quality_academic_supports                      -0.00339
##  5 K-12_Access_to_rigorous_coursework                                  0.113  
##  6 K-12_Curricular_breadth                                             0.0913 
##  7 K-12_Engagement_in_schooling                                        0.504  
##  8 K-12_Nonacademic_supports_for_student_success                       0      
##  9 K-12_Nonexclusionary_discipline_practices                         NaN      
## 10 K-12_Performance_in_coursework                                      0.758  
## 11 K-12_Performance_on_tests                                           0.574  
## 12 K-12_School_climate                                                 1      
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation    0.562  
## 14 Pre-K_Academic_readiness                                            1      
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs    1      
## 16 Pre-K_Self-regulation_and_attention_skills                          0
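The NaN and 0 values above follow directly from the kappa formula. A minimal base-R sketch (a teaching illustration, not the irr::kappa2 implementation) reproduces both degenerate cases:

```r
# Minimal unweighted Cohen's kappa, to illustrate the NaN and 0 cases above
# (illustrative sketch, not the irr::kappa2 implementation)
unweighted_kappa <- function(a, b) {
  po <- mean(a == b)                          # observed agreement
  lv <- union(unique(a), unique(b))
  pe <- sum(sapply(lv, function(l) mean(a == l) * mean(b == l)))  # chance agreement
  (po - pe) / (1 - pe)
}

# Both raters constant: po = pe = 1, so kappa = 0/0 = NaN
unweighted_kappa(c(0, 0, 0), c(0, 0, 0))
# Observed agreement exactly at chance level: kappa = 0 despite 75% agreement
unweighted_kappa(c(0, 0, 0, 1), c(0, 0, 0, 0))
```

So an indicator that is almost never mapped can show very high percent agreement while its kappa collapses to NaN or 0, which is why both statistics are reported side by side.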

Counting the number of datasets mapped to each indicator, for use in the plot that follows.

MD_mapped_counts <- MD_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

MD_mapped_counts
## # A tibble: 15 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                     145
##  2 EA_Postsecondary_readiness                                                173
##  3 K-12_Access_to_effective_teaching                                          33
##  4 K-12_Access_to_high-quality_academic_supports                              11
##  5 K-12_Access_to_rigorous_coursework                                         45
##  6 K-12_Curricular_breadth                                                    43
##  7 K-12_Engagement_in_schooling                                              207
##  8 K-12_Nonacademic_supports_for_student_success                               9
##  9 K-12_Performance_in_coursework                                             68
## 10 K-12_Performance_on_tests                                                 164
## 11 K-12_School_climate                                                         8
## 12 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…            67
## 13 Pre-K_Academic_readiness                                                    3
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…            23
## 15 Pre-K_Self-regulation_and_attention_skills                                  3
MD_indicator_kappa_count_merge <- MD_kappa_by_indicator %>%
  left_join(MD_mapped_counts, by = "indicator")

head(MD_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                            0.819             145
## 2 EA_Postsecondary_readiness                       0.742             173
## 3 K-12_Access_to_effective_teaching                0.637              33
## 4 K-12_Access_to_high-quality_academic_supports   -0.00339            11
## 5 K-12_Access_to_rigorous_coursework               0.113              45
## 6 K-12_Curricular_breadth                          0.0913             43
clean_labels <- sub("^[^_]+_", "", MD_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


MD_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(MD_indicator_kappa_count_merge$kappa_stat)]
)
MD_balloonplot <- ggballoonplot(MD_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Maryland Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.665",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6), legend.title = element_text(size = 8), legend.text = element_text(size = 6), legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

MD_balloonplot
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("20251113_IRR_kappa_balloonplot_Maryland.svg", plot = MD_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
MD_heatmap_long <- MD_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))

clean_labels <- sub("^[^_]+_", "", MD_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

MD_heatmap_long$indicator_cleaned <- wrapped_labels

MD_heatmap_long_selected <- MD_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(MD_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 MD1218 "On-time graduation"                             1
## 2 MD1218 "Postsecondary readiness"                        3
## 3 MD1218 "Access to effective teaching"                   0
## 4 MD1218 "Access to high-quality academic\nsupports"      0
## 5 MD1218 "Access to rigorous coursework"                  3
## 6 MD1218 "Curricular breadth"                             3
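The case_when scheme above collapses the five raw ratings into three groups (0 = not rated, 1-2 = indirect, 3-4 = direct) and scores each pair by how far apart the groups are. A compact restatement (an illustrative check, not part of the analysis pipeline):

```r
# Compact restatement of the disagreement weights used above (illustrative)
weight_pair <- function(a, b) {
  grp <- function(r) ifelse(r == 0, 0L, ifelse(r <= 2, 1L, 2L))  # not rated / indirect / direct
  ga <- grp(a); gb <- grp(b)
  ifelse(ga == 0 & gb == 0, 0,            # both not rated
  ifelse(ga == gb, 1,                     # agreement (both indirect or both direct)
  ifelse(abs(ga - gb) == 2, 3, 2)))       # not rated vs direct = serious; else moderate
}

# Made-up rating pairs covering each category
weight_pair(c(0, 1, 3, 0, 2, 0), c(0, 2, 4, 1, 3, 4))  # -> 0 1 1 2 2 3
```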

Dataset disagreement with marginal plots.

# 1. Set factor levels based on heatmap ordering
MD_indicator_levels <- levels(with(MD_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
MD_dataset_levels <- levels(with(MD_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
MD_agreement_heatmap <- ggplot(MD_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Maryland Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
MD_legend <- cowplot::get_legend(MD_agreement_heatmap)
MD_legend_plot <- wrap_elements(MD_legend)

# 4. Remove legend from main plot
MD_main_plot <- MD_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
MD_top_plot <- MD_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = MD_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
MD_right_plot <- MD_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = MD_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

MD_final_plot <- MD_top_plot + MD_legend + MD_main_plot + MD_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

MD_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Maryland.svg", plot = MD_final_plot, width = 8, height = 12)

Creating the PNG version of the plot.

# 1. Set factor levels based on heatmap ordering
MD_indicator_levels <- levels(with(MD_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
MD_dataset_levels <- levels(with(MD_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
MD_agreement_heatmap <- ggplot(MD_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Maryland Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
MD_legend <- cowplot::get_legend(MD_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
MD_main_plot <- MD_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none", plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
MD_top_plot <- MD_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = MD_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
MD_right_data <- MD_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = MD_dataset_levels))

MD_right_plot <- ggplot(MD_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

MD_final_plot3 <- MD_top_plot + MD_legend + MD_main_plot + MD_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

MD_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_Maryland.png", plot = MD_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. The ratings are recoded so that 0 still represents not rated, while 1 now represents an indirect rating (original ratings 1-2) and 2 represents a direct rating (original ratings 3-4).

MD_ab_directindirect <- MD_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))

MD_ratings_kappaweighted <- MD_ab_directindirect %>%
  select(rating_a, rating_b)
MD_kappa_weighted <- kappa2(MD_ratings_kappaweighted, weight = "equal")
print(MD_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 7744 
##    Raters = 2 
##     Kappa = 0.719 
## 
##         z = 70.4 
##   p-value = 0
MD_weighted_kappa_by_indicator <- MD_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(as.matrix(select(cur_data(), rating_a, rating_b)), weight = "equal")$value)

MD_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                 0.856  
##  2 EA_Postsecondary_readiness                                            0.768  
##  3 K-12_Access_to_effective_teaching                                     0.777  
##  4 K-12_Access_to_high-quality_academic_supports                        -0.00373
##  5 K-12_Access_to_rigorous_coursework                                    0.123  
##  6 K-12_Curricular_breadth                                               0.136  
##  7 K-12_Engagement_in_schooling                                          0.680  
##  8 K-12_Nonacademic_supports_for_student_success                         0      
##  9 K-12_Nonexclusionary_discipline_practices                            NA      
## 10 K-12_Performance_in_coursework                                        0.734  
## 11 K-12_Performance_on_tests                                             0.613  
## 12 K-12_School_climate                                                   1      
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…        0.604  
## 14 Pre-K_Academic_readiness                                              1      
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…        1      
## 16 Pre-K_Self-regulation_and_attention_skills                            0
IRRstatistics_Maryland <- MD_indicator_kappa_count_merge %>%
  left_join(MD_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(MD_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Maryland_ordered <- IRRstatistics_Maryland %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Maryland_ordered)
Indicator Mapped Datasets Percent Agreement Cohen’s Kappa Cohen’s Weighted Kappa
Racial, Ethnic, and Economic Segregation 67 91.94215 0.5624478 0.6038152
High-Quality Pre-K Programs 23 100.00000 1.0000000 1.0000000
Effective Teaching 33 95.24793 0.6372760 0.7769943
Rigorous Coursework 45 90.70248 0.1133366 0.1230334
Curricular Breadth 43 91.11570 0.0912584 0.1357675
High-Quality Academic Supports 11 97.72727 -0.0033924 -0.0037329
School Climate 8 100.00000 1.0000000 1.0000000
Nonexclusionary Discipline Practices NA 100.00000 NA NA
Nonacademic Supports for Student Success 9 98.14050 0.0000000 0.0000000
Academic Readiness 3 100.00000 1.0000000 1.0000000
Self-Regulation and Attention Skills 3 99.38017 0.0000000 0.0000000
Engagement in Schooling 207 75.61983 0.5042232 0.6798713
Performance in Coursework 68 94.62810 0.7580837 0.7344241
Performance on Tests 164 81.40496 0.5742310 0.6132684
On-Time Graduation 145 91.94215 0.8188362 0.8561597
Postsecondary Readiness 173 88.22314 0.7419608 0.7675692
write.csv(IRRstatistics_Maryland_ordered, "20251113_IRRstatistics_Maryland.csv", row.names = FALSE)

OREGON

Filtering for only Oregon datasets.

OR_rater_ab <- rater_ab %>%
  filter(str_starts(id, "OR"))

str(OR_rater_ab)
## tibble [2,068 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:2068] "OR1218" "OR1218" "OR1218" "OR1219" ...
##  $ rater           : chr [1:2068] "R625" "R625" "R514" "R625" ...
##  $ validation_round: chr [1:2068] "A" "A" "B" "A" ...
##  $ dataset_name    : chr [1:2068] "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1718.xlsx" ...
##  $ indicator       : chr [1:2068] "K-12_Access_to_high-quality_academic_supports" "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Access_to_high-quality_academic_supports" ...
##  $ rating          : num [1:2068] 2 2 2 2 2 2 2 2 2 2 ...
OR_duplicates_check <- OR_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(OR_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>
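The empty tibble confirms that no id/indicator/validation round key is rated more than once. A base-R equivalent of the dplyr duplicate check, sketched on toy data (hypothetical values, not part of the original pipeline):

```r
# Toy ratings table with one deliberate duplicate key (OR1 / A / round B).
toy <- data.frame(
  id               = c("OR1", "OR1", "OR2"),
  indicator        = c("A",   "A",   "A"),
  validation_round = c("B",   "B",   "B"),
  rating           = c(2, 3, 2)
)
key  <- paste(toy$id, toy$indicator, toy$validation_round)
dups <- toy[key %in% key[duplicated(key)], ]  # keep ALL rows of a duplicated key
nrow(dups)  # 2: both rows of the OR1/A/B pair are flagged, mirroring filter(n() > 1)
```

Note that `duplicated()` alone would flag only the second occurrence; the `%in%` step keeps every row of the offending group, matching the `group_by() %>% filter(n() > 1)` behavior above.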

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable or not at the school level).

OR_rater_ab <- OR_rater_ab %>%
  filter(!(id %in% c("OR1450", "OR1451", "OR1452", "OR1453", "OR1454", "OR1463", "OR1464", "OR1465", "OR1466", "OR1467", "OR1468", "OR1469", "OR1470", "OR1471", "OR1472", "OR1473", "OR1474", "OR1475", "OR1476", "OR1477", "OR1478", "OR1479", "OR1480", "OR1481", "OR1482", "OR1483", "OR1484", "OR1485", "OR1486", "OR1487", "OR1488", "OR1489", "OR1490", "OR1491", "OR1492", "OR1493", "OR1494", "OR1495", "OR1496", "OR1497", "OR1498", "OR1499", "OR1500", "OR1501", "OR1502", "OR1503", "OR1504", "OR1505", "OR1506", "OR1507", "OR1508", "OR1509", "OR1510", "OR1511", "OR1512", "OR1513", "OR1514", "OR1515", "OR1516", "OR1517", "OR1518", "OR1519", "OR1520", "OR1521", "OR1522", "OR1523", "OR1524", "OR1525", "OR1526", "OR1527", "OR1528", "OR1529", "OR1530", "OR1531", "OR1532", "OR1533", "OR1534", "OR1535", "OR1536", "OR1537", "OR1538", "OR1539", "OR1540", "OR1541", "OR1542", "OR1543", "OR1544", "OR1545", "OR1546", "OR1547", "OR1548", "OR1549", "OR1551", "OR1552", "OR1553", "OR1554", "OR1558", "OR1576", "OR1577", "OR1580", "OR1597", "OR1598", "OR1599", "OR1601", "OR1606", "OR1609", "OR1612", "OR1614", "OR1617", "OR1766", "OR1767")
  ))

str(OR_rater_ab)
## tibble [1,833 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1833] "OR1218" "OR1218" "OR1218" "OR1219" ...
##  $ rater           : chr [1:1833] "R625" "R625" "R514" "R625" ...
##  $ validation_round: chr [1:1833] "A" "A" "B" "A" ...
##  $ dataset_name    : chr [1:1833] "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1617.xlsx" "OnTracktoELP_1718.xlsx" ...
##  $ indicator       : chr [1:1833] "K-12_Access_to_high-quality_academic_supports" "K-12_Performance_on_tests" "K-12_Performance_on_tests" "K-12_Access_to_high-quality_academic_supports" ...
##  $ rating          : num [1:1833] 2 2 2 2 2 2 2 2 2 2 ...

Ensuring that every id, indicator, and validation round combination is present.
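The crossing() call builds the full Cartesian product of datasets, indicators, and validation rounds. A base-R sketch of the same idea on toy values (hypothetical ids, not part of the original pipeline):

```r
# expand.grid() is the base-R analogue of tidyr::crossing():
# every id appears once per indicator per validation round.
grid <- expand.grid(
  id               = c("OR1", "OR2"),
  indicator        = c("EA_On-time_graduation", "K-12_School_climate"),
  validation_round = c("A", "B"),
  stringsAsFactors = FALSE
)
nrow(grid)  # 2 ids x 2 indicators x 2 rounds = 8 rows
```

crossing() additionally de-duplicates and sorts its inputs, which expand.grid() does not.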

OR_expanded_grid <- OR_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(OR_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name           indicator                       validation_round
##   <chr>  <chr>                  <chr>                           <chr>           
## 1 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduation           A               
## 2 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduation           B               
## 3 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_readiness      A               
## 4 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_readiness      B               
## 5 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effective_teach… A               
## 6 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effective_teach… B

Ensuring that id/indicator combinations that were not rated are coded as zero, then pivoting to one column per validation round.
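The left join leaves NA for any combination that appears in the expanded grid but was never rated; replace_na() then recodes those to 0. A base-R sketch of the same two steps on toy data (hypothetical values, not part of the original pipeline):

```r
# Full grid of combinations vs. a partial ratings table.
grid  <- data.frame(id = c("X1", "X1"), indicator = c("A", "B"))
rated <- data.frame(id = "X1", indicator = "A", rating = 2)

merged <- merge(grid, rated, all.x = TRUE)  # base-R left join on id + indicator
merged$rating[is.na(merged$rating)] <- 0    # base-R replace_na(rating, 0)
merged$rating  # 2 for the rated combination, 0 for the unrated one
```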

OR_ratings_with_zeros <- OR_expanded_grid %>%
  left_join(OR_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
OR_ab_long <- OR_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(OR_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name           indicator            rating_a rating_b agreement
##   <chr>  <chr>                  <chr>                   <dbl>    <dbl>     <dbl>
## 1 OR1218 OnTracktoELP_1617.xlsx EA_On-time_graduati…        0        0         1
## 2 OR1218 OnTracktoELP_1617.xlsx EA_Postsecondary_re…        0        0         1
## 3 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_effe…        0        0         1
## 4 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_high…        2        0         0
## 5 OR1218 OnTracktoELP_1617.xlsx K-12_Access_to_rigo…        0        0         1
## 6 OR1218 OnTracktoELP_1617.xlsx K-12_Curricular_bre…        0        0         1
OR_dataset_agreement <- OR_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

OR_dataset_agreement <- OR_dataset_agreement %>%
  ungroup()

n_distinct(OR_dataset_agreement$id)
## [1] 466
head(OR_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 OR1218              93.8
## 2 OR1219              93.8
## 3 OR1220              93.8
## 4 OR1221              93.8
## 5 OR1222              93.8
## 6 OR1223             100
write.csv(OR_dataset_agreement, "20251113_percent_agreement_by_id_Oregon.csv", row.names = FALSE)
OR_indicators_mapped_per_id <- OR_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(OR_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

OR_dataset_agreement_count <- OR_dataset_agreement %>%
  left_join(OR_indicators_mapped_per_id, by = "id")

head(OR_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 OR1218              93.8                        2
## 2 OR1219              93.8                        2
## 3 OR1220              93.8                        2
## 4 OR1221              93.8                        2
## 5 OR1222              93.8                        2
## 6 OR1223             100                          1
OR_dataset_agreement_count <- OR_dataset_agreement_count %>%
  arrange(desc(percent_agreement), indicators_mapped_per_id)

OR_dataset_agreement_count$id <- factor(OR_dataset_agreement_count$id, levels = unique(OR_dataset_agreement_count$id))

OR_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(OR_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
OR_percentagreement_barplot <- ggplot(OR_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "Oregon - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

OR_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_Oregon.svg", plot = OR_percentagreement_barplot, width = 6.5, height = 6.5)

Printing the percentage agreement by indicator for the chart in the findings section.

OR_percent_agreement <- OR_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

OR_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    94.2
##  2 EA_Postsecondary_readiness                                               89.7
##  3 K-12_Access_to_effective_teaching                                        96.6
##  4 K-12_Access_to_high-quality_academic_supports                            91.2
##  5 K-12_Access_to_rigorous_coursework                                      100  
##  6 K-12_Curricular_breadth                                                 100  
##  7 K-12_Engagement_in_schooling                                             79.4
##  8 K-12_Nonacademic_supports_for_student_success                            97.9
##  9 K-12_Nonexclusionary_discipline_practices                                97.0
## 10 K-12_Performance_in_coursework                                           94.4
## 11 K-12_Performance_on_tests                                                95.9
## 12 K-12_School_climate                                                     100  
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              84.1
## 14 Pre-K_Academic_readiness                                                100  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…              97.9
## 16 Pre-K_Self-regulation_and_attention_skills                              100
OR_ab_long <- OR_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.
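Cohen’s kappa is κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater’s marginal distribution. A hand computation on toy ratings (hypothetical values, not part of the original pipeline) to make the formula concrete:

```r
# Toy two-rater ratings on a 0/1/2 scale.
a <- c(0, 0, 1, 1, 2, 2, 0, 1)
b <- c(0, 1, 1, 1, 2, 0, 0, 2)

p_o <- mean(a == b)  # observed agreement: 5/8
# Chance agreement: sum over categories of the product of marginal proportions.
cats <- 0:2
p_e <- sum(prop.table(table(factor(a, cats))) *
           prop.table(table(factor(b, cats))))  # 22/64
kappa_manual <- (p_o - p_e) / (1 - p_e)
kappa_manual  # 3/7, approximately 0.429
```

irr::kappa2() with weight = "unweighted" computes this same statistic for the full rating columns.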

OR_contingency_table <- data.frame(OR_ab_long$rating_a, OR_ab_long$rating_b)
OR_kappa_result <- kappa2(OR_contingency_table, weight = "unweighted")
print(OR_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 7456 
##    Raters = 2 
##     Kappa = 0.771 
## 
##         z = 95.2 
##   p-value = 0
OR_kappa_by_indicator <- OR_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      OR_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      OR_kappa_result <- kappa2(OR_contingency_table, weight = "unweighted")
      OR_kappa_result$value  
    },
    .groups = "drop"
  )

print(OR_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                 0.856
##  2 EA_Postsecondary_readiness                                            0.661
##  3 K-12_Access_to_effective_teaching                                     0.824
##  4 K-12_Access_to_high-quality_academic_supports                         0.174
##  5 K-12_Access_to_rigorous_coursework                                    1    
##  6 K-12_Curricular_breadth                                               1    
##  7 K-12_Engagement_in_schooling                                          0.664
##  8 K-12_Nonacademic_supports_for_student_success                         0.405
##  9 K-12_Nonexclusionary_discipline_practices                             0.487
## 10 K-12_Performance_in_coursework                                        0.644
## 11 K-12_Performance_on_tests                                             0.936
## 12 K-12_School_climate                                                   1    
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation      0.544
## 14 Pre-K_Academic_readiness                                              1    
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs      0    
## 16 Pre-K_Self-regulation_and_attention_skills                            1

Counting the number of datasets mapped to each indicator in preparation for the balloon plot.

OR_mapped_counts <- OR_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

OR_mapped_counts
## # A tibble: 16 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                     119
##  2 EA_Postsecondary_readiness                                                 92
##  3 K-12_Access_to_effective_teaching                                          54
##  4 K-12_Access_to_high-quality_academic_supports                              46
##  5 K-12_Access_to_rigorous_coursework                                          7
##  6 K-12_Curricular_breadth                                                     4
##  7 K-12_Engagement_in_schooling                                              230
##  8 K-12_Nonacademic_supports_for_student_success                              13
##  9 K-12_Nonexclusionary_discipline_practices                                  21
## 10 K-12_Performance_in_coursework                                             52
## 11 K-12_Performance_on_tests                                                 241
## 12 K-12_School_climate                                                         3
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…           121
## 14 Pre-K_Academic_readiness                                                   13
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…            10
## 16 Pre-K_Self-regulation_and_attention_skills                                 13
OR_indicator_kappa_count_merge <- OR_kappa_by_indicator %>%
  left_join(OR_mapped_counts, by = "indicator")

head(OR_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                              0.856           119
## 2 EA_Postsecondary_readiness                         0.661            92
## 3 K-12_Access_to_effective_teaching                  0.824            54
## 4 K-12_Access_to_high-quality_academic_supports      0.174            46
## 5 K-12_Access_to_rigorous_coursework                 1                 7
## 6 K-12_Curricular_breadth                            1                 4
clean_labels <- sub("^[^_]+_", "", OR_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


OR_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(OR_indicator_kappa_count_merge$kappa_stat)]
)
OR_balloonplot <- ggballoonplot(OR_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "Oregon Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.771",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

OR_balloonplot

ggsave("20251113_IRR_kappa_balloonplot_Oregon.svg", plot = OR_balloonplot, width = 6.5, height = 6.5)
OR_heatmap_long <- OR_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))

clean_labels <- sub("^[^_]+_", "", OR_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

OR_heatmap_long$indicator_cleaned <- wrapped_labels

OR_heatmap_long_selected <- OR_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(OR_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 OR1218 "On-time graduation"                             0
## 2 OR1218 "Postsecondary readiness"                        0
## 3 OR1218 "Access to effective teaching"                   0
## 4 OR1218 "Access to high-quality academic\nsupports"      2
## 5 OR1218 "Access to rigorous coursework"                  0
## 6 OR1218 "Curricular breadth"                             0

Creating the dataset disagreement heatmap with marginal plots.

# 1. Set factor levels based on heatmap ordering
OR_indicator_levels <- levels(with(OR_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OR_dataset_levels <- levels(with(OR_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
OR_agreement_heatmap <- ggplot(OR_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Oregon Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
OR_legend <- cowplot::get_legend(OR_agreement_heatmap)
OR_legend_plot <- wrap_elements(OR_legend)

# 4. Remove legend from main plot
OR_main_plot <- OR_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
OR_top_plot <- OR_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = OR_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
OR_right_plot <- OR_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = OR_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

OR_final_plot <- OR_top_plot + OR_legend + OR_main_plot + OR_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

OR_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_Oregon.svg", plot = OR_final_plot, width = 8, height = 12)

Creating the PNG version, with text sizes scaled up for the high-resolution output.

# 1. Set factor levels based on heatmap ordering
OR_indicator_levels <- levels(with(OR_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
OR_dataset_levels <- levels(with(OR_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
OR_agreement_heatmap <- ggplot(OR_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "Oregon Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
OR_legend <- cowplot::get_legend(OR_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
OR_main_plot <- OR_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
OR_top_plot <- OR_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = OR_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
OR_right_data <- OR_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = OR_dataset_levels))

OR_right_plot <- ggplot(OR_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

OR_final_plot3 <- OR_top_plot + OR_legend + OR_main_plot + OR_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

OR_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_Oregon.png", plot = OR_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. As with Maryland, the ratings are first recoded so that zero still represents not rated, 1 represents an indirect rating (original ratings 1 and 2), and 2 represents a direct rating (original ratings 3 and 4).

OR_ab_directindirect <- OR_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))

OR_ratings_kappaweighted <- OR_ab_directindirect %>%
  select(rating_a, rating_b)
OR_kappa_weighted <- kappa2(OR_ratings_kappaweighted, weight = "equal")
print(OR_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 7456 
##    Raters = 2 
##     Kappa = 0.831 
## 
##         z = 77.2 
##   p-value = 0
OR_weighted_kappa_by_indicator <- OR_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(pick(rating_a, rating_b), weight = "equal")$value)

OR_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                   0.896
##  2 EA_Postsecondary_readiness                                              0.735
##  3 K-12_Access_to_effective_teaching                                       0.854
##  4 K-12_Access_to_high-quality_academic_supports                           0.281
##  5 K-12_Access_to_rigorous_coursework                                      1    
##  6 K-12_Curricular_breadth                                                 1    
##  7 K-12_Engagement_in_schooling                                            0.754
##  8 K-12_Nonacademic_supports_for_student_success                           0.493
##  9 K-12_Nonexclusionary_discipline_practices                               0.485
## 10 K-12_Performance_in_coursework                                          0.640
## 11 K-12_Performance_on_tests                                               0.959
## 12 K-12_School_climate                                                     1    
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…          0.687
## 14 Pre-K_Academic_readiness                                                1    
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…          0    
## 16 Pre-K_Self-regulation_and_attention_skills                              1
IRRstatistics_Oregon <- OR_indicator_kappa_count_merge %>%
  left_join(OR_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(OR_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_Oregon_ordered <- IRRstatistics_Oregon %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_Oregon_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|:---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 121 | 84.12017 | 0.5443687 | 0.6872714 |
| High-Quality Pre-K Programs | 10 | 97.85408 | 0.0000000 | 0.0000000 |
| Effective Teaching | 54 | 96.56652 | 0.8241841 | 0.8542766 |
| Rigorous Coursework | 7 | 100.00000 | 1.0000000 | 1.0000000 |
| Curricular Breadth | 4 | 100.00000 | 1.0000000 | 1.0000000 |
| High-Quality Academic Supports | 46 | 91.20172 | 0.1741874 | 0.2808149 |
| School Climate | 3 | 100.00000 | 1.0000000 | 1.0000000 |
| Nonexclusionary Discipline Practices | 21 | 96.99571 | 0.4872682 | 0.4845947 |
| Nonacademic Supports for Student Success | 13 | 97.85408 | 0.4048531 | 0.4934783 |
| Academic Readiness | 13 | 100.00000 | 1.0000000 | 1.0000000 |
| Self-Regulation and Attention Skills | 13 | 100.00000 | 1.0000000 | 1.0000000 |
| Engagement in Schooling | 230 | 79.39914 | 0.6644993 | 0.7542482 |
| Performance in Coursework | 52 | 94.42060 | 0.6442748 | 0.6398764 |
| Performance on Tests | 241 | 95.92275 | 0.9364681 | 0.9591458 |
| On-Time Graduation | 119 | 94.20601 | 0.8557606 | 0.8959030 |
| Postsecondary Readiness | 92 | 89.69957 | 0.6614449 | 0.7354039 |
write.csv(IRRstatistics_Oregon_ordered, "20251113_IRRstatistics_Oregon.csv", row.names = FALSE)
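As a quick illustration of why a high percent agreement can coincide with a kappa of 0 (as for High-Quality Pre-K Programs above), here is a minimal sketch with hypothetical ratings: when one rater's ratings are constant, chance-expected agreement equals observed agreement and kappa collapses to 0.

# Hypothetical example: rater A never maps the indicator (all 0s),
# rater B maps it once; raw agreement is 9/10 = 90%, but because
# rater A's ratings are constant, chance-corrected agreement is 0.
toy <- data.frame(rating_a = rep(0, 10),
                  rating_b = c(rep(0, 9), 2))
mean(toy$rating_a == toy$rating_b) * 100
## [1] 90
kappa2(toy, weight = "unweighted")$value
## [1] 0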

CALIFORNIA

Filtering for only California datasets.

CA_rater_ab <- rater_ab %>%
  filter(str_starts(id, "CA"))

str(CA_rater_ab)
## tibble [1,163 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1163] "CA1218" "CA1218" "CA1218" "CA1219" ...
##  $ rater           : chr [1:1163] "R514" "R562" "R562" "R514" ...
##  $ validation_round: chr [1:1163] "A" "B" "B" "A" ...
##  $ dataset_name    : chr [1:1163] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
##  $ indicator       : chr [1:1163] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
##  $ rating          : num [1:1163] 4 4 2 4 4 2 4 4 2 4 ...
CA_duplicates_check <- CA_rater_ab %>%
  group_by(id, indicator, validation_round) %>%
  filter(n() > 1) %>%
  ungroup()

# View the duplicates
print(CA_duplicates_check)
## # A tibble: 0 × 6
## # ℹ 6 variables: id <chr>, rater <chr>, validation_round <chr>,
## #   dataset_name <chr>, indicator <chr>, rating <dbl>

Removing datasets that were excluded during the screening phase for reasons other than not being mappable to NASEM (e.g., not downloadable, not at the school level).

CA_rater_ab <- CA_rater_ab %>%
  filter(!(id %in% c("CA1412")))

str(CA_rater_ab)
## tibble [1,163 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id              : chr [1:1163] "CA1218" "CA1218" "CA1218" "CA1219" ...
##  $ rater           : chr [1:1163] "R514" "R562" "R562" "R514" ...
##  $ validation_round: chr [1:1163] "A" "B" "B" "A" ...
##  $ dataset_name    : chr [1:1163] "expulsion12.txt" "expulsion12.txt" "expulsion12.txt" "expulsion13.txt" ...
##  $ indicator       : chr [1:1163] "K-12_Nonexclusionary_discipline_practices" "K-12_Nonexclusionary_discipline_practices" "K-12_School_climate" "K-12_Nonexclusionary_discipline_practices" ...
##  $ rating          : num [1:1163] 4 4 2 4 4 2 4 4 2 4 ...

Making sure each id, indicator, and validation round combination is present.

CA_expanded_grid <- CA_rater_ab %>%
  distinct(id, dataset_name) %>%
  crossing(indicators_df, validation_round = c("A", "B"))

head(CA_expanded_grid)
## # A tibble: 6 × 4
##   id     dataset_name    indicator                         validation_round
##   <chr>  <chr>           <chr>                             <chr>           
## 1 CA1218 expulsion12.txt EA_On-time_graduation             A               
## 2 CA1218 expulsion12.txt EA_On-time_graduation             B               
## 3 CA1218 expulsion12.txt EA_Postsecondary_readiness        A               
## 4 CA1218 expulsion12.txt EA_Postsecondary_readiness        B               
## 5 CA1218 expulsion12.txt K-12_Access_to_effective_teaching A               
## 6 CA1218 expulsion12.txt K-12_Access_to_effective_teaching B
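The crossing() call above takes the Cartesian product of its inputs, so every dataset is paired with every indicator and both validation rounds. A minimal toy sketch (names hypothetical):

# Toy illustration: crossing() pairs every dataset with every
# indicator and both validation rounds (2 x 2 x 2 = 8 rows).
toy_datasets   <- tibble(id = c("D1", "D2"))
toy_indicators <- tibble(indicator = c("ind_a", "ind_b"))
toy_grid <- crossing(toy_datasets, toy_indicators, validation_round = c("A", "B"))
nrow(toy_grid)
## [1] 8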

Making sure that id/indicator combinations that were not rated are represented as zero.

CA_ratings_with_zeros <- CA_expanded_grid %>%
  left_join(CA_rater_ab %>% select(id, dataset_name, indicator, validation_round, rating), 
            by = c("id", "dataset_name", "indicator", "validation_round")) %>%
  mutate(rating = replace_na(rating, 0))
CA_ab_long <- CA_ratings_with_zeros %>%
  pivot_wider(names_from = validation_round, values_from = rating, names_prefix = "rating_") %>%
  rename(rating_a = rating_A, rating_b = rating_B) %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

head(CA_ab_long)
## # A tibble: 6 × 6
##   id     dataset_name    indicator                   rating_a rating_b agreement
##   <chr>  <chr>           <chr>                          <dbl>    <dbl>     <dbl>
## 1 CA1218 expulsion12.txt EA_On-time_graduation              0        0         1
## 2 CA1218 expulsion12.txt EA_Postsecondary_readiness         0        0         1
## 3 CA1218 expulsion12.txt K-12_Access_to_effective_t…        0        0         1
## 4 CA1218 expulsion12.txt K-12_Access_to_high-qualit…        0        0         1
## 5 CA1218 expulsion12.txt K-12_Access_to_rigorous_co…        0        0         1
## 6 CA1218 expulsion12.txt K-12_Curricular_breadth            0        0         1
CA_dataset_agreement <- CA_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

CA_dataset_agreement <- CA_dataset_agreement %>%
  ungroup()

n_distinct(CA_dataset_agreement$id)
## [1] 313
head(CA_dataset_agreement)
## # A tibble: 6 × 2
##   id     percent_agreement
##   <chr>              <dbl>
## 1 CA1218              93.8
## 2 CA1219              93.8
## 3 CA1220              93.8
## 4 CA1221              93.8
## 5 CA1222              93.8
## 6 CA1223              93.8
write.csv(CA_dataset_agreement, "20251113_percent_agreement_by_id_California.csv", row.names = FALSE)
CA_indicators_mapped_per_id <- CA_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(CA_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

CA_dataset_agreement_count <- CA_dataset_agreement %>%
  left_join(CA_indicators_mapped_per_id, by = "id")

head(CA_dataset_agreement_count)
## # A tibble: 6 × 3
##   id     percent_agreement indicators_mapped_per_id
##   <chr>              <dbl>                    <int>
## 1 CA1218              93.8                        2
## 2 CA1219              93.8                        2
## 3 CA1220              93.8                        2
## 4 CA1221              93.8                        2
## 5 CA1222              93.8                        2
## 6 CA1223              93.8                        2
CA_dataset_agreement_count <- CA_dataset_agreement_count %>%
  arrange(desc(percent_agreement), (indicators_mapped_per_id))

CA_dataset_agreement_count$id <- factor(CA_dataset_agreement_count$id, levels = unique(CA_dataset_agreement_count$id))

CA_dataset_agreement_count$indicators_mapped_per_id <- 
  factor(CA_dataset_agreement_count$indicators_mapped_per_id) %>% fct_rev()
CA_percentagreement_barplot <- ggplot(CA_dataset_agreement_count, aes(x = id, y = percent_agreement, fill = indicators_mapped_per_id)) +
  geom_bar(stat = "identity") +
  theme_minimal() + 
  scale_fill_viridis_d(direction = -1) +
  scale_y_continuous(expand = c(0, 0)) +
  labs(
    title = "California - Percentage Agreement by Dataset",
    x = "Dataset",
    y = "Agreement (%)",
    fill = "Number of Indicators"
  ) +
  theme(
    axis.text.y = element_text(size = 1)
  ) +
  coord_flip()

CA_percentagreement_barplot

ggsave("20251113_percentagreement_barplot_California.svg", plot = CA_percentagreement_barplot, width = 6.5, height = 6.5)

Printing percent agreement by indicator for the chart in the findings section.

CA_percent_agreement <- CA_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0)) %>%
  group_by(indicator) %>%
  summarise(percent_agreement = mean(agreement) * 100)

CA_percent_agreement
## # A tibble: 16 × 2
##    indicator                                                   percent_agreement
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                    91.4
##  2 EA_Postsecondary_readiness                                               92.3
##  3 K-12_Access_to_effective_teaching                                        95.2
##  4 K-12_Access_to_high-quality_academic_supports                            93.0
##  5 K-12_Access_to_rigorous_coursework                                      100  
##  6 K-12_Curricular_breadth                                                  92.7
##  7 K-12_Engagement_in_schooling                                             82.7
##  8 K-12_Nonacademic_supports_for_student_success                            93.3
##  9 K-12_Nonexclusionary_discipline_practices                                97.8
## 10 K-12_Performance_in_coursework                                           95.8
## 11 K-12_Performance_on_tests                                                93.6
## 12 K-12_School_climate                                                      88.8
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segre…              75.4
## 14 Pre-K_Academic_readiness                                                100  
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_pr…             100  
## 16 Pre-K_Self-regulation_and_attention_skills                              100
CA_ab_long <- CA_ab_long %>%
  mutate(agreement = ifelse(rating_a == rating_b, 1, 0))

Calculating Cohen’s kappa, first for the whole dataset, then by indicator.

CA_contingency_table <- data.frame(CA_ab_long$rating_a, CA_ab_long$rating_b)
CA_kappa_result <- kappa2(CA_contingency_table, weight = "unweighted")
print(CA_kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 5008 
##    Raters = 2 
##     Kappa = 0.683 
## 
##         z = 64.9 
##   p-value = 0
CA_kappa_by_indicator <- CA_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      CA_contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      CA_kappa_result <- kappa2(CA_contingency_table, weight = "unweighted")
      CA_kappa_result$value  
    },
    .groups = "drop"
  )

print(CA_kappa_by_indicator)
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                 0.746
##  2 EA_Postsecondary_readiness                                            0.773
##  3 K-12_Access_to_effective_teaching                                     0.739
##  4 K-12_Access_to_high-quality_academic_supports                         0.269
##  5 K-12_Access_to_rigorous_coursework                                    1    
##  6 K-12_Curricular_breadth                                               0.436
##  7 K-12_Engagement_in_schooling                                          0.548
##  8 K-12_Nonacademic_supports_for_student_success                         0.572
##  9 K-12_Nonexclusionary_discipline_practices                             0.893
## 10 K-12_Performance_in_coursework                                        0.742
## 11 K-12_Performance_on_tests                                             0.771
## 12 K-12_School_climate                                                   0.199
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation      0.481
## 14 Pre-K_Academic_readiness                                            NaN    
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs      1    
## 16 Pre-K_Self-regulation_and_attention_skills                          NaN
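The NaN values above arise when both raters assign the identical constant rating for every dataset (here, neither rater ever mapped the two Pre-K indicators): observed and chance-expected agreement are both 1, so kappa is 0/0, which is undefined. A minimal hypothetical sketch:

# When both raters give the same constant rating (e.g., neither ever
# maps the indicator), kappa is (1 - 1) / (1 - 1) = 0/0, i.e., NaN.
toy <- data.frame(rating_a = rep(0, 10), rating_b = rep(0, 10))
kappa2(toy, weight = "unweighted")$value
## [1] NaN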

Counting the number of datasets mapped to each indicator for the balloon plot below.

CA_mapped_counts <- CA_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(mapped_counts = n_distinct(id), .groups = "drop")

CA_mapped_counts
## # A tibble: 14 × 2
##    indicator                                                       mapped_counts
##    <chr>                                                                   <int>
##  1 EA_On-time_graduation                                                      73
##  2 EA_Postsecondary_readiness                                                 72
##  3 K-12_Access_to_effective_teaching                                          38
##  4 K-12_Access_to_high-quality_academic_supports                              23
##  5 K-12_Access_to_rigorous_coursework                                         28
##  6 K-12_Curricular_breadth                                                    32
##  7 K-12_Engagement_in_schooling                                               96
##  8 K-12_Nonacademic_supports_for_student_success                              33
##  9 K-12_Nonexclusionary_discipline_practices                                  40
## 10 K-12_Performance_in_coursework                                             34
## 11 K-12_Performance_on_tests                                                  57
## 12 K-12_School_climate                                                        40
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregati…           154
## 14 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progra…             9
CA_indicator_kappa_count_merge <- CA_kappa_by_indicator %>%
  left_join(CA_mapped_counts, by = "indicator")

head(CA_indicator_kappa_count_merge)
## # A tibble: 6 × 3
##   indicator                                     kappa_stat mapped_counts
##   <chr>                                              <dbl>         <int>
## 1 EA_On-time_graduation                              0.746            73
## 2 EA_Postsecondary_readiness                         0.773            72
## 3 K-12_Access_to_effective_teaching                  0.739            38
## 4 K-12_Access_to_high-quality_academic_supports      0.269            23
## 5 K-12_Access_to_rigorous_coursework                 1                28
## 6 K-12_Curricular_breadth                            0.436            32
clean_labels <- sub("^[^_]+_", "", CA_indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


CA_indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(CA_indicator_kappa_count_merge$kappa_stat)]
)
CA_balloonplot <- ggballoonplot(CA_indicator_kappa_count_merge, 
                                x = "indicator_cleaned", 
                                y = "kappa_stat", size ="mapped_counts", fill = "kappa_stat") +
  scale_fill_viridis_c(option = "C") +  
  theme_minimal() +
  ylim(-1,1) +
  labs(
    title = "California Interrater Reliability by Indicator",
    subtitle = "Overall Kappa is 0.683",
    x = "Indicator",
    y = "Cohen's Kappa",
    fill = "Kappa",
    size = "Number of Datasets"
  ) +
  theme(
    axis.text.y = element_text(size = 6),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.5, "cm")
  ) +
  coord_flip()

CA_balloonplot
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).

ggsave("20251113_IRR_kappa_balloonplot_California.svg", plot = CA_balloonplot, width = 6.5, height = 6.5)
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
CA_heatmap_long <- CA_ab_long %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,  # Both rated zero = Not Rated
    (rating_a == 1 | rating_a == 2) & (rating_b == 3 | rating_b == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_b == 1 | rating_b == 2) & (rating_a == 3 | rating_a == 4) ~ 2,  # One rated indirect, other rated direct = Moderate Disagreement
    (rating_a == 0 & (rating_b == 1 | rating_b == 2)) | (rating_b == 0 & (rating_a == 1 | rating_a == 2)) ~ 2,  # One rated zero, other rated indirect = Moderate Disagreement
    (rating_a == 0 & (rating_b == 3 | rating_b == 4)) | (rating_b == 0 & (rating_a == 3 | rating_a == 4)) ~ 3,  # One rated zero, other rated direct = Serious Disagreement
    (rating_a == 1 | rating_a == 2) & (rating_b == 1 | rating_b == 2) ~ 1,  # Both rated indirect = Agreement
    (rating_a == 3 | rating_a == 4) & (rating_b == 3 | rating_b == 4) ~ 1,  # Both rated direct = Agreement
    TRUE ~ NA_real_  # Handle any other cases (e.g., missing or invalid ratings)
  ))
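A quick sanity check of the severity coding above, applying the same case_when to a few representative (hypothetical) rating pairs:

# Representative pairs: each row should fall into the category noted
# in the comment, per the coding scheme above.
check <- tribble(
  ~rating_a, ~rating_b,
  0, 0,   # both not rated       -> 0 (Not Rated)
  2, 2,   # both indirect        -> 1 (Agreement)
  4, 3,   # both direct          -> 1 (Agreement)
  1, 4,   # indirect vs direct   -> 2 (Moderate Disagreement)
  0, 2,   # zero vs indirect     -> 2 (Moderate Disagreement)
  0, 4    # zero vs direct       -> 3 (Serious Disagreement)
) %>%
  mutate(weight = case_when(
    rating_a == 0 & rating_b == 0 ~ 0,
    (rating_a %in% 1:2 & rating_b %in% 3:4) |
      (rating_b %in% 1:2 & rating_a %in% 3:4) ~ 2,
    (rating_a == 0 & rating_b %in% 1:2) |
      (rating_b == 0 & rating_a %in% 1:2) ~ 2,
    (rating_a == 0 & rating_b %in% 3:4) |
      (rating_b == 0 & rating_a %in% 3:4) ~ 3,
    rating_a %in% 1:2 & rating_b %in% 1:2 ~ 1,
    rating_a %in% 3:4 & rating_b %in% 3:4 ~ 1,
    TRUE ~ NA_real_
  ))
check$weight
## [1] 0 1 1 2 2 3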

clean_labels <- sub("^[^_]+_", "", CA_heatmap_long$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)

CA_heatmap_long$indicator_cleaned <- wrapped_labels

CA_heatmap_long_selected <- CA_heatmap_long %>%
  select(id, indicator_cleaned, weight)

head(CA_heatmap_long_selected)
## # A tibble: 6 × 3
##   id     indicator_cleaned                           weight
##   <chr>  <chr>                                        <dbl>
## 1 CA1218 "On-time graduation"                             0
## 2 CA1218 "Postsecondary readiness"                        0
## 3 CA1218 "Access to effective teaching"                   0
## 4 CA1218 "Access to high-quality academic\nsupports"      0
## 5 CA1218 "Access to rigorous coursework"                  0
## 6 CA1218 "Curricular breadth"                             0

Dataset disagreement with marginal plots.

# 1. Set factor levels based on heatmap ordering
CA_indicator_levels <- levels(with(CA_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
CA_dataset_levels <- levels(with(CA_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
CA_agreement_heatmap <- ggplot(CA_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "California Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 5),
    axis.text.y = element_text(size = 1),
    legend.text = element_text(size = 6),
    plot.title = element_text(hjust = 0.5)  
  )

# 3. Extract the legend
CA_legend <- cowplot::get_legend(CA_agreement_heatmap)
CA_legend_plot <- wrap_elements(CA_legend)

# 4. Remove legend from main plot
CA_main_plot <- CA_agreement_heatmap + theme(axis.text.y = element_text(size = 1.5), legend.position = "none", plot.title = element_text(hjust = 0.5))

# 5. Create top marginal plot (stacked bar chart by indicator)
CA_top_plot <- CA_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = CA_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  scale_x_discrete(drop = FALSE) +
  theme_minimal() +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    legend.position = "none"
  )

# 6. Create right marginal plot (stacked bar chart by dataset)
CA_right_plot <- CA_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  mutate(id = factor(id, levels = CA_dataset_levels)) %>%
  ggplot(aes(x = id, fill = factor(weight))) +
  geom_bar(position = "stack") +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"), drop = TRUE) +
  theme_minimal() +
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),
    legend.position = "none"
  )

CA_final_plot <- CA_top_plot + CA_legend + CA_main_plot + CA_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(1, 1), c('in', 'null')))

CA_final_plot

ggsave("20251113_agreement_heatmap_with_marginals_California.svg", plot = CA_final_plot, width = 8, height = 12)

Creating the PNG version of the heatmap.

# 1. Set factor levels based on heatmap ordering
CA_indicator_levels <- levels(with(CA_heatmap_long_selected, reorder(indicator_cleaned, -weight)))
CA_dataset_levels <- levels(with(CA_heatmap_long_selected, reorder(id, -weight)))

# 2. Create the main heatmap
CA_agreement_heatmap <- ggplot(CA_heatmap_long_selected, aes(x = reorder(indicator_cleaned, -weight),
                                                             y = reorder(id, -weight),
                                                             fill = factor(weight))) +
  geom_tile() +
  scale_fill_manual(
    values = c("0" = "#E0E0E0", "1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946"),
    labels = c("Not Rated", "Agreement", "Moderate Disagreement", "Serious Disagreement")
  ) +
  labs(
    title = "California Disagreement Heatmap",
    x = "Indicator",
    y = "Dataset Name",
    fill = "Agreement Type"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
    axis.text.y = element_text(size = 30),
    axis.title.y = element_text(size = 20),
    axis.title.x = element_text(size = 20),
    legend.text = element_text(size = 20),
    legend.title = element_text(size = 20),
    plot.title = element_text(size = 25, hjust = 0.5)  
  )

# 3. Extract the legend
CA_legend <- cowplot::get_legend(CA_agreement_heatmap)

# 4. Remove legend from main plot and adjust bar width
CA_main_plot <- CA_agreement_heatmap + theme(axis.text.y = element_text(size = 3), legend.position = "none",
                                             plot.title = element_text(size = 25, hjust = 0.5),
                                             axis.title.y = element_text(size = 20),
                                             axis.title.x = element_text(size = 20))

# 5. Create top marginal plot 
CA_top_plot <- CA_heatmap_long_selected %>%
  filter(weight != 0) %>%
  mutate(indicator_cleaned = factor(indicator_cleaned, levels = CA_indicator_levels)) %>%
  ggplot(aes(x = indicator_cleaned, fill = factor(weight))) +
  geom_bar(position = "stack", width = 1) +  
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_text(size = 20),
    legend.position = "none")

# 6. Create right marginal plot
CA_right_data <- CA_heatmap_long_selected %>%
  filter(weight %in% c(1, 2, 3)) %>%
  count(id, weight) %>%
  mutate(id = factor(id, levels = CA_dataset_levels))

CA_right_plot <- ggplot(CA_right_data, aes(x = id, y = n, fill = factor(weight))) +
  geom_col(position = "stack", width = 1, colour = NA) +
  coord_flip() +
  scale_y_continuous(breaks = seq(0, 14, by = 2)) +
  scale_fill_manual(values = c("1" = "#1FA187", "2" = "#5E4FA2", "3" = "#E63946")) +
  theme_minimal() +
  labs(y = "Count") +  
  theme(
    panel.background = element_blank(),
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank(),  
    axis.title.x = element_text(size = 20), 
    legend.position = "none",
    panel.spacing = unit(0, "lines")
  )

CA_final_plot3 <- CA_top_plot + CA_legend + CA_main_plot + CA_right_plot + 
  plot_layout(widths = c(3, 1), heights = unit(c(3, 1), c('in', 'null')))

CA_final_plot3

ggsave("20251113_agreement_heatmap_with_marginals_California.png", plot = CA_final_plot3, width = 16, height = 24, dpi = 300)

Calculating Cohen’s weighted kappa. Recoding the ratings so that zero still represents not rated, 1 represents an indirect rating, and 2 represents a direct rating.

CA_ab_directindirect <- CA_ab_long %>%
  select(id, dataset_name, indicator, rating_a, rating_b) %>%
  mutate(across(c(rating_a, rating_b), ~ case_when(
    . == 2 ~ 1,
    . %in% c(3,4) ~ 2,
    TRUE ~ .
  )))

CA_ratings_kappaweighted <- CA_ab_directindirect %>%
  select(rating_a, rating_b)
CA_kappa_weighted <- kappa2(CA_ratings_kappaweighted, weight = "equal")
print(CA_kappa_weighted)
##  Cohen's Kappa for 2 Raters (Weights: equal)
## 
##  Subjects = 5008 
##    Raters = 2 
##     Kappa = 0.722 
## 
##         z = 56 
##   p-value = 0
CA_weighted_kappa_by_indicator <- CA_ab_directindirect %>%
  group_by(indicator) %>%
  summarise(weighted_kappa = kappa2(pick(rating_a, rating_b), weight = "equal")$value)

CA_weighted_kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                      weighted_kappa
##    <chr>                                                                   <dbl>
##  1 EA_On-time_graduation                                                   0.812
##  2 EA_Postsecondary_readiness                                              0.872
##  3 K-12_Access_to_effective_teaching                                       0.816
##  4 K-12_Access_to_high-quality_academic_supports                           0.406
##  5 K-12_Access_to_rigorous_coursework                                      1    
##  6 K-12_Curricular_breadth                                                 0.453
##  7 K-12_Engagement_in_schooling                                            0.560
##  8 K-12_Nonacademic_supports_for_student_success                           0.585
##  9 K-12_Nonexclusionary_discipline_practices                               0.936
## 10 K-12_Performance_in_coursework                                          0.742
## 11 K-12_Performance_on_tests                                               0.756
## 12 K-12_School_climate                                                     0.199
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregat…          0.472
## 14 Pre-K_Academic_readiness                                               NA    
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_progr…          1    
## 16 Pre-K_Self-regulation_and_attention_skills                             NA
IRRstatistics_California <- CA_indicator_kappa_count_merge %>%
  left_join(CA_weighted_kappa_by_indicator, by = "indicator") %>%
  left_join(CA_percent_agreement, by = "indicator") %>%
  select(indicator, mapped_counts, percent_agreement, kappa_stat, weighted_kappa)

indicator_order <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs", 
  "K-12_Access_to_effective_teaching",
  "K-12_Access_to_rigorous_coursework",
  "K-12_Curricular_breadth",
  "K-12_Access_to_high-quality_academic_supports",
  "K-12_School_climate",
  "K-12_Nonexclusionary_discipline_practices",
  "K-12_Nonacademic_supports_for_student_success",
  "Pre-K_Academic_readiness",
  "Pre-K_Self-regulation_and_attention_skills",
  "K-12_Engagement_in_schooling",
  "K-12_Performance_in_coursework",
  "K-12_Performance_on_tests",
  "EA_On-time_graduation",
  "EA_Postsecondary_readiness"
)

indicator_name_mapping <- c(
  "K-12_Students_exposure_to_racial_ethnic_and_economic_segregation" = "Racial, Ethnic, and Economic Segregation",
  "Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs" = "High-Quality Pre-K Programs",
  "K-12_Access_to_effective_teaching" = "Effective Teaching",
  "K-12_Access_to_rigorous_coursework" = "Rigorous Coursework",
  "K-12_Curricular_breadth" = "Curricular Breadth",
  "K-12_Access_to_high-quality_academic_supports" = "High-Quality Academic Supports",
  "K-12_School_climate" = "School Climate",
  "K-12_Nonexclusionary_discipline_practices" = "Nonexclusionary Discipline Practices",
  "K-12_Nonacademic_supports_for_student_success" = "Nonacademic Supports for Student Success",
  "Pre-K_Academic_readiness" = "Academic Readiness",
  "Pre-K_Self-regulation_and_attention_skills" = "Self-Regulation and Attention Skills",
  "K-12_Engagement_in_schooling" = "Engagement in Schooling",
  "K-12_Performance_in_coursework" = "Performance in Coursework",
  "K-12_Performance_on_tests" = "Performance on Tests",
  "EA_On-time_graduation" = "On-Time Graduation",
  "EA_Postsecondary_readiness" = "Postsecondary Readiness"
)

IRRstatistics_California_ordered <- IRRstatistics_California %>%
  arrange(match(indicator, indicator_order)) %>%
  mutate(indicator_name = indicator_name_mapping[indicator]) %>%
  select(indicator_name, mapped_counts, percent_agreement, kappa_stat, weighted_kappa) %>%
    rename(
      "Indicator" = indicator_name,
      "Mapped Datasets" = mapped_counts,
      "Percent Agreement" = percent_agreement,
      "Cohen's Kappa" = kappa_stat,
      "Cohen's Weighted Kappa" = weighted_kappa) %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.) | is.nan(.), NA, .)))

kable(IRRstatistics_California_ordered)
| Indicator | Mapped Datasets | Percent Agreement | Cohen’s Kappa | Cohen’s Weighted Kappa |
|---|---:|---:|---:|---:|
| Racial, Ethnic, and Economic Segregation | 154 | 75.39936 | 0.4810293 | 0.4721985 |
| High-Quality Pre-K Programs | 9 | 100.00000 | 1.0000000 | 1.0000000 |
| Effective Teaching | 38 | 95.20767 | 0.7388330 | 0.8157523 |
| Rigorous Coursework | 28 | 100.00000 | 1.0000000 | 1.0000000 |
| Curricular Breadth | 32 | 92.65176 | 0.4361244 | 0.4529338 |
| High-Quality Academic Supports | 23 | 92.97125 | 0.2690021 | 0.4059782 |
| School Climate | 40 | 88.81789 | 0.1994885 | 0.1994885 |
| Nonexclusionary Discipline Practices | 40 | 97.76358 | 0.8926033 | 0.9356695 |
| Nonacademic Supports for Student Success | 33 | 93.29073 | 0.5717636 | 0.5846511 |
| Academic Readiness | NA | 100.00000 | NA | NA |
| Self-Regulation and Attention Skills | NA | 100.00000 | NA | NA |
| Engagement in Schooling | 96 | 82.74760 | 0.5483888 | 0.5596703 |
| Performance in Coursework | 34 | 95.84665 | 0.7422563 | 0.7422563 |
| Performance on Tests | 57 | 93.61022 | 0.7710147 | 0.7558992 |
| On-Time Graduation | 73 | 91.37380 | 0.7462543 | 0.8117440 |
| Postsecondary Readiness | 72 | 92.33227 | 0.7725842 | 0.8722232 |
write.csv(IRRstatistics_California_ordered, "20251113_IRRstatistics_California.csv", row.names = FALSE)

ALL STATE PERCENT AGREEMENT

allstates_ab_long <- bind_rows(
  OH_ab_long,
  TX_ab_long,
  NC_ab_long,
  DC_ab_long,
  KY_ab_long,
  MD_ab_long,
  OR_ab_long,
  CA_ab_long
)

str(allstates_ab_long)
## tibble [59,184 × 6] (S3: tbl_df/tbl/data.frame)
##  $ id          : chr [1:59184] "OH1218" "OH1218" "OH1218" "OH1218" ...
##  $ dataset_name: chr [1:59184] "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" "21-22_Achievement_Building.xlsx" ...
##  $ indicator   : chr [1:59184] "EA_On-time_graduation" "EA_Postsecondary_readiness" "K-12_Access_to_effective_teaching" "K-12_Access_to_high-quality_academic_supports" ...
##  $ rating_a    : num [1:59184] 0 0 0 0 0 0 0 0 0 0 ...
##  $ rating_b    : num [1:59184] 0 0 0 0 0 0 0 0 0 0 ...
##  $ agreement   : num [1:59184] 1 1 1 1 1 1 1 1 1 1 ...
dataset_agreement <- allstates_ab_long %>%
  group_by(id) %>%
  summarise(percent_agreement = mean(agreement) * 100)

dataset_agreement <- dataset_agreement %>%
  ungroup()

summary(dataset_agreement)
##       id            percent_agreement
##  Length:3699        Min.   : 25.00   
##  Class :character   1st Qu.: 87.50   
##  Mode  :character   Median : 93.75   
##                     Mean   : 93.55   
##                     3rd Qu.:100.00   
##                     Max.   :100.00
write.csv(dataset_agreement, "20251113_percent_agreement_by_id_8state.csv", row.names = FALSE)
indicators_mapped_per_id <- allstates_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(id) %>%
  summarise(indicators_mapped_per_id = n_distinct(indicator), .groups = "drop") %>%
  right_join(allstates_ab_long %>% distinct(id), by = "id") %>%
  mutate(indicators_mapped_per_id = replace_na(indicators_mapped_per_id, 0))

summary(indicators_mapped_per_id)
##       id            indicators_mapped_per_id
##  Length:3699        Min.   : 0.000          
##  Class :character   1st Qu.: 1.000          
##  Mode  :character   Median : 2.000          
##                     Mean   : 2.169          
##                     3rd Qu.: 3.000          
##                     Max.   :13.000
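The `right_join` + `replace_na` step above ensures that datasets with no nonzero ratings (which the `filter` drops) are restored with a count of 0 rather than silently omitted. A minimal sketch of that pattern on hypothetical toy data, not the study data:

```r
library(dplyr)
library(tidyr)

# Toy data: id "B" has no nonzero rating from either rater
toy <- tibble(
  id        = c("A", "A", "B"),
  indicator = c("x", "y", "x"),
  rating_a  = c(1, 0, 0),
  rating_b  = c(0, 0, 0)
)

toy %>%
  filter(rating_a != 0 | rating_b != 0) %>%          # "B" drops out here
  group_by(id) %>%
  summarise(n_ind = n_distinct(indicator), .groups = "drop") %>%
  right_join(distinct(toy, id), by = "id") %>%        # "B" comes back as NA
  mutate(n_ind = replace_na(n_ind, 0))                # NA -> 0
# "A" gets n_ind = 1; "B" gets n_ind = 0
```

Without the join-back step, the summary statistics above would be computed only over datasets that mapped to at least one indicator, overstating the mean.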

ALL STATE COHEN’S KAPPA

contingency_table <- data.frame(allstates_ab_long$rating_a, allstates_ab_long$rating_b)
kappa_result <- kappa2(contingency_table, weight = "unweighted")
print(kappa_result)
##  Cohen's Kappa for 2 Raters (Weights: unweighted)
## 
##  Subjects = 59184 
##    Raters = 2 
##     Kappa = 0.685 
## 
##         z = 240 
##   p-value = 0
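The overall kappa above comes from `irr::kappa2()`. As a sketch of what that call measures, Cohen’s unweighted kappa can be reproduced by hand as observed agreement corrected for chance agreement; the toy ratings below are hypothetical, not the study data:

```r
# Hypothetical toy ratings (illustration only, not the study data)
rating_a <- c(0, 0, 1, 1, 0, 1, 0, 0, 1, 0)
rating_b <- c(0, 1, 1, 1, 0, 0, 0, 0, 1, 0)

p_o <- mean(rating_a == rating_b)                     # observed agreement: 0.8
p_a <- table(factor(rating_a, levels = 0:1)) / length(rating_a)
p_b <- table(factor(rating_b, levels = 0:1)) / length(rating_b)
p_e <- sum(p_a * p_b)                                 # chance agreement: 0.52
(p_o - p_e) / (1 - p_e)                               # kappa: 0.5833333
# matches irr::kappa2(data.frame(rating_a, rating_b))$value
```

Chance agreement is the probability that the two raters would agree if each assigned categories independently at their observed marginal rates, which is why kappa can be well below the raw percent agreement.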
kappa_by_indicator <- allstates_ab_long %>%
  group_by(indicator) %>%
  summarise(
    kappa_stat = {
      contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      kappa_result <- kappa2(contingency_table, weight = "unweighted")
      kappa_result$value  
    },
    .groups = "drop"
  )

kappa_by_indicator
## # A tibble: 16 × 2
##    indicator                                                        kappa_stat
##    <chr>                                                                 <dbl>
##  1 EA_On-time_graduation                                                 0.745
##  2 EA_Postsecondary_readiness                                            0.670
##  3 K-12_Access_to_effective_teaching                                     0.771
##  4 K-12_Access_to_high-quality_academic_supports                         0.312
##  5 K-12_Access_to_rigorous_coursework                                    0.665
##  6 K-12_Curricular_breadth                                               0.456
##  7 K-12_Engagement_in_schooling                                          0.612
##  8 K-12_Nonacademic_supports_for_student_success                         0.473
##  9 K-12_Nonexclusionary_discipline_practices                             0.768
## 10 K-12_Performance_in_coursework                                        0.482
## 11 K-12_Performance_on_tests                                             0.758
## 12 K-12_School_climate                                                   0.598
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_segregation      0.598
## 14 Pre-K_Academic_readiness                                              0.862
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_programs      0.560
## 16 Pre-K_Self-regulation_and_attention_skills                            0.674
mapped_counts <- allstates_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(indicator) %>%
  summarise(n_mapped = n_distinct(id), .groups = "drop")

summary(mapped_counts)
##   indicator            n_mapped     
##  Length:16          Min.   :  47.0  
##  Class :character   1st Qu.: 157.5  
##  Mode  :character   Median : 257.0  
##                     Mean   : 501.4  
##                     3rd Qu.: 861.8  
##                     Max.   :1514.0
kappa_with_ci <- kappa_by_indicator %>%
  left_join(mapped_counts, by = "indicator") %>%
  mutate(standard_error = sqrt((1 - kappa_stat^2) / n_mapped),
         ci_lower = kappa_stat - 1.96 * standard_error,
         ci_upper = kappa_stat + 1.96 * standard_error
         )

kappa_with_ci
## # A tibble: 16 × 6
##    indicator                kappa_stat n_mapped standard_error ci_lower ci_upper
##    <chr>                         <dbl>    <int>          <dbl>    <dbl>    <dbl>
##  1 EA_On-time_graduation         0.745     1094         0.0202    0.705    0.784
##  2 EA_Postsecondary_readin…      0.670      993         0.0236    0.623    0.716
##  3 K-12_Access_to_effectiv…      0.771      264         0.0392    0.695    0.848
##  4 K-12_Access_to_high-qua…      0.312      250         0.0601    0.194    0.430
##  5 K-12_Access_to_rigorous…      0.665      387         0.0380    0.591    0.740
##  6 K-12_Curricular_breadth       0.456      212         0.0611    0.336    0.576
##  7 K-12_Engagement_in_scho…      0.612     1205         0.0228    0.567    0.657
##  8 K-12_Nonacademic_suppor…      0.473      111         0.0836    0.309    0.637
##  9 K-12_Nonexclusionary_di…      0.768      163         0.0502    0.669    0.866
## 10 K-12_Performance_in_cou…      0.482      500         0.0392    0.405    0.559
## 11 K-12_Performance_on_tes…      0.758     1514         0.0168    0.725    0.791
## 12 K-12_School_climate           0.598      241         0.0516    0.497    0.699
## 13 K-12_Students_exposure_…      0.598      818         0.0280    0.544    0.653
## 14 Pre-K_Academic_readiness      0.862       83         0.0556    0.753    0.971
## 15 Pre-K_Access_to_and_par…      0.560      141         0.0697    0.424    0.697
## 16 Pre-K_Self-regulation_a…      0.674       47         0.108     0.462    0.885
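The intervals above use the large-sample approximation SE ≈ sqrt((1 − κ²)/n), with the number of mapped datasets as n. As a quick arithmetic check against one row of the output (Pre-K self-regulation and attention skills, κ = 0.674, n = 47):

```r
# Verify the approximate 95% CI arithmetic for one row of the table above
kappa_stat <- 0.674
n_mapped   <- 47
se <- sqrt((1 - kappa_stat^2) / n_mapped)   # ~0.108
kappa_stat + c(-1.96, 1.96) * se            # ~0.462, ~0.885
```

As expected, indicators mapped to fewer datasets (e.g., n = 47) carry noticeably wider intervals than those mapped to over a thousand.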
indicator_kappa_count_merge <- kappa_by_indicator %>%
  left_join(mapped_counts, by = "indicator")

indicator_kappa_count_merge
## # A tibble: 16 × 3
##    indicator                                                 kappa_stat n_mapped
##    <chr>                                                          <dbl>    <int>
##  1 EA_On-time_graduation                                          0.745     1094
##  2 EA_Postsecondary_readiness                                     0.670      993
##  3 K-12_Access_to_effective_teaching                              0.771      264
##  4 K-12_Access_to_high-quality_academic_supports                  0.312      250
##  5 K-12_Access_to_rigorous_coursework                             0.665      387
##  6 K-12_Curricular_breadth                                        0.456      212
##  7 K-12_Engagement_in_schooling                                   0.612     1205
##  8 K-12_Nonacademic_supports_for_student_success                  0.473      111
##  9 K-12_Nonexclusionary_discipline_practices                      0.768      163
## 10 K-12_Performance_in_coursework                                 0.482      500
## 11 K-12_Performance_on_tests                                      0.758     1514
## 12 K-12_School_climate                                            0.598      241
## 13 K-12_Students_exposure_to_racial_ethnic_and_economic_seg…      0.598      818
## 14 Pre-K_Academic_readiness                                       0.862       83
## 15 Pre-K_Access_to_and_participation_in_high-quality_pre-K_…      0.560      141
## 16 Pre-K_Self-regulation_and_attention_skills                     0.674       47
clean_labels <- sub("^[^_]+_", "", indicator_kappa_count_merge$indicator)
clean_labels <- gsub("_", " ", clean_labels)
clean_labels <- sub("^(K-12|Ea|Pre-k)\\s+", "", clean_labels, ignore.case = TRUE)
clean_labels <- str_to_sentence(clean_labels)
wrapped_labels <- str_wrap(clean_labels, width = 35)


indicator_kappa_count_merge$indicator_cleaned <- factor(
  wrapped_labels,
  levels = wrapped_labels[order(indicator_kappa_count_merge$kappa_stat)]
)
kappa_with_ci$indicator_cleaned <- factor(wrapped_labels,
  levels = wrapped_labels[order(indicator_kappa_count_merge$kappa_stat)]
)
kappaforestplot_8state <- ggplot(kappa_with_ci, aes(x = kappa_stat, y = indicator_cleaned)) +
  geom_point(aes(size = n_mapped), color = "lightblue") +
  geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.2) +
  scale_x_continuous(limits = c(-1, 1), breaks = seq(-1, 1, by = 0.2), expand = c(0,0)) +
  scale_size_continuous(breaks = c(250, 500, 750, 1000, 1250, 1500)) + 
  theme_minimal() +
  theme(
    axis.text.y = element_text(size = 8),
    panel.grid.minor = element_blank(),
    axis.title.y = element_text(margin = margin(r = 10)),
    axis.title.x = element_text(margin = margin(t = 10)),
    axis.line.x = element_line(color = "black"),
    axis.line.y = element_line(color = "black"),
    panel.border = element_blank()
  ) +
  labs(
    title = "Interrater Reliability Across All States",
    subtitle = "Cohen's Kappa with 95% Confidence Interval",
    x = "Cohen's Kappa",
    y = "Indicator",
    size = "Number of \nDatasets"
  )
## Warning: `geom_errorbarh()` was deprecated in ggplot2 4.0.0.
## ℹ Please use the `orientation` argument of `geom_errorbar()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
kappaforestplot_8state
## `height` was translated to `width`.

ggsave("20251113_kappaforestplot_8state.svg", plot = kappaforestplot_8state, width = 6.5, height = 6.5)
## `height` was translated to `width`.

COHEN’S KAPPA BY STATE

allstates_ab_long <- allstates_ab_long %>%
  mutate(state = str_sub(id, 1, 2))

kappa_by_state <- allstates_ab_long %>%
  group_by(state) %>%
  summarise(
    kappa_stat = {
      contingency_table <- data.frame(rating_a = rating_a, rating_b = rating_b)
      kappa_result <- kappa2(contingency_table, weight = "unweighted")
      kappa_result$value
    },
    .groups = "drop"
  )

kappa_by_state
## # A tibble: 8 × 2
##   state kappa_stat
##   <chr>      <dbl>
## 1 CA         0.683
## 2 DC         0.547
## 3 KY         0.682
## 4 MD         0.665
## 5 NC         0.657
## 6 OH         0.654
## 7 OR         0.771
## 8 TX         0.707
mapped_counts_state <- allstates_ab_long %>%
  filter(rating_a != 0 | rating_b != 0) %>%
  group_by(state) %>%
  summarise(n_mapped = n_distinct(id), .groups = "drop")

summary(mapped_counts_state)
##     state              n_mapped    
##  Length:8           Min.   :267.0  
##  Class :character   1st Qu.:306.0  
##  Mode  :character   Median :454.0  
##                     Mean   :452.4  
##                     3rd Qu.:500.0  
##                     Max.   :804.0
kappa_with_ci_state <- kappa_by_state %>%
  left_join(mapped_counts_state, by = "state") %>%
  mutate(standard_error = sqrt((1 - kappa_stat^2) / n_mapped),
         ci_lower = kappa_stat - 1.96 * standard_error,
         ci_upper = kappa_stat + 1.96 * standard_error
         )

kappa_with_ci_state
## # A tibble: 8 × 6
##   state kappa_stat n_mapped standard_error ci_lower ci_upper
##   <chr>      <dbl>    <int>          <dbl>    <dbl>    <dbl>
## 1 CA         0.683      313         0.0413    0.602    0.764
## 2 DC         0.547      285         0.0496    0.449    0.644
## 3 KY         0.682      563         0.0308    0.622    0.743
## 4 MD         0.665      479         0.0341    0.598    0.732
## 5 NC         0.657      267         0.0461    0.567    0.748
## 6 OH         0.654      444