Chronic Absenteeism in Lynn Public Schools
For my capstone project in the MS program, I wanted to find out whether the distance a student lives from school has any measurable impact on whether they show up.
I geocoded student addresses across the district and correlated each student's distance from school with their absenteeism record. I wrote scripts in R to run the full analysis: scatter plots, violin plots, box plots, KDE heatmaps, hexbin maps, and distance band breakdowns. I also broke the data down by grade level, ethnicity, ML status, and SPED status to look for patterns across subgroups.
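To make the core step concrete, here is a minimal sketch of how a distance-versus-absence correlation could be computed once addresses are geocoded. It is written in Python for illustration, not taken from the R scripts used in the project. It assumes straight-line (haversine) distance and a Pearson correlation, neither of which is confirmed as the project's actual method, and the function names and the (latitude, longitude, days absent) record layout are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


def distance_absence_correlation(school, students):
    """Correlate each student's distance from school with days absent.

    school:   (lat, lon) of the school (hypothetical layout)
    students: iterable of (lat, lon, days_absent) records (hypothetical layout)
    """
    school_lat, school_lon = school
    distances = []
    absences = []
    for lat, lon, days_absent in students:
        distances.append(haversine_km(school_lat, school_lon, lat, lon))
        absences.append(days_absent)
    return pearson_r(distances, absences)
```

A coefficient near zero from a function like this would match the "weak relationship" reading. Straight-line distance is only a proxy: walking or bus-route distance may differ considerably, which is one reason the subgroup breakdowns and maps can show patterns that a single coefficient hides.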
The result: an extremely slight negative correlation. Distance alone is not a strong predictor of attendance. But the maps tell a much richer story about where students live, where the hotspots are, and how absenteeism patterns vary across demographics.