Analyzing Gender & Performance in Competitive Environments with Machine Learning: A High School Debate Case Study
22 Pages. Posted: 28 Mar 2022
Date Written: February 24, 2022
Abstract
In this paper, I examine the relationship between gender and performance in high school Public Forum, a two-on-two style of debate in which judges subjectively decide the winner, over three seasons from 2019 to 2022. I use a variety of metrics to gauge success, including a composite score, speaker points, and win rates. I classify the gender of competitors using Natural Language Processing on a training set of ethnically representative names based on data provided by the National Speech and Debate Association. I introduce Gender Dominance, a calculus-based approach to quantifying the gender disparity (in terms of overrepresentation and underrepresentation) in any given range of a performance metric. I find that, year over year, the gender disparity has decreased in every examined performance metric, to the point where, for the 2021-22 season, none of them indicates a statistically significant difference in performance between male and female competitors. Finally, I identify several factors to which this decrease could be attributed, including debater-led advocacy (both in and out of round) and tournament-led advocacy regarding gender discrimination. My research does not guarantee that any portion of rounds is free from discrimination. Instead, it compares trends between gender and performance and offers possible causal explanations, which future research should work to test, with the ultimate goal of creating a general plan of attack to address gender disparities within any competitive field.
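The abstract does not give the formula for Gender Dominance, so the following is only a minimal sketch of one possible reading, not the paper's method. It assumes that disparity over a range [low, high] of a metric can be taken as the integral of the difference between the two groups' score densities, which equals the difference between the shares of each group that fall inside the range. The function names `share_in_range` and `range_disparity` and the sample speaker points are hypothetical.

```python
from bisect import bisect_left, bisect_right


def share_in_range(scores, low, high):
    """Fraction of scores inside the closed interval [low, high].

    This is the integral of the empirical density over the interval.
    """
    ordered = sorted(scores)
    if not ordered:
        raise ValueError("scores must not be empty")
    count = bisect_right(ordered, high) - bisect_left(ordered, low)
    return count / len(ordered)


def range_disparity(group_a, group_b, low, high):
    """Assumed disparity measure (not the paper's definition).

    Positive: group_a is overrepresented in [low, high].
    Negative: group_a is underrepresented in [low, high].
    """
    return share_in_range(group_a, low, high) - share_in_range(group_b, low, high)


if __name__ == "__main__":
    # Invented speaker points, for illustration only.
    group_a = [27.0, 28.0, 28.5, 29.0, 29.5]
    group_b = [26.5, 27.0, 27.5, 28.0, 29.0]
    print(round(range_disparity(group_a, group_b, 28.5, 30.0), 2))
```

Under this assumption, a value near zero in every range of a metric would correspond to the abstract's finding of no significant disparity, while a real analysis would still need a significance test on top of the raw difference.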
Keywords: Machine Learning, Natural Language Processing, Gender Dominance Factor, Gender Disparities, Debate
JEL Classification: C80