A Large-Scale Corpus for Assessing Source-Based Writing Quality: ASAP 2.0
21 Pages. Posted: 8 Feb 2025
Abstract
This paper introduces ASAP 2.0, a dataset of approximately 25,000 source-based argumentative essays written by U.S. secondary students. The corpus addresses shortcomings of the original ASAP corpus by including demographic data, consistent scoring rubrics, and source texts. ASAP 2.0 aims to support the development of unbiased, sophisticated Automatic Essay Scoring (AES) systems that can foster improved educational practices by providing summative feedback to students. The corpus is designed for broad accessibility, with the hope of facilitating research into writing quality and AES system biases.
Keywords: Corpus Linguistics, source-based writing, writing quality