A Large-Scale Corpus for Assessing Source-Based Writing Quality: ASAP 2.0

21 Pages Posted: 8 Feb 2025

Scott Andrew Crossley

Vanderbilt University

Perpetual Baffour

affiliation not provided to SSRN

L. Burleigh

affiliation not provided to SSRN

Jules King

affiliation not provided to SSRN

Abstract

This paper introduces ASAP 2.0, a dataset of ~25,000 source-based argumentative essays written by U.S. secondary students. The corpus addresses shortcomings of the original ASAP corpus by including demographic data, consistent scoring rubrics, and source texts. ASAP 2.0 aims to support the development of unbiased, sophisticated Automatic Essay Scoring (AES) systems that can foster improved educational practices by providing summative feedback to students. The corpus is designed for broad accessibility, with the hope of facilitating research into writing quality and AES system biases.

Keywords: Corpus Linguistics, source-based writing, writing quality

Suggested Citation

Crossley, Scott Andrew and Baffour, Perpetual and Burleigh, L. and King, Jules, A Large-Scale Corpus for Assessing Source-Based Writing Quality: ASAP 2.0. Available at SSRN: https://ssrn.com/abstract=5129353 or http://dx.doi.org/10.2139/ssrn.5129353

Scott Andrew Crossley (Contact Author)

Vanderbilt University

2301 Vanderbilt Place
Nashville, TN 37240
United States
