Towards the Automated Evaluation of Crowd Work: Machine-Learning Based Classification of Complex Texts Simplified by Laymen
46th Hawaii International Conference on System Sciences, January 2013
11 Pages. Posted: 1 Aug 2014
Date Written: 2013
The work paradigm of crowdsourcing holds huge potential for organizations by providing access to a large workforce. However, a growing volume of crowd work entails a growing effort to evaluate the quality of submissions. Because evaluation by experts is inefficient, time-consuming, expensive, and not guaranteed to be effective, this paper presents a concept for an automated classification process for crowd work. Using the example of crowd-generated patent transcripts, we build on interdisciplinary research to present an approach to classifying submissions along two dimensions: correctness and readability. To this end, we identify and select text attributes from different disciplines as input for machine-learning classification algorithms and evaluate the suitability of three well-regarded algorithms: Neural Networks, Support Vector Machines, and k-Nearest Neighbors. Key findings are that the proposed classification approach is feasible and that the SVM classifier performs best in our experiment.
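To illustrate the general idea of the approach, the following is a minimal, self-contained sketch of one of the three evaluated algorithm families (k-Nearest Neighbors) applied to surface text attributes. The two features shown (average sentence length and average word length) are hypothetical stand-ins chosen for illustration; the paper draws its actual attribute set from several disciplines, and the specific feature values below are invented toy data, not results from the study.

```python
import math
from collections import Counter

def extract_features(text):
    # Hypothetical surface attributes standing in for the paper's
    # interdisciplinary feature set: average sentence length in words
    # and average word length in characters.
    sentences = [s for s in text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    words = text.split()
    avg_sent_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return (avg_sent_len, avg_word_len)

def knn_classify(train, query, k=3):
    # train: list of (feature_vector, label) pairs; query: feature_vector.
    # Sort training points by Euclidean distance to the query and take a
    # majority vote among the k nearest neighbors.
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy training data: short sentences and words labeled "readable",
# long ones labeled "hard". These numbers are illustrative only.
train = [
    ((5.0, 4.0), "readable"),
    ((6.0, 4.5), "readable"),
    ((22.0, 7.0), "hard"),
    ((25.0, 8.0), "hard"),
    ((24.0, 7.5), "hard"),
]

print(knn_classify(train, (6.5, 4.2)))   # a short-sentence submission
print(knn_classify(train, (23.0, 7.2)))  # a long-sentence submission
```

The same pipeline shape (extract attributes, train a classifier, predict a quality label) carries over to the SVM and Neural Network classifiers compared in the paper; only the classification step changes.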