Preregistration of Machine Learning Research Design. Against P-hacking
In: BEING PROFILED: COGITAS ERGO SUM, ed. Emre Bayamlıoğlu, Irina Baraliuc, Liisa Janssens, Mireille Hildebrandt, Amsterdam University Press 2018 (Forthcoming)
3 Pages. Posted: 19 Oct 2018
Date Written: September 27, 2018
This brief provocation targets the mantra of a trade-off between accuracy and interpretability: the higher the accuracy, the lower the interpretability. This trade-off seems to appeal to a deep-seated desire for magical thinking: the lure of things that work well even if we have no idea why. The suggestion is that in the realm of specific types of machine learning (ML), neither causality nor reasoning matters; correlation and prediction are all that counts. The story goes that not just lay people (those using an ML application or those targeted by its decisions) but even those who developed the application cannot explain why it gets things right.
I will confront this narrative from the perspective of ML research design, arguing that accuracy depends on the appropriate selection and curation of training and validation data, a properly articulated machine-readable task, a well-developed hypothesis space, and the selection of a relevant performance metric. The choice of metric, in particular, may give rise to P-hacking: cherry-picking the most favourable performance metric after the results are in. It follows that accuracy in the realm of data should not be conflated with correctness in the realm of atoms. In other words, if we cannot explain why an ML application gets things right, we cannot be sure that it gets things right.
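The point about cherry-picking a performance metric can be made concrete with a minimal sketch in Python. It is an illustration only and not taken from the chapter: the toy labels, the two hypothetical models (`MODEL_A`, `MODEL_B`), and the name `PREREGISTERED_METRIC` are all assumptions. On the same validation data, each model "wins" on some metric, so a researcher who chooses the metric after seeing the results can report either one as the better system. Fixing the metric in advance removes that freedom.

```python
from typing import Callable, Dict, List

# Toy validation labels (assumed for illustration): 2 positives, 8 negatives.
Y_TRUE: List[int] = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Two hypothetical models' predictions on the same validation set.
MODEL_A: List[int] = [0] * 10                        # always predicts negative
MODEL_B: List[int] = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # flags many cases as positive


def confusion(y_true: List[int], y_pred: List[int]):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    tp, fp, fn, tn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)


def precision(y_true: List[int], y_pred: List[int]) -> float:
    tp, fp, _, _ = confusion(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: List[int], y_pred: List[int]) -> float:
    tp, _, fn, _ = confusion(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


METRICS: Dict[str, Callable[[List[int], List[int]], float]] = {
    "accuracy": accuracy,
    "precision": precision,
    "recall": recall,
}


def scores(y_pred: List[int]) -> Dict[str, float]:
    """Evaluate one model on every available metric."""
    return {name: fn(Y_TRUE, y_pred) for name, fn in METRICS.items()}


def favourable_metrics(mine: Dict[str, float], other: Dict[str, float]) -> List[str]:
    """Metrics on which `mine` beats `other`: the menu open to a cherry-picker."""
    return [name for name in METRICS if mine[name] > other[name]]


SCORES_A = scores(MODEL_A)
SCORES_B = scores(MODEL_B)

# Post hoc, each model can be presented as the winner on some metric.
print("A looks better on:", favourable_metrics(SCORES_A, SCORES_B))
print("B looks better on:", favourable_metrics(SCORES_B, SCORES_A))

# Preregistration (hypothetical choice): the metric is fixed before evaluation.
PREREGISTERED_METRIC = "recall"
print("Preregistered comparison:",
      SCORES_A[PREREGISTERED_METRIC], "vs", SCORES_B[PREREGISTERED_METRIC])
```

In this toy setting the model that never flags a positive case has the higher accuracy, which is one way of seeing why a favourable score in the realm of data need not mean the system gets things right.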
Keywords: Machine Learning, Research Design, Trade-offs, Accuracy and Interpretability, P-hacking