The Scientific Method in Practice: Reproducibility in the Computational Sciences

Victoria Stodden

Columbia University - Department of Statistics

February 9, 2010

MIT Sloan Research Paper No. 4773-10

Since the 1660s the scientific method has included reproducibility as a mainstay in its effort to root out error from scientific discovery. With the explosive growth of digitization in scientific research and communication, it is easier than ever to satisfy this requirement. In computational research, experimental details and methods can be recorded in code and scripts, data is digital, and papers are frequently online; the result is the potential for “really reproducible research.” Imagine the ability to routinely inspect code and data and recreate others’ results: every step taken to achieve the findings can potentially be transparent. Now imagine anyone with an Internet connection and the capability of running the code being able to do this.

This paper investigates the obstacles blocking the sharing of code and data to understand the conditions under which computational scientists reveal their full research compendium. A survey of registrants at a top machine learning conference (NIPS) was used to discover the strength of underlying factors that affect the decision to reveal code, data, and ideas. Sharing of code and data is becoming more common: about a third of respondents post some on their websites, and about 85% self-report having some code or data publicly available on the web. Contrary to theoretical expectations, the decision to share work is grounded in communitarian norms, although when work remains hidden, private incentives dominate the decision. We find that code, data, and ideas are each regarded differently in terms of how they are revealed, and that guidance from scientific norms varies with the pervasiveness of computation in the field. The largest barriers to sharing are the time involved in preparing work and the legal intellectual property framework scientists face.

This paper does two things. First, it provides evidence in the debate about whether scientists’ research-revealing behavior is wholly governed by considerations of personal impact or whether the reasoning behind the decision to reveal involves larger scientific ideals. Second, it describes the actual sharing behavior in the machine learning community.

Number of Pages in PDF File: 33

Date posted: February 9, 2010  

Suggested Citation

Stodden, Victoria, The Scientific Method in Practice: Reproducibility in the Computational Sciences (February 9, 2010). MIT Sloan Research Paper No. 4773-10. Available at SSRN: http://ssrn.com/abstract=1550193 or http://dx.doi.org/10.2139/ssrn.1550193

Contact Information

Victoria Stodden (Contact Author)
Columbia University - Department of Statistics
3022 Broadway
New York, NY 10027
United States