There are at least four benefits to releasing the code and data produced during a thesis or research project (quoting this laboratory verbatim):
- Allows to reproduce figures in the revisions of a paper
- Other people who want to do research in the field can start from the current state of the art, instead of spending months trying to figure out what was exactly done in a certain paper
- Makes easier to compare the method to existing ones
- Increases the impact of the research
This is really cool for someone trying to develop their own study, since they get access to world-class material for free. However, I have observed that this practice is followed by well-known universities that receive federal financial support and usually host international students. Is it also recommended for universities with little or no federal financial support and no grants for students?
I am curious whether a student from such a university would improve their chances of working in academia by releasing their material in this reproducible-research fashion.
Answer
An article about a computational result is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result. (Buckheit & Donoho, 1995)
Yes, you should publish your code and data. Reproducibility is part of the definition of science: if the results of your experiments or computations cannot be replicated by different people in a different location, then you're not doing science. Far from being a mere philosophical concern, reproducible research has been a key issue in prominent controversies such as Climategate and clinical trials in cancer research.
The person most likely to benefit from your efforts to clean up and publish your code and data is your future self. Why?
Error is ubiquitous in scientific computing...I find that researchers quite generally forget what they have done and misrepresent their computations. (Donoho)
The first step toward working reproducibly is simply to put the code and data used in your published research out in the open.
You may adopt reproducible research practices for philosophical reasons, but you will soon find that they bring more direct benefits. Because you write code and prepare data with the expectation that it will be seen by others, you'll find it much easier for yourself and your colleagues to build on past work. New collaborations may form when others discover your work through openly released code and data. And the code itself may be the main subject of publications in journals that have come to recognize the importance of scientific software.
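As a rough illustration of what releasing "the full software environment, code and data" can look like in practice, here is a minimal Python sketch that writes a manifest alongside a data directory. It records the interpreter version, the platform, and a SHA-256 checksum for every data file, so that a reader can check they are working with the same inputs. The function names (`build_manifest`, `write_manifest`) and the manifest layout are assumptions made for this example, not an established standard.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Describe the interpreter, the platform, and every file under data_dir."""
    files = {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "files": files,
    }


def write_manifest(data_dir: Path, out_path: Path) -> None:
    """Write the manifest as pretty-printed JSON (hypothetical layout)."""
    manifest = build_manifest(data_dir)
    out_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
```

Calling `write_manifest(Path("data"), Path("MANIFEST.json"))` before a release, with paths adjusted to your own project, gives readers a simple way to confirm that their copy of the data matches yours. It does not capture installed package versions; a pinned requirements file would be a natural companion.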
More resources: