Recently, we had a lecture about Reproducible research; one of the slides said:
Reproducibility: start from the same samples/data, use the same methods, get the same results.
Replicability: conduct the experiment again with independent samples and/or methods to get confirmatory results.
Replicability = Reproducibility + Conduct experiment again
Replicability might be challenging in epidemiology (recruiting a cohort again) or molecular biology (complex cell manipulation).
Reproducibility should be a minimum standard. One should strive to at least make his/her own research reproducible.
I find many articles that use software, and when they hint at what kind of analysis they did, they provide neither the code nor the data. There are free public services for storing the data and code of studies, even with a DOI.
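For computational work, even a small amount of bookkeeping makes a study much easier to rerun. Below is a minimal, hypothetical sketch of what "same data, same methods, same results" could look like in a Python-based analysis; the file names, seed, and manifest format are my own illustration, not taken from any particular study:

    # Hypothetical sketch of a reproducible computational analysis: fix the
    # random seed, checksum the input data, and record the versions used, so
    # anyone starting from the same data and code gets the same results.
    import hashlib
    import json
    import platform
    import random

    DATA_FILE = "expression_matrix.tsv"  # hypothetical input data shared alongside the paper
    SEED = 42                            # fixed so the stochastic parts are repeatable

    def sha256_of(path: str) -> str:
        """Checksum the input so readers can verify they start from the same data."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                h.update(chunk)
        return h.hexdigest()

    def main() -> None:
        random.seed(SEED)
        # ... the actual analysis would go here ...
        # Record what is needed to rerun the analysis and compare results.
        manifest = {
            "data_sha256": sha256_of(DATA_FILE),
            "seed": SEED,
            "python_version": platform.python_version(),
        }
        with open("reproducibility_manifest.json", "w") as out:
            json.dump(manifest, out, indent=2)

    if __name__ == "__main__":
        main()

Depositing such a script together with the data in a public archive that mints a DOI would let any reader rerun the analysis and compare the recorded checksum and versions against their own.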
So far, I thought that some data might not be freely available because it contains private information (my field is bioinformatics), or because the authors intend to use it for further investigations and want to keep it to themselves.
The same happens with code that is the intellectual property of the lab or principal investigator, but retaining the rights to the code does not go against reproducibility.
Why are these papers accepted and published, even if they don't allow reproducibility?
Related: Reproducible Studies?, and an example of the problem this causes: Can up to 70% of scientific studies not be reproduced?
Edit:
Some other papers about replicability: 1, 2.
Excel case: in this paper we can see an example. The reviewers of Growth in a Time of Debt estimated (note that this is not a measurable/checkable claim) that the analysis could yield the results presented.
(I couldn't find a description of the methods used for the analysis in the paper, but it is a field different from mine and I have only skimmed through it.)
But for new methods without prior experience/validation, how can one estimate this without looking at the analysis itself?
And in "old" methods, how bad would be to share them if they are already checked?"Understand why replicability is important, and you'll understand which guidelines and rules should be applied, and how to deal with research where guidelines are skipped.":
Replicability is important because science is about finding objective, measurable relations, which makes the relationship independent of who performs the study. But this can be discussed/answered in another question :)
I am aware that we all make mistakes (see my other question here on Academia), but we should aim for the best behavior and the best science.
"Put another way, there are finite resources so the more you rerun the same code the less scientific progress you make overall"
I don't think that we make less scientific progress overall by rerunning the same code. Checking so that we know for sure that A is true is far better than working for 3 years or more and then discovering that A was wrong.
How many finite resources are/were used on studies based on results that the ALS Therapy Development Institute couldn't replicate?
At the same time, induced pluripotent stem cells were hard to reproduce and replicate, but that case isn't software based; or take the recent example of @tpg2114, who needed 3 years to replicate their own study in 4 new settings.
Quality of academic software when sharing it: here it seems that it is better to share the awful, crappy code than to hide it.
The necessity for the reviewer to check the code and data was answered here. In short:
Of course, the degree to which a referee is expected to verify the correctness of results varies greatly between fields. But you can always choose a personal standard higher than what's usual in your field. Just realize that good refereeing takes a significant time investment.
The comments are getting long, and from the answers it seems that checking reproducibility is not the job of the reviewers or the journals; it might be a job for the reader (and hence of no one), or it isn't worth making articles reproducible because it is too hard.