Tuesday 25 April 2017

When should you publish code on GitHub? Work-in-progress or after publication?



What policy do you follow for publishing data analysis code on GitHub? Do you publish it after the paper is out, or while it is still a work in progress?


I developed a number of Python algorithms to analyse a large dataset, and I would like to make my work visible.



Answer



A movement encouraging researchers to publish their code has been gathering strength lately:



Nature: "Publish your computer code: it is good enough"


Or, more vehemently: "If there is no code, there is no paper"


The reasons outlined in those articles are sound. If you are expected to publish detailed derivations, experimental methods, and proofs of theorems, why would you be allowed to keep the code to yourself? No one will accept a theorem if you claim: "the proof is too messy to show, but hey, here are three cases where it works".


I think the best approach is to publish the code you used as supplementary material and include a link to the repository, so people can get later, improved versions. If you are concerned about people relying on unstable, bleeding-edge versions, make tagged releases but keep the development public. This will also help you get bug fixes and contributions.


Thank you for wanting to release your code. I really believe this attitude will help make research better.


Edit:


After some time, I have something to add. Most of the code in an application is there for "administrative purposes": loading and writing data, massaging it, checking conditions... For publishing, that part can be as hackish as it needs to be. The real "research" usually lives in a small part of the code. That is where one should spend an hour or two adding a few comments and cleaning up the code.


For the rest, a docstring in each function or a paragraph explaining its aim should be fine.
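As a rough illustration of that split, here is a minimal sketch in Python. The function names (`load_measurements`, `trimmed_mean`) and the analysis itself are hypothetical, made up for this example and not taken from the question: the plumbing gets a one-line docstring, while the small "research" core gets step-by-step comments.

```python
import csv
import io


def load_measurements(stream):
    """Read one float per row from a CSV stream (plumbing: a docstring is enough)."""
    return [float(row[0]) for row in csv.reader(stream) if row]


def trimmed_mean(values, trim_fraction=0.1):
    """Return the mean of `values` after discarding the extremes at both ends.

    Hypothetical stand-in for the "research" part of an analysis, so this is
    the piece that gets the careful comments.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    # Number of points to drop at each end of the sorted data.
    k = int(len(ordered) * trim_fraction)
    # Keep only the central part; outliers at both ends are excluded.
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("no data left after trimming")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # In-memory data keeps the example self-contained; a real script would open a file.
    data = load_measurements(io.StringIO("1\n2\n3\n4\n100\n"))
    print(trimmed_mean(data, trim_fraction=0.2))
```

A reader who only wants to check the method can go straight to `trimmed_mean` and skip the loader entirely.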

