Friday 31 January 2020

Open-source the project code before or after publication?


I recently completed a part of my project and submitted a paper to a conference. Let's call the paper "project x: for this and that". Now, I wish to open-source project x to facilitate reproducible research and to have more (at least some) people use it (and cite it!).


Are there any specific drawbacks or risks involved in open-sourcing project x on, say, GitHub or SourceForge? Note that I would still be improving project x, and possibly sending an extended version to a journal (my area of work is Computer Science).


I understand that if a conference/journal requires double-blind review and my project is searchable on the Internet, I am revealing my identity to the reviewers. This is bad, right?


Are there any other cons I should be considering? And are there any pros of open-sourcing before making it into a publication?



Answer




Double-blind review is more common at conferences than at journals. For example, I don't know of any journals using double-blind review in my field (machine learning). I'm going to assume revealing your identity is not an issue (if it is, circumstances differ).


Whenever relevant I provide an implementation along with a paper. This also helps reviewers, in case they want to fiddle with an algorithm under slightly different circumstances than those reported in the paper (which is a good thing!). When an implementation is provided, the option is there.


The pros are increased visibility, reproducibility and (in my opinion) credibility since you allow everyone to try for themselves instead of taking your word for it in the paper. On rare occasions, your software may become quite popular during the review period, which may positively impact the paper under review.


A potential con is that someone may discover a critical bug in your implementation. From the perspective of software engineering this is always helpful since you can then improve the software. For the associated paper this may be a good, bad or irrelevant thing, depending on the type of bug:



  1. One that does not influence the results reported in your paper: no big deal. Simply fix it and move on. In the best case this improves the user experience of your software; in the worst case you lose a bit of time fixing something unimportant.

  2. One that does influence the results: big deal. This will at least delay a potential publication. Of course it is better to find such errors and fix them than to publish erroneous conclusions, but this may have an impact on credibility.

