Lately, I have been reading scientific articles and opinion pieces about one of the biggest fears that I, and I am sure other scientists in the field, often face: the fear that bugs may be lurking somewhere in the trail of code that we depend on so dearly to analyse and interpret datasets and to draw scientific judgements and conclusions. Here is a recent article in Nature which should force us all to re-think and re-evaluate our coding practices.
In my view, and this is what I also tell my students, scientists should try to follow these basic guidelines when writing code:
– Mention the name of the coder and date along with any versioning
– If the code is based on or derived from some other code, mention that too
– Use extensive commenting, for your own and others’ sake
– Check and double-check your code
Recently, there has been a lot of advocacy for making code public; however, I think only well-documented code should be published, because undocumented or badly documented code can actually cause more confusion. Here is an old article in Nature which argues to the contrary. Let us know your coding practices and recommendations in the comments.
Note: This post was inspired by this fascinating article in Nature, which I encourage you all to read: Computational science: …Error