One specific challenge when writing code as a scientist is that we care a lot about getting the right answer, but the right answer is not always obvious. So we should be very careful with the code we write. A piece of code that crashes is annoying, but a piece of code that runs and gives you the wrong answer can compromise your science and your career. This guide will help you adopt practices that make you less likely to introduce mistakes in your code, and more likely to catch them. Hopefully, this will let all of us write code we can trust more.
Good principles in scientific computing can help you write code that is easier to maintain, easier to reproduce, and easier to debug. But it can be difficult to find an introduction to get you started. The goal of this project is to provide reproducible documents you can use to get started on the most important points. You can use these lessons on your own, or as a group.
Who is this material for?
This material is aimed at people who have already interacted with a computer using a programming language, but want to adopt best practices that make their code more robust. It can also be used to facilitate the onboarding of new people in your lab or in your project.
Scientific computing can be very diverse, ranging from a few-step analysis of small data sets to simulations running for weeks on supercomputers. We focus on the most common situation, one that every scientist encounters at some stage of a research project: data analyses performed on a standard desktop computer. The general ideas and principles that we present carry over to other situations as well, but the concrete tools and methods may not be suitable for tasks requiring special hardware such as GPUs or supercomputers, or for projects requiring a significant software development effort.
We will use the Julia programming language, but you don’t need to know anything about it. We will keep the discussion very general, and not use any of the (very cool!) language-specific features and syntax. And do you want good news? You don’t need to install anything! You can run everything on JuliaBox, a cloud-based service to run reproducible documents as notebooks (if you want to read a short introduction to notebooks, we have written one here).
In fact, you will see that good practices for scientific computing have very little to do with tools and technical things; instead, they rely on thinking about programming in a slightly different way. You will be able to apply these principles to any language you prefer to use.
How to use this material
The best way to read this material is to keep a window with either JuliaBox or a terminal running Julia open, and type the code. It is tempting to copy and paste, but typing the code actually matters.
Snippets of code that are important are on a white background, like so:
[rand(i) for i in 1:5]
Bits of code of lower importance (pseudocode or code you are not meant to type), as well as the output of the important parts of code, are presented this way:
for each_element in vector
    function(each_element)
end
Throughout the lessons, we have added some asides – they are ranked in order of importance. “Opinions” are points we would like to raise for the reader’s consideration, and can be ignored. Example:
People who think it’s OK to criticize others based on their choice of language, OS, text editor, etc, should go home and think about what they did. They should not make themselves a cup of tea; tea is for good people.
“Warnings” are points that can be important, but not necessarily as a novice. It is worth keeping a mental note of them, especially in the long term. Example:
Any time you are about to comment on people’s choice of tools, ask yourself whether this is really necessary; the answer is usually “no”. The Good Tool is the one that works for its user.
“Dangers” are really important points that can prove especially dangerous or risky to everyone. They are worth reading a few times over. Example:
For real tho, this toxic behaviour is driving brilliant people away, and should never be tolerated. Disliking Windows has not made anyone edgy or cool since 1998.
There are a number of ways to contribute. Before you start, please have a look at our Code of Conduct. It boils down to being nice and respectful – no contribution, no matter how amazing it may be, justifies or excuses bad behaviour.
The first thing you can do is comment on Issues that have the “Request for feedback” label. They represent situations for which we are actively seeking community feedback, and anyone is always welcome to chime in.
If you are not sure about opening an issue, you can use the chatroom. We’ll be glad to have a more informal conversation with you!
If there is a more specific point you would like to raise, you can create a new Issue, and explain your idea, critique, or comment. And of course, you can always browse the current Issues, to get a sense of what is being discussed.
If you want to contribute more, then great! Have a look at the contribution guidelines first, to get you started with setting up a development environment. You can have a look at “Good first issues”, if you want some inspiration.
Comments, ideas, feedback: Hao Ye, Philipp Bayer, Tim Head, Ethan White
Lesson contents: Timothée Poisot
Other contributions: Konrad Hinsen
Want to read more?
In a rush? Yes you are. We suggest “Good enough practices in scientific computing” to get you started.
A little bit more time? We think “Best practices in scientific computing” might suit you.
Finally, there is a short Q&A at Nature Jobs about this project.