All entries for Wednesday 28 November 2018
November 28, 2018
No, not the Microsoft spreadsheet program. Warwick RSE don't think Excel is evil, or that it makes errors more likely, but most of what we work on or talk about is beyond the point where Excel, Numbers or any other spreadsheet software is really efficient. Anyway, this post is going to be a little bit of programming and research philosophy, as well as some practical bits about how to make errors in your code and research less likely.
Tools of the Trade
First, a few easy rules to make serious errors a lot less likely.
Have some Standards
Most of the "Old Guard" programming languages, the ones you find in banking software or aeroplanes, have very strict standards dictating what is valid in the language. They are usually revised every 3-5 years, and controlled by some professional body, often ISO. A working compiler or interpreter must obey these standards or it is wrong. In practice, support for new versions of standards takes some time to develop, but this is usually well documented. Some major packages, such as MPI, also have standards. If your language (or library) has standards, follow them! If it doesn't have formal standards (e.g. Python), be as conservative as you can in what you assume. For instance, try the calculation 2/3 in Python 2 versus Python 3.
Proper Planning and Preparation Prevents Piss Poor Performance. Supposedly a British Army adage, and one of my Mum's favorites. Plan before starting! Work out what you're going to do. Prepare - read up on any new techniques, do some "Hello World" examples with any new packages etc. Check all the bits of your plan can work in practice - can they be fast enough? Is your chosen language/package a good choice? Can you handle the amounts of data/computation/other needed?
Borrow your Wheels
There is a crucial balance between reinventing the wheel everytime you write some code on the one hand, and importing a package for every non trivial task on the other. Neither of these are a good idea, and both tend to lead to more and worse errors. Borrow widely (with proper attribution), but don't code yourself into a morass of dependencies that you can never hope to test. Especially don't borrow from things that are no better tested or verified than your own code - this is a recipe for disaster.
It's very easy to assume some feature or limitation of a code snippet is "obvious" when you're writing it, only tocompletely forget about it when you come back. The only way to avoid this is to document. Do this a few ways.
Restrictions on a code snippet, that don't impact anything else, so don't fit either of the next two paragraphs, should be commented as near the line as possible. Try and make it so that the comment still makes sense even if the code changes a bit - e.g. don't use line numbers, don't use "the following line" etc.
Restrictions on a function (parameter x must be +ve etc) are best done using one of the documentation libraries, so the restrictions are in comments near the function code.
Restrictions that run beyond a single function (don't use this code for more than 1 million elements at a time!, NaNs must be removed from data before input) should be mentioned both on the functions where they arise and in your examples and any user docs. User docs are vital if you're sharing you code with other people.
Check yourself before you Wreck yourself
Sometimes, you have known preconditions (things which must be true when a function is called), invariants (things which should stay the same throughout), and postconditions (things which are true after a function ends). You should consider adding code to check (some of) these things. Sometimes a compiler can do some of this for you, such as Fortran's array bounds checking, or C++'s ".at()" to get a vector element, rather than using "". Since these can be costly in run time, they may be better as a debugging option rather than a normal part of the code.
Test your Limits
The last essential element is testing. This is a very broad topic, and really isn't easy. At its core though, it means working out what parts of the code should do and checking that they do it. Smaller parts are easier to handle, but (remember your time is precious) can mean writing lots of trivial tests where you could easily absorb them all into one test. If it is at all possible, have at least one test case of the entire code - something you can run and know the answer to, that isn't too trivial. We talk a little about testing in our workshopsand there are millions of books, blog posts, papers etc on how to do it. Just make sure you do something. The rest of this post should convince you why.
The Ideal World
In an ideal world, any code you write has a well thought out, complete specification. Its purpose is fixed and perfectly known; there's a wealth of known results to compare to; the computations are all nice and self-contained; nobody else is really working on it, so there's no rush; and yet you can still do something novel and interesting with it. In this case, avoiding errors is as easy as it ever gets. You split the code into self-contained modules, each doing one thing and doing it well. You write careful error checking into every function, making sure they are never called with invalid parameters. You test each function of every module carefully in isolation and as you put the modules together, you test at various stages. Finally, you run some problems with known results and verify that you get the correct answer. As you write, you document all the functions and the assumptions they make. Finally, you write some user documentation describing what the code is and does, what its limitations are.
That's all great in theory, but for it to be enough it really is essential that everythingof importance be in the specs. I am not sure that has ever happened, anywhere, ever. It gets close in things like Aerospace and Medical engineering, but even there mistakes are made.
Code or Computation
The second best situation, from a coding point of view, is the one where the Code is trivial, but the Computations hard. For instance, a simple program to count which words occur near to some target word is fairly easy to write, although far form trivial (capitalisation, punctuation, hyphenations etc), but text concordance analysis is widely used and is able to find new, novel results if correctly analysed. Or imagine coding up a very sophisticated algorithm, with minimal support code. From a code perspective this is also quite easy - you just make sure the code contains the right equations (and hope that you wrote them down correctly).
Of course, in these case you still do all the things from the ideal-world - you modularise the code, you test it carefully, you document the assumptions. I'm not going to discuss all the details of this here: we cover a lot of it in our training, especially here. You do all this, and you'll catch some errors, but it probably wont be all of them.
Sadly though, with many pieces of code you may write, you don't even have this level of "simplicity". The code is hard - it uses difficult techniques, or has many interacting pieces, or there aren't any good test cases you can use. Testing, verification and documentation as important now, if not more so. But you wont find all the potential errors.
What's the Point?
But at this point, it seems a bit pointless. You make all this effort, spend all this time on testing and verification, and its still probably got bugs in. Your code still probably has errors somewhere no matter what you do.
Well firstly, and most practically, the more you check and test, the smaller your errors are hopefully getting. You catch any really big glaring errors, you fix lots of things, and you absolutely have a better piece of code. The time you're spending is never wasted, although its important to try and direct it to the places you get the most return. Keep the 80-20 rule in mind. You get 80% of the gains from the first 20% of the effort in many endeavours, but getting the final 20% takes the remaining 80% of the time. This doesn't mean you should fix 20% of the bugs though! It means you finish all the easy 80s before moving on to the more finicky checks.
Secondly, more philosophically, you will always know that you did it. You can point to the time you spent and say "I tried. I did the things I was supposed to. I wont make this mistake again." That is a very nice thing to be able to say when you find you've made a mistake that's going to cost you time and/or effort.
So, if you've published a paper, you've probably published a mistake. Hopefully, this isn't a important msitakes, just some minor grammar, a misprinted formula, or a miswritten number that doesn't affect results. It's pretty important to keep perspective. Everybody makes these mistakes. The important thing is not to make them because you didn't do things how you know you should.
I like to read Retraction Watchon occasion. Much of what they publish is serious misconduct from serial bad actors, but sometimes there's an interesting post about honest mistakes, and interesting commentary on how it's meant to work. For instance, when should a paper be retracted rather than amended? How do you distinguish stock phrases from plagiarism? What's so bad about salami slicing?
I particularly like things like this retraction, where an error was found that needs more work to address, so a correction wasn't suitable; or this one where the error meant the result wasn't interesting. The errors probably happen a lot more often than the retractions or corrections. That's not really in scope of a blog post, but it's good to see the process working well.
To summarise briefly, everybody makes mistakes. If you're lucky yours will be minor, caught before they go "into the wild", and make funny anecdotes rather than tragic tales. The worst thing you can do, as a researcher, is try to hide your errors. You should fix them. Whether this can be done in a follow-on paper, or needs a correction or retraction, it should be done!
Errors in code, rather than papers, are a tiny bit different - they're probably more likely to be harmless. But don't be fooled - if the error invokes "undefined behaviour" (anything outside of, or not conforming to, the standard) you will never know whether the result was affected unless you redo everything. It's very tempting to run a few test cases, find they're OK and assume all the others are fine too. The definition of undefined behaviour means you can't really know - its perfectly valid to work 99 times and fail the 100th. If you find any, you have to fix it and redo all results. Hopefully they're unchanged. That's great. You fix the code, you upload the fixed version - admitting the error and the fix, and you add the error to your list of "things to be careful of". You don't need to do anything about your results - they're still solid.
If results are affected, talk to your supervisor, or PI, or trusted colleagues and work out what's an appropriate solution. Make sure you've learned from this and you're not going to make the same mistake again. The former CEO of IBM, Thomas John Watson Snr said it best:
Recently, I was asked if I was going to fire an employee who made a mistake that cost the company $600,000. “No”, I replied. “I just spent $600,000 training him – why would I want somebody to hire his experience?”
Don't be scared of bugs. They happen and they always will. But the more you find, the better you develop your "sixth sense" and can almost smell where they might be, and the better your code and the research you do with it, will become.