All entries for March 2022

March 18, 2022

Getting a little testy

Testing code is essential to knowing if it works. But how do you know what to test? How do you know you've done enough?

Let's be clear to start here, "testing" as we think of it is some form of comparing a code answer to a predicted one, making sure the code "gets it right" or meets expectations. If there is a canonical "right answer" then it's fairly easy to do - if not then things get difficult, if not impossible, but correctness is the goal.

What to test? Well, all of it, of course! We want to know the entire program works, from each function, to the entire chain. We want to know it all hangs together correctly and does what it should.

So when have you done enough testing to be sure? Almost certainly never. In a non trivial program there is almost always more you could test. Anything which takes a user input has almost infinite possibilities. Anything which can be asked to repeat a task for as long as you like has literally infinite possibilities. Can we test them all? Of course not!

Aside - Testing Everything

Why did we say user input is only almost infinite? Well, all numbers in the computer have a fixed number of bits for storage, which means there's a strictly fixed number of numbers. Technically, for a lot of problems, we really can test absolutely every input. In practice, we can't afford the time and anyway, how do we know what the right answer is without solving the problem completely.

Picking random things to check is an idea that's used sometimes, as are "fuzzers", which aim to try a wide range of correct and incorrect inputs to find errors. But these are also costly to perform, and still require us to know the right answer to be really useful. They can find crashes and other "always wrong" behaviour though.

Lastly, in lots of codes, we aren't expected to protect users from themselves, so entering an obviously silly value (like a person's height of 15 ft) needn't give a sensible answer. We can fall back to "Garbage In Garbage Out" to excuse our errors. But our users might be a lot happier if we don't, and in important circumstances we wont be allowed to either.

Back to the Grind

That all sounds rather dreary - writing good tests is hard, and we're saying they're never enough, so why bother? Well, lets back off for a minute and think about this. How do we turn an "infinite" space into a tractable one? Well, we have to make it smaller. We have to impose restrictions, and we have to break links and make more things independent. The smaller the space we need to test, the more completely we can cover it.

Unit testing

Most people have probably heard of unit testing by now - testing individual functions in isolation. It seems like if you do this then you have tested everything, but this is not true. What happens if you call this function followed by that one (interdependency)? What happens if you call this function before you did this other thing (violation of preconditions)?

Unit testing is not the solution! Unit testing is the GOAL!

If we could reliably say that all of the units working means the program works, then we can completely test our program, which is amazing! But in reality, we can't decouple all the bits to that extent, because our program is a chain of actions - they will always be coupled because the next action occurs in the arena set up by the preceeding ones. But we have to try!

Meeting the Goal

So we have to design, architect and write our programs to decouple as much as possible, otherwise we can't understand them, struggle to reason about them, and are forced to try and test so much we will certainly fail. A lot of advice is given with this sort of "decoupling" in mind - making sure the ways in which one part of a program affects another are:

as few as possible
as obvious as possible
as thin/minimal as possible

What sorts of things do we do?

Avoid global variables as much as possible as anything which touches them is implicitly coupled

C++ statics are a form of global, as are public Fortran module variables (in most contexts)

Similarly, restrict the scope of everything to be as small as we can to reduce coupling of things between and inside our functions
- C++ namespaces, Fortran private module variables, Python "private" variables (starting with an underscore, although not enforced in any way by the language)
Avoid side-effects from functions where possible, as these produce coupling. Certainly avoid unexpected side effects (a "getter" function should not make changes)
- Fortran offers PURE functions explicitly. C++ constexpr is a more complicated, but related idea
Use language features to limit/indicate where things can change. Flag fixed things as constant.
- Use PARAMETER, const etc freely
- In Fortran, use INTENT, in C/C++ make function arguments const etc
- In C++ try for const correctness, and make member functions const where possible. Use const references for things you only want to read
Reduce the "branchiness" (cyclomatic complexity) of code. More paths means more to test.
Keep "re-entrancy" in mind - a function which depends only on its arguments, doesn't change it's arguments, and has no side effects can be called over and over and always give the same answer. It can be interrupted/stopped and started afresh and still give the same answer. This is the ultimate unit-test friendly function!

Overall, we are trying to break our code into separate parts, making the "seams"between parts small or narrow. If we drew a graph of the things in our code and denote links between them with lines, we want blobs connected with few lines wherever possible.

Every link is something we can't check via unit testing, and unit testing is king because it is possible to see that we have checked every function, although we can never see if we have checked every possible behaviour within the function, even with code coverage tools.

Aside - Writing useful unit tests

Even if everything is unit-testing amenable, it doesn't make it actually easy. There's a few things to keep in mind when writing the tests too.

Poke at the edge-cases. If an argument of '3' works, then '6' probably will too, but will '0'? What about a very large number? Even (a+b)/2 for an average will break down if a + b overflows! A negative number? Look for the edges where behaviour might change, and test extra hard there
If your function is known only to work for certain inputs, make them preconditions (things which must be true). This can be done either in documentation, or using things like assert
Check for consistency. If you mutate an object, check it is still a valid one at the end
Try to come up with things you didn't think about when you wrote the function, or that no "sensible caller" would do
Check that things fail! If you have error states or exceptions, make sure these do occur under error conditions

Dealing with the Rest

If we could actually produce a piece of code with no globals, no side-effects and every function fully re-entrant, unit tests would be sufficient to check everything. Sadly, this is impossible. Printing to screen is a side-effect. All useful code has some side effects somewhere. Mutating an argument makes a function non-reentrant (we'd have to make a copy and return the new one instead, and that has costs). So we seem doomed to fail.

But that is OK. We said unit-testing is a goal, something we're trying to make as useful as possible. We can do other things to make up for our shortfalls, and we definitely should, even if we think we got everything. Remember, all of those bullets are about minimising the places we violate them, minimising the chances of emergent things happening when we plug functions together.

We need to check actual paths through our entire code (integration testing). We need to check that things we fixed previously still work (regression testing). We probably want to run some "real" scenarios, or get a colleague to do so (sorta beta testing). These are hard, and we might mostly do them by just running our program with our sceptical hats on and watching for any suspicious results. This is not a bad start, as long as we keep in mind its limitations.

Takeaway

What's the takeaway point here? Unit Testing works well on code that is written with the above points in mind. Those points also make our code easier to understand and reason about, meaning we're much less likely to make mistakes. Honestly, sometimes writing the code to be testable has already gained us so much that the tests themselves are only proving what we already know. Code where we're likely to have written bugs is unlikely to fail our unit tests - the errors will run deeper than that.

So don't get caught up in trying to jam tests into your existing code if that proves difficult. You won't gain nearly as much as by rewriting it to be test friendly, and then you almost get your tests for free. And if you can't do that, sadly unit tests might not get you much more than a false sense of security anyway.

Heather Ratcliffe : 18 Mar 2022 13:26 | Tags: Philosophy Programming Testing | Comments (0) | Close comments | Report a problem

March 07, 2022

What's the time right now?

This is not one of my mistakes, this is one I read about second hand (you'll see the pun in a moment), but it is a great illustration of the importance of writing the code you actually mean to, thinking properly about what it is that you mean, and watching out for weasel words that might confuse you.

Consider these two lines of code and decide for youself whether they have the same effect:

if (DateTime.Now.Second == 0 && DateTime.Now.Minute == 0)

if (DateTime.Now.Minute == 0 && DateTime.Now.Second == 0)

Decided yet? If you think they are the same, take a second now (that's a hint!) to think about the meaning of each part, and what its value will be. Suppose the original author intended to run some process once per hour, so put one of these into a loop. Can you see why one of these might succeed, and the other fail occasionally (assume the rest of the code takes some tiny fraction of a second)?

By the way, if you're worried about not knowing what language this is in, don't be. This is a perfect example of good "self documenting code" which we've written a bit about before. Based on the names, we can infer (correctly) that DateTime is some sort of thing, which can supply the date-and-time Now through the "DateTime.Now" construct, and that we can then look at the Minutes and Seconds value of the time Now. That's all we need to know.

So, back to the question. Are the lines the same? Well, the order is different. There's a double-& AND operation. We don't strictly know whether the left or right condition will be checked first (it might depend on language), and we don't actually know if both will be (if you're not sure why that is, we've talked about short circuiting in logical operations here). But that shouldn't matter unless the DateTime object is having some weird side-effects and I promise it isn't. The problem is much more fundamental than that.

If you haven't spotted it, it's time to write down very clearly what we wanted to check. We want to know if, right now, the value of both the minutes and the seconds entries are 0. It looks like that's what either line does, but there's a nasty weasly word sneaking in here - the word "now". What can "now" actually mean in programming terms, where we know operations ultimately are carried out one by one in sequence? For instance, what would you expect the following code could print?

if (DateTime.Now.Second == 0) print(DateTime.Now.Second)

If you said 0 only, hold on a second (that's a hint, too) and look again, and think about what this code will actually do in terms of basic operations. It will get a value for DateTime.Now.Second. It will compare this to 0. If the comparison is true it will get a value for DateTime.Now.Second, and it will print this value. And there is the problem! The value we check and the value we print are not always the same.

If it still isn't clear to you why the original two lines are not equivalent, reread that last paragraph a few times and apply the same idea to the original question. If it's been obvious to you all along, great. You're unlikely to make the mistake this original programmer did. But why did they make this mistake? There could be several reasons:

They might simply not have spotted there could be a problem - this is pretty likely, but not very interesting
They might have thought somehow the compiler/runtime would recognise the values as "the same" - this is a pretty fundamental mistake, perhaps due to thinking of "Now" as a property rather than an operation. In some very imprecise sense, we should think of "Now" as having a side-effect on itself because it takes time, so multiple calls will not have the same result as a single one
They might have blindly removed a temporary variable containing DateTime.Now, without thinking deeply about the consequences. This is quite likely because it looks so simple, and is why refactoring like this needs, if anything, more attention than writing the code in the first place
Lastly, they might have thought carefully and precisely, but ultimately wrongly about it. If they confused the idea of seconds as a time point with seconds as a time interval, they may have thought that both the Seconds and the Minutes property would surely be evaluated within the same second, and thus work as expected, where actually even a microsecond between them could be the difference between 1:59.999 and 2:00.000

So, that explains why the lines don't work as intended, but there's one weird thing left - how come one of them (the first one, if evaluation occurs left to right) works, but the other doesn't? Effectively, the minutes value depends on the seconds value - the minute is incremented only when the seconds reads 59. We know the whole operation takes far far less than 1 second (ignoring interrupts or anything else suspending operation), so once we have confirmed the seconds value as 0, we have at least 59-60 seconds to check the minutes value, during which it cannot change. Certainly this will occur within that interval, and the code will work perfectly (again, ignoring external iterrupts, which our code can never account for anyway).

So, does that mean the first option is actually safe, or good code? Well, not without a comment it certainly isn't! If we're relying on this sort of subtlety, somebody is going to be tripped up by it, and it might even be us. It's also a bit delicate - suppose we were to need to use DateTime.Now in another part of the condition, or in the body? We'd have to think very hard about whether we're still "safe" and we simply don't need to. A temporary variable is much clearer, and the possible optimisation here is a) tiny and b) best left to our compiler anyway. Oh and c) possibly a de-optimisation anyway, depending on the cost of the DateTime.Now lookup. Also, it'd be easy to forget that the left-to-right ordering is important, and we might try this in a situation or language where that is not guaranteed.

Ultimately, I simply don't like relying on such a subtlety. It's too clever and it doesn't need to be. It's not that I don't like clever code, because I do. I like clever code that makes one exclaim "oh yes, that's so simple, it's brilliant", and this aint that. This is more "oh blimey, is that how it works". Keep it simple, and if you can't keep it simple, at least make it satisfying.

Postscript

Before we go, I hope something has been nagging at you throughout this post. I started off by saying we should think hard about what we mean and what we want to actually do. I introduced this idea of minute-0 and second-0 and then proposed it as a solution to doing something once per hour. Did it strike you as a pretty terrible solution to that problem? Because it is. Suppose we start running our code at 9:00:30 AM. Would we expect whatever this block controls to wait an entire hour to fire for the first time? What if the code is never running at the change of the hour? This block would never run. Suppose we start 1000 copies of this code. Do we really want all 1000 to execute this block at the same time? Suppose some other code took longer than expected and this block isn't reached once per second - some hours might be simply skipped. In nearly all cases we'd be better checking whether at least an hour has elapsed since last we did the thing, and we'd avoid all of these problems. We meant once per hour, not on the first second of every hour, and we should have coded that, not this.

This post is inspired by an article I read a while ago, namely https://thedailywtf.com/articles/an-hourly-rate In particular, a comment mentions the different behaviour of the two possible lines, which this post focuses on.

Heather Ratcliffe : 07 Mar 2022 16:11 | Tags: Bugs Code Programming | Comments (0) | Close comments | Report a problem

March 02, 2022

Learning to Program

We (Warwick RSE) love quotes, and we love analogies. We do always caution against letting your analogies leak, by supposing properties of the model will necessarily apply to reality, but nontheless, a carefully chosen story can illustrate a concept like nothing else. This post discusses some of my favorite analogies for the process of learning to program - which lasts one's entire programming life, by the way. We're never done learning.

Jigsaw Puzzles

Learning to program is a lot like trying to complete a jigsaw puzzle without the picture on the box. At first, you have a blank table and a heap of pieces. Perhaps some of the pieces have snippets you can understand already - writing perhaps, or small pictures. You know what these depict, but not where they might fit overall. You start by gathering pieces that somehow fit together - edges perhaps, or writing, or small amounts of distinctive colours. You build up little blocks (build understanding of isolated concepts) and start to fit them together. As the picture builds up, new pieces fit in far more easily - you use the existing picture to help you understand what they mean. Of course, the programming puzzle is never finished, which is where this analogy starts to leak.

Jigsaw image

I particularly like this one for two reasons. Firstly, one really does tend to build little isolated mini-pictures, and move them around the table until they start to connect together, both in jigsaws and in programming. Secondly, occasionally, you have these little moments - a piece that seemed to be just some weird lines fits into place and you say "oh is THAT what that is!". One gets the same thing when programming - "oh, is that how that works!" or "is that what that means".

Personally, a lot of these moments then become "Why didn't anybody SAY that!". This was the motivation behind our blog series of "Well I Never Knew That" - all those little things that make so much sense once you know.

The other motivation for the WINKT series is better illustrated using a different analogy - hence why we like to have plenty on hand.

A Walk in the Woods

Another way to look at the learning process, is as a process of exploring some new terrain, building your mental map of it. At first, you mostly stick to obvious paths, slowly venturing further and further. You find landmarks and add them to your map. Some paths turn out to link up to other paths, and you start to assemble a true connected picture of how things fit together. Yet still there is terrain just off the paths you haven't gone into. There could be anything out there, just outside your view. Even as you venture further and deeper into the woods, you must also make sure to look around you, and truly understand the areas you've already walked through, and how they connect together.

My badly done keynote picture is below. The paths are thin lines, the red fuzzy marks show those we have explored and the grey shading shows the areas we've actually been into and now "understand". Eventually we notice that the two paths in the middle are joined - a nice "Aha" moment of clarity ensues. But much of our map remains a mystery.
Schematic map with coverage marked

And that is one of the reasons I really like this analogy - all of the things you still don't know - all that unexplored terrain. This too is very like the learning process - you are able to use things once you have walked their paths, even though there may be details about them you don't understand. On the one hand, this is very powerful. On the other, it can lead to "magic incantations" - things you can repeat without actually understanding. This is inevitable in a task this complex, but it is important to go back and understand eventually. Don't let your map get too thin and don't always stick to the paths, or when you do step off them, you'll be lost.

This was the other main motivation behind our WINKT series - the moments when things you do without thinking suddenly make sense, and a new chunk of terrain gets filled in. Like approaching the same bit of terrain from a new angle, or discovering that two paths join, you gain a new perspective, and ideally, this new perspective is simpler than before. Rather than two paths, you see there is only one. Rather than two mysterious houses deep in the woods, you see there is one, seen from two angles.

Take Away Point

If you take one thing away from this post, make it this: the more you work on fitting the puzzle pieces together, the clearer things become. Rather than brown and green and blue and a pile of mysterious shapes, eventually you just have a horse.

And one final thing, if we may stretch the jigsaw analogy a little further: just because a piece seems to fit, doesn't mean it's always the right piece...

Heather Ratcliffe : 02 Mar 2022 12:52 | Tags: Code Philosophy Programming Winkt | Comments (0) | Close comments | Report a problem