Whilst looking for code coverage utilities for Python, I discovered this
paper by Brian Marick. He makes
some interesting points, although (unsurprisingly) they are similar to those in
“The Art of Software Testing” by Myers.
- Line by line coverage is too simplistic, so you have to test for branches.
- When testing branches, you need to ensure that each component of the
conditional is tested, because short-circuit evaluation can skip some of them.
- Static code coverage can tell you only some things about code that is
present, due to the halting problem (this point came from another paper).
Dynamic code coverage can only describe code that gets run.
The example given was code that checks a function’s return value for
FATAL_ERROR and then either exits or continues. What is missing is that the
function can also return RECOVERABLE_ERROR, which requires some remedial action
before the program can continue. This is an “error by omission”.
A more sophisticated tool would determine the dependency between the function
and the code that checks its return values and check that all possible classes
of values are returned and checked.
<ObMissingThePoint>Of course, they should be using exceptions rather than
FATAL_ERROR return values ;-)</>.
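The omission described above can be sketched in Python. This is only an illustration under my own assumptions: the names Status, next_action and spec are hypothetical and do not come from Marick's paper, and the "recover" action is an assumed requirement. It shows two calls that exercise every line and both branch outcomes while the missing RECOVERABLE_ERROR case goes unnoticed.

```python
from enum import Enum


class Status(Enum):  # hypothetical return codes, named after the example above
    OK = 0
    RECOVERABLE_ERROR = 1
    FATAL_ERROR = 2


def next_action(status):
    """Decide what to do with a status; the RECOVERABLE_ERROR case is missing."""
    if status is Status.FATAL_ERROR:
        return "exit"
    return "continue"  # RECOVERABLE_ERROR silently ends up here too


# These two calls execute every line and both outcomes of the branch,
# so a dynamic coverage tool would have nothing left to report.
covering_results = [next_action(Status.FATAL_ERROR), next_action(Status.OK)]

# Only a check written from the specification exposes the omission.
# Assumption: the spec says RECOVERABLE_ERROR needs a "recover" step.
spec = {
    Status.OK: "continue",
    Status.RECOVERABLE_ERROR: "recover",
    Status.FATAL_ERROR: "exit",
}
omissions = [s for s in Status if next_action(s) != spec[s]]
print(omissions)
```

Iterating over every member of the enum is a crude stand-in for the "all possible classes of values" check that a more sophisticated tool might perform.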
- Don’t expect full coverage or think too much about whole program coverage.
It’s a misleading goal: testing may be clustered, with some modules heavily
tested and some overlooked (Marick uses the term “black holes”, crediting Rich
Full coverage doesn’t guarantee correctness anyway: conditions may be missing
and, even more importantly, side-effects mean that reordering operations may
give different results. Determining all valid reorderings is NP-complete (I
think I read this in one of the papers I’ve collected; I’ll verify this at some
point…) and then each of those permutations would have to be tested.
- It is tempting to treat messages from the coverage tool(s) as commands
(“make that statement evaluate true”) rather than hints (“you made some
mistakes somewhere around there”). Marick advises against using code coverage
in test design because the “return on your testing dollar (in terms of bugs
found) is too low”.
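To make the first two points in the list concrete, here is a minimal Python sketch of how short-circuit evaluation hides an untested component of a conditional. The probe helper and the can_deploy function are hypothetical names of my own, not something from the paper or from any coverage tool. The probe simply records which operands were actually evaluated.

```python
evaluated = []  # (operand name, value) pairs, in evaluation order


def probe(name, value):
    """Record that an operand was evaluated, then pass its value through."""
    evaluated.append((name, value))
    return value


def can_deploy(tests_pass, approved):
    """Hypothetical compound condition using short-circuit `or`."""
    if probe("tests_pass", tests_pass) or probe("approved", approved):
        return True
    return False


# Both outcomes of the `if` are taken, so branch coverage is complete...
can_deploy(True, False)   # short-circuits: `approved` is never evaluated
can_deploy(False, False)  # both operands evaluated, both False
branch_only = list(evaluated)

# ...yet `approved` has never been seen as True. Testing each component
# of the conditional needs one more case.
evaluated.clear()
can_deploy(False, True)
extra_case = list(evaluated)
```

After the first two calls, branch_only contains no ("approved", True) entry, which is the kind of gap that a branch-level report alone would not show.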
Despite these problems, Marick still finds code coverage useful: “I wouldn’t
have written four coverage tools if I didn’t think they’re helpful. But they’re
only helpful if they’re used to enhance thought, not replace it.”
EDIT: Uploaded old version of document, stupid caching.