September 10, 2006
Quick list of implementation details for my project:
Revision control system: Bazaar-ng
Language: Python 2.5. Final release should be very soon (i.e. next few days).
I like the language and library improvements over Python 2.4, details here.
The only drawback is availability: Ubuntu 6.10 (Edgy Eft) and Debian 4.0 (Etch) will
probably be the first distributions to ship with support for 2.5.
Initial target platform: Vim 7.0. It’s what my desktop runs, and it reportedly has
better features for writing tools to assist the programmer. Emacs support
would come via Pymacs.
I will need to compile both Python 2.5 and Vim 7.0 for use in DCS as RHEL4 ships
with Python 2.3 (although Python 2.4 is locally available) and Vim 6.x.
Bzr will have to be installed into my account’s Python install as well.
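The per-account installs would follow the usual configure-to-a-prefix pattern; a rough sketch (the prefix path is an assumption, not my actual setup):

```shell
# Hypothetical sketch: install Python 2.5 into a home-directory prefix.
# Run from the unpacked Python 2.5 source tree; Vim 7.0 builds the same way.
./configure --prefix="$HOME/local"
make && make install

# Put the local install ahead of the system-wide Python 2.3/2.4:
export PATH="$HOME/local/bin:$PATH"
```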
I shall be using the py.test tool/library for project-wide testing (and probably some
other features from that library).
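For the unfamiliar: py.test collects plain functions named `test_*` and uses bare `assert` statements, no boilerplate test classes required. A minimal sketch of such a test module (the `word_count` function is a made-up stand-in, not part of my project):

```python
# Hypothetical project function under test.
def word_count(text):
    """Count whitespace-separated words in a string."""
    return len(text.split())

# py.test discovers and runs any function named test_*; failures are
# reported from the bare assert, with no unittest scaffolding needed.
def test_word_count_empty():
    assert word_count("") == 0

def test_word_count_simple():
    assert word_count("hello world") == 2

if __name__ == "__main__":
    # The same tests can be run directly, without py.test installed:
    test_word_count_empty()
    test_word_count_simple()
    print("ok")
```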
Whilst looking for code coverage utilities for Python, I discovered this
paper by Brian Marick. He makes
some interesting points, although (unsurprisingly) they are similar to what’s in “The Art
of Software Testing” by Myers.
- Line-by-line coverage is too simplistic, so you have to test for branches.
- When testing branches, you need to ensure that each component of the
conditional is tested (because of short-circuit evaluation).
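To illustrate the short-circuit point with a contrived example of my own (not from the paper): a single test can execute every *line*, including the `if` line, while one component of the conditional is never evaluated at all.

```python
# Contrived example: with short-circuit evaluation, full line coverage
# can be reached without ever evaluating the second operand.

calls = {"b": 0}  # track whether b() ever runs

def a():
    return True

def b():
    calls["b"] += 1
    return True

def guarded():
    if a() or b():  # b() is skipped whenever a() returns True
        return "taken"
    return "not taken"

result = guarded()
# The `if` line counts as covered, yet b() was never exercised:
assert result == "taken"
assert calls["b"] == 0
```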
- Static code coverage can tell you only some things about code that is
present, due to the halting problem (this point came from another paper).
Dynamic code coverage can only describe code that gets run.
The example given was checking a function’s return value for FATAL_ERROR and
exiting or continuing. What is missing is that the function can also return
RECOVERABLE_ERROR which requires some remedial action before the program can
continue. This is an “error by omission”.
A more sophisticated tool would determine the dependency between the function
and the code that checks its return values and check that all possible classes
of values are returned and checked.
<ObMissingThePoint>Of course, they should be using exceptions rather than
FATAL_ERROR return values ;-)</ObMissingThePoint>.
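A rough Python sketch of that error by omission (the constant names and helper are my own invention, not Marick’s code): both branches of the existing `if` get exercised, so coverage reads 100%, yet the missing `RECOVERABLE_ERROR` case silently falls through.

```python
# Hypothetical status constants standing in for the paper's example.
OK, RECOVERABLE_ERROR, FATAL_ERROR = range(3)

def do_work(status):
    # Stand-in for a function that can return any of the three statuses.
    return status

def process(status):
    result = do_work(status)
    if result == FATAL_ERROR:
        return "exit"
    # Missing: a RECOVERABLE_ERROR branch performing remedial action.
    return "continue"

# Both branches are covered, so a coverage tool is satisfied...
assert process(FATAL_ERROR) == "exit"
assert process(OK) == "continue"
# ...but the recoverable case wrongly takes the plain "continue" path:
assert process(RECOVERABLE_ERROR) == "continue"
```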
- Don’t expect full coverage or think too much about whole-program coverage.
It’s a misleading goal: testing may be clustered, with some modules heavily
tested and some overlooked (Marick uses the term “black holes”, crediting Rich).
Full coverage doesn’t guarantee correctness anyway: missing conditions and, even
more importantly, side-effects mean that reordering operations may give different
results. Determining all valid reorderings is NP-complete (I think I read this
in one of the papers I’ve collected; I’ll verify it at some point…) and
then each of those permutations would have to be tested.
- There exists a temptation to treat messages from the coverage tool(s) as
commands (“make that statement evaluate true”) rather than hints (“you made
some mistakes somewhere around there”). Marick advises against using code
coverage in test design as the “return on your testing dollar (in terms of bugs
found) is too low”.
Despite these problems, Marick still finds code coverage useful: “I wouldn’t
have written four coverage tools if I didn’t think they’re helpful. But they’re
only helpful if they’re used to enhance thought, not replace it.”
EDIT: Uploaded old version of document, stupid caching.