*Ieee*

#
All 1 entries tagged

View all 2 entries tagged *Ieee* on Warwick Blogs | View entries tagged *Ieee* at Technorati | There are no images tagged *Ieee* on this blog

## November 14, 2018

### Now with Less NaN!

This week I'm at Supercomputing 2018 in Dallas, enjoying a wide variety of talks! Just a short post on a couple of the "Aha" moments from this talkon correctness and reproducibility of Floating point code by James Demmel (University of California, Berkeley). I had never noticed the perils of NaN and Inf propagation he mentioned and I think they're really worth knowing. I said a bit about IEEE 754which dictates what NaN and Inf are and how they behave in this post. This entry is mostly about unexpected ways the standard can be broken by accident, mainly due to optimisations that rely on multiplying by zero. Per the standard, anything **except NaN, **times by zero is zero, but **any operation involving a NaN**must give NaN.

## Max Value the Obvious Way

Imagine you're asked to write a routine to find the Max or Min value of an array. It's obvious. Start by assuming it's the first value, and then compare every other value to see if it's higher/lower, until you reach the end. Max/Min MUST look at every value in the array to get their answer (I am sure there's a mathematical word for that, but I don't know it), so this is the most efficient option there is in terms of comparisons. However, it has a major flaw once we account for IEEE behaviour: NaN compares as false to *everything*, even itself.

So, imagine running this algorithm on the arrays [NaN, 2, 1] and [2, NaN, 1]. We get NaN and 2 respectively, as we've made the first element "sticky"! Step-by-step:

Max of [NaN, 2, 1] : Take first element : NaN Compare to second element: 2 > NaN -> False : Max = NaN Compare to third element: 1 > NaN -> False : Max = NaN

RESULT: NaN

Max of [2, NaN, 1] : Take first element: 2 Compare to second element: NaN > 2 -> False: Max = 2 Compare to third element: 1 > 2 -> False: Max = 2

RESULT: 2

This was an example given in the talk, and it's obvious when you look at it, but easy to not think of! There's two serious problems here: firstly the loss of the NaN value in the second case, which violates the standard and secondly the change of answer on array permutation, which is just generally bad for something like Max which should be invariant.

## 3x3 Determinant

Now imagine taking the determinant of a 3x3 Matrix. Generally, this is easy. You take each element of the first row, and multiply by the determinant of the 2x2 sub-matrix omitting the column at hand, with appropriate signs. Suppose a first row element is zero: obviously that term contributes 0 regardless of the 2x2 submatrix! So you can speed things up by testing for zero and omitting any such terms, right? Not once NaN gets involved. Now consider this matrix:

1 0 0 NaN 1 1 0 0 1

We **should** get: 1 * 1 + 0 * NaN + 0 * NaN = NaN. But with our clever optimisation, we get 1 * 1 + 0 + 0 = 1. This is a pretty big problem! We're hiding away that NaN, by accident!

There's another problem too! Even if we checked for NaN in our array, what if we have an Inf instead? Inf * 0 should give us NaN. Effectively, Inf is "a number, but bigger than we can represent". Zero is "anything smaller than the smallest number we can represent". The product is then kind-of "any number, we don't know which". It might be zero, it might be Inf, it might be anything. So, the standard dictates it be NaN.

So, if we use the optimisation, we're not just hiding a NaN, we're stopping one ever being created! This is potentially an much more serious problem, because we might be happy using the Compiler flag to give us errors on NaN, but not willing to make under or overflows fatal. Then we'd completely miss this problem!

## Optimisations in BLAS

The talk (at least it's first half) focused on the Linear Algebra library BLAS, which is heavily optimised, but as it turns out does have several bugs like this. For instance, a slightly complicated multiply op, alpha * A * B, might check for alpha = 0 in order to save the work of the matrix multiply when the end result should be zero anyway. Unless, of course, either A or B contains any NaN's when we must get back NaN.

## But it's all gone wrong anyway...

One comment on the talk, which is worth a mention was, to paraphrase "If you're getting NaN's things have already gone wrong, so why should the Linear Algebra library bother to check? You should be checking." This is an interesting point and in some circumstances holds true. I tend to think of NaN or Inf as bugs and would prefer to write code so they can't arise, but I am not working with long linear algebra workflows where you want to run a series of operations in something like BLAS and don't examine the intermediate steps.

## Reproducible BLAS

The problem above was largely the topic of the first half of the talk, which focused on how BLAS could manage the problem via errors/exceptions, and being careful with optimisations. For instance, use the optimisations to do the calculations, but also return an error code detailling any issues with the arguments.

Another large part of this talk was about reproducibility, in particular in parallel codes. Here, depending on the processor decomposition, you can get subtly different results for a sum over numbers of varying size. This is a really tricky problem, which is being addressed by things like ReproBLAS.

## Punchline

Dealing with Floating point numbers is hard. It gets harder once you demand reproducibility at a bitwise level. Both the Maxval issue above, and a simple sum with values of varying size may not give the same answers when terms/entries are reordered. In particular, parallel codes always effectively reorder terms whenever you chance the decomposition (e.g. run on more or less cores).

Hopefully, you never need to really worry about NaN (or the less tricky, but still tricky Inf), but do be careful! The problems I've discussed are mainly due to treating zero incorrectly per the IEEE standard, the problems with NaN or Inf following from this. **Be careful with zero!** It might not be as small as you think ;)