I came across a lovely bug lately. Integer arithmetic, especially in C and C++ it seems, is error-prone. In addition to the risk of having the wrong expressions altogether (a logic error, one could say), integer arithmetic is subject to a number of pitfalls, some I have already discussed here, here, and here. This week, I discuss yet another occasion for error using integer arithmetic.

Consider this piece of code, one that you have seen many times probably, at least as a variation on the theme:

int cost=INT_MAX; /* here goes code that may or mayn't assign cost */ int adjusted_cost=cost+extra_cost; if (cost+extra_cost<best_cost) best_cost=adjusted_cost;

Can you see what’s the problem with this code (besides the fact that it may or mayn’t assign `cost`)? Think it over before continuing.

*

* *

The problem is that since `cost` is `int` and is initialized to `INT_MAX` adding anything (other than zero) to it will cause to overflow, and wrap around, which means that `cost+extra_cost` is now negative, and is therefore (very possibly) smaller than `best_cost`. The exact behavior is implementation-specific, but it will break the algorithm.

There are basically tree solutions to this kind of problem. The first is to test whether or not the value is `INT_MAX`. The condition becomes something like:

if ( (cost!=INT_MAX) && ( cost+extra_cost<best_cost)) { ...

(You may have noticed that this version isn’t much safer. If `cost` is `INT_MAX-1` and `extra_cost` is 10, then it’s still broken. If fact, it is complicated to get arbitrary expressions like this one to behave correctly in all cases, that is, regardless of the of the values involved.)

The second solution is to replace int by an explicitly larger type but to use the same constant:

#include <stdint.h> ... int64_t cost=INT32_MAX;

Now `cost+extra_cost` cannot overflow/wrap-around, and the code now behaves the way it was intended, but at the cost of manipulating larger integers (which can have implementation-specific performance issues).

The third solution requires analyzing your algorithm and knowing what are the values you operate on. What is the largest possible value for `cost` for this algorithm? Maybe it’s 100, maybe it’s 1 000 000, or maybe it’s 200 000 000? Maybe with `extra_cost` always less than 1024 and `cost` at 1 000 000 will provide safe arithmetic in all cases. Adjusting the maximal cost to a reasonable maximum that is far from `INT_MAX` is a variation on the theme of using `uint64_t` instead of only `int` except that it uses `int` (chosen as to be a machine-efficient size) and it provides an occasion to understand *and* document your algorithm.

*

* *

The basic fact is that getting integer arithmetic exactly right *all the time* given two’s complement, modular arithmetic (and thus finite integer sizes) is really complicated. Not always, but I invite the reader to consider again the second solution above, or even the original expression. Can you write a test that checks both for overflow and test the condition correctly? How many special case will it have? What if I change `int` to `unsigned`? will it still work?

If both values are `unsigned` (`short`, `int`), testing for `(a+b)<max(a,b)` correctly detects overflows for all possible pairs of `a` and `b`. If the values are `signed`, it fails on many negative pairs of `a` and `b`, for example, `-32768 + -17488` will yield `15280` (assuming 16 bits integers). One possible solution is to compute the carry; if there’s a carry, there’s an overflow. While the carry is readily available to the CPU, C doesn’t help much here; you’ll have to write your own:

int will_overflow(int16_t i, int16_t j) { int16_t s=i+j; int32_t s32=i+j; return s!=s32; }

Your will_overflow function doesn’t really work, it returns “yes, it will overflow” for something as simple as -32768 and 0. I think there was a formula for checking for overflow in Hacker’s Delight, I’ll take a look when I’ll finally find this book.

The original

is clearly defective, as you rightly point out. I will fix that to:

Does what it should. I’ve been looking in Hacker’s Delight and what is presented there seems to depends on the sign of the operands (leaving two branches of tests.) It does have the merit of not necessitating bigger registers to perform its tests.

I found it:

No branches ;)

Well, the above code is for addition, for subtraction the formula is ((i^j)&(difference^i))>>15.

By the way, could you fix the tags (and possibly merge the comments), please?

Excellent!

You may find this interesting: http://channel9.msdn.com/Shows/Going+Deep/David-LeBlanc-Inside-SafeInt

It’s about integer wrapper types that automatically detects any overflow or divide by zero. It looks very useful. I wish it was included in boost.

I always avoided this issue by just assigning it to a large number instead of int_max. for example, if you know the value is unlikely to be more than 10,000,000 you can just put 9999999 or something.

Almost as bad as contracting “may not” to “main’t”