Size(_t) matters!

Sometime last week, a tweet from @nixCraft prompted the question, quite ironically, how do you get the maximum (largest positive) value for an integer?

[Image: max_int]

Well, there are many wrong ways to find out. The one suggested in the tweet is hilariously inefficient and, were it C or C++ instead of C#, would expect a definite result from undefined behavior. Yes, i+1 overflowing past the largest int and wrapping to a negative value is undefined behavior! (See §5, ¶4 of ISO/IEC 14882:2011(E), the C++11 standard.) But let’s see:

int nono()
 {
  int i=0;
  while ((i+1)>0)
   i++;
   
  return i;
 }

In -O0 mode, that is, with optimizations disabled, the code takes about 4 seconds to spew out the right answer (for my computer, anyway): 2147483647. The code generated is somewhat as expected:

0000000000400866 <_Z4nonov>:
400866: 55                      push   rbp
400867: 48 89 e5                mov    rbp,rsp
40086a: c7 45 fc 00 00 00 00    mov    DWORD PTR [rbp-0x4],0x0
400871: 8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
400874: 83 c0 01                add    eax,0x1
400877: 85 c0                   test   eax,eax
400879: 7e 06                   jle    400881 <_Z4nonov+0x1b>
40087b: 83 45 fc 01             add    DWORD PTR [rbp-0x4],0x1
40087f: eb f0                   jmp    400871 <_Z4nonov+0xb>
400881: 8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
400884: 5d                      pop    rbp
400885: c3                      ret    

It does the ordinary stuff: the stack-frame preamble, i set to zero in its stack slot at [rbp-0x4], the loop computing i+1 in eax, and the result returned (implicitly) in eax.

Now, with -O3, all optimizations:

00000000004008e0 <_Z4nonov>:
4008e0: eb fe                   jmp    4008e0 <_Z4nonov>

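Bummer! Because signed overflow is undefined, the compiler is entitled to assume that (i+1)>0 always holds, and the whole function collapses into an infinite loop.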
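If the intent is merely to detect that an int is about to overflow, the test can be written so that the overflowing addition never happens. A minimal sketch, assuming <limits> is available; can_increment and saturating_increment are hypothetical helper names, not anything from the tweet:

```cpp
#include <limits>

// Hypothetical helper: true when i + 1 is still representable as an int.
// The comparison is made BEFORE adding, so no signed overflow can occur.
constexpr bool can_increment(int i)
{
  return i < std::numeric_limits<int>::max();
}

// Hypothetical helper: adds one, but stays at the largest int instead of
// overflowing (which would be undefined behavior).
constexpr int saturating_increment(int i)
{
  return can_increment(i) ? i + 1 : i;
}
```

Unlike (i+1)>0, the comparison in can_increment is well defined for every int, so the optimizer has no license to remove it.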

*
* *

So, what is the right answer?

Use the compiler and the language itself, and its standard library if necessary. For the original code, which methinks is C#, int.MaxValue (that is, Int32.MaxValue) is the right answer, always. For C or C++, we must use the limits headers. Only if, for some obscure reason, these aren’t available should you resort to the black arts. For example ((~0u)>>1), which only works if unsigned int has exactly one more value bit than int (no, that’s not actually guaranteed by the standards for C and C++).
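A short sketch of the three approaches in C++, side by side. The function names are made up for illustration. The bit trick is included only for comparison: it matches INT_MAX on mainstream platforms, where unsigned int has exactly one more value bit than int, and that is an assumption rather than a guarantee.

```cpp
#include <climits>  // INT_MAX
#include <limits>   // std::numeric_limits

// The C++ way: ask the standard library.
constexpr int max_int_cpp()
{
  return std::numeric_limits<int>::max();
}

// The C way: the macro from <limits.h> (here <climits>).
constexpr int max_int_c()
{
  return INT_MAX;
}

// The "black arts" way: all bits set, shifted right by one.
// ASSUMPTION: unsigned int has exactly one more value bit than int.
constexpr unsigned max_int_bit_trick()
{
  return (~0u) >> 1;
}
```

The first two always agree. The third agrees only where the stated assumption holds, which is why it should stay a last resort.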

Type safety is a major issue in software development, especially when language features vary over platforms, something programming languages such as Java and C# work hard to eliminate and protect you from, but C and C++ thoroughly exploit to generate efficient code.

So, what’s the right method of getting what, exactly? Let’s refer to this table:

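what                   | C                                                              | C++
-----------------------|----------------------------------------------------------------|-----------------------------------------------------------
Bits per char          | CHAR_BIT in <limits.h>                                         | std::numeric_limits<unsigned char>::digits in <limits>
Max int (signed) value | INT_MAX in <limits.h>                                          | std::numeric_limits<int>::max() in <limits>
Size of memory block   | size_t in <stddef.h>                                           | size_t in <cstddef>
File offset            | off_t in <sys/types.h> (POSIX)                                 | std::streampos in <iostream>
Pointers as integers   | intptr_t and uintptr_t in <stdint.h>, ptrdiff_t in <stddef.h>  | intptr_t and uintptr_t in <cstdint>, ptrdiff_t in <cstddef>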

These are guaranteed by the standards (POSIX, in the case of off_t) to be the right types and values. Use them.
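To make a couple of the table's rows concrete, here is a small sketch that sticks to size_t and CHAR_BIT. The helpers bits_in and byte_count_fits are hypothetical names introduced for this example only.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t
#include <limits>   // std::numeric_limits

// Hypothetical helper: number of bits occupied by an object of type T.
template <typename T>
constexpr std::size_t bits_in()
{
  return sizeof(T) * CHAR_BIT;
}

// Hypothetical helper: true when count elements of element_size bytes each
// can be described by a size_t without the multiplication wrapping around.
constexpr bool byte_count_fits(std::size_t count, std::size_t element_size)
{
  return element_size == 0
      || count <= std::numeric_limits<std::size_t>::max() / element_size;
}
```

Because sizes are kept in size_t from start to finish, nothing depends on how wide int happens to be on a given platform.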

*
* *

Type safety and portability are among the things I insist on when I teach programming. The reflex is always to use int, float or some other naïve type to store just about anything. File position? Never mind that the offsets are typically 64 bits, use a 32-bit int, then complain the code behaves weirdly when the file is over 2 gigabytes ¯\_(ツ)_/¯.
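The file-offset trap is easy to show with plain arithmetic. A sketch, assuming the fixed-width types std::int32_t and std::int64_t exist (they are optional in the standard, though present on every mainstream platform); fits_in_int32 and gibibytes are hypothetical helpers:

```cpp
#include <cstdint>  // std::int32_t, std::int64_t
#include <limits>   // std::numeric_limits

// Hypothetical helper: true when a 64-bit file offset survives being
// stored in a signed 32-bit integer.
constexpr bool fits_in_int32(std::int64_t offset)
{
  return offset >= std::numeric_limits<std::int32_t>::min()
      && offset <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: n GiB expressed in bytes, computed in 64 bits.
constexpr std::int64_t gibibytes(std::int64_t n)
{
  return n * 1024 * 1024 * 1024;
}
```

An offset of 1 GiB fits, but 2 GiB is already one past the largest signed 32-bit value, so any offset at or beyond that point is mangled when squeezed into such an int.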
