just_enough

The C99 <stdint.h> header provides a plethora of type definition for platform-independent safe code: int_fast16_t, for example, provides an integer that plays well with the machine but has at least 16 bits. The int_fastxx_t and int_leastxx_t defines doesn’t guarantee a tight fit, they provide an machine-efficient fit. They find the fastest type of integer for that machine that respects the constraint.

But let’s take the problem the other way around: what about defines that gives you the smallest integer, not for the number of bits (because that’s trivial with intxx_t) but from the maximum value you need to represent?

There are plenty of reasons to use the smallest possible integer to store values with known bounds: smaller files, smaller data structures, more useful data in the same cache line. The C99 <stdint.h> isn’t much use here because it lacks metaprogramming, or more exactly, the C preprocessor and old-style macros are too weak to provide the kind of metaprogramming we need here:

  • A compile-time function that tells us how many bits are needed to represent a max value;
  • A template that takes a number of bits and decides one the smallest integer accommodating it.

The first constexpr function gives the number of bits to represent a maximum value, that is, it doesn’t think you need 5000 values (0-4999) but that you need to represent the value 5000 (0-5000). So if the maximum value is 4, you do need 3 bits (because 4 is 1002); so it’s not quite log-base-2 (and don’t come whine in the comments).

////////////////////////////////////////
// c++14 and +
constexpr std::size_t bits_from_value(std::size_t n)
 {
  if (n)
   return (n<2)?1:(1+bits_from_value(n/2));
  else
   return 0;
 }

Now, let’s create a template that takes a number of bits and decides on the smallest integer that accommodates that number of bits:

////////////////////////////////////////
// should also do signed...
template <int x> struct __just_enough_uint; // incomplete type
template <> struct __just_enough_uint<64> { using type = uint64_t; };
template <> struct __just_enough_uint<32> { using type = uint32_t; };
template <> struct __just_enough_uint<16> { using type = uint16_t; };
template <> struct __just_enough_uint<8>  { using type =  uint8_t; };

template <const int x>
using just_enough_uint =
 typename
  __just_enough_uint<(x>32) ? 64 : ((x>16) ? 32 : ((x>8) ? 16 : 8))>::type;

////////////////////////////////////////
template <typename T>
 struct bits_from_type { constexpr static size_t value = sizeof(T)*8; };

You may have to extend the above if your architecture has more possible types; but you should be fine with 8, 16, 32, and 64 bits. To create a variable of the just_enough_uint type, you use:

 just_enough_uint<bits_from_value(13)> z; //enough for max value 13 (0...13)

*
* *

This is but a small brick in a much wider scheme to save memory (or storage). Compact data structures have been a research interest for while now and I’ve done a few things before. However, a more algorithmic approach is needed, as a couple of clever tricks, while helpful, aren’t a complete theory. More on this later.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: