## (Sub)bit-fields (Coding with fractions of bits, Part II)

Last week, we used the 6×7×6 palette as an example of very simple fraction-of-a-bit coding1. However, we can generalize still a bit more to allow single field extraction and modification In the 6×7×6 palette example, we have used the formula $c=(6\times{7})r+6g+b$

to encode an 676-RGB triplet into a single value. The inverse was given by $\displaystyle b=c \mod{}6$, $\displaystyle g=\lfloor\frac{c}{6}\rfloor \mod{}$, $\displaystyle r=\lfloor\frac{c}{42}\rfloor$.

That may give the impression that we must encode/decode the entire triplet each time. Certainly, the inverse gives us the means to extract only one of the values—by using any of the three equations! But what if I wanted to rewrite only the blue component? $c=(c-(c \mod{6})+b$?

Clearly, that subtract the old blue ( $c\mod{6}$) and replaces it by the new blue, $b$. What if we wanted to rewrite the green? I we did $c=(c-(c \mod{42}))+ g*6 + (c\mod{6})$,

that would indeed change only the blue since $c-(c\mod{42})$ is the red component only, (what’s subtracting the green and blue, but we could have used $c=42\lfloor c/42\rfloor$, an equivalent to x=(x>>a)<<a to set the lower bits to zero), and we add back the green and blue.

Let’s look at the general case now. Let $n_1$, $n_2$, … $n_m$ be the numbers of values the $m$ fields can take. Let also $\displaystyle p_k = \prod_{i=1}^{k-1}n_i$, $\displaystyle p_1 = 1$,

be the product of the number of values of the fields that precede the $k$th, or, in other words, the number of combination of all possible values of the fields that precede the $k$th.

To extract the $k$th field, we compute: $\displaystyle f_k=\lfloor\frac{c}{p_k}\rfloor\mod n_k$.

The first $p_k$ shifts the desired value in the least significant bits and $\mod n_k$ extracts it. That’s equivalent to (x>>shift_bits)&mask_bits, but using potentially a fractional number of bits for the shift, and a fractional number of bits for the mask.

To set a new value $f$ for the $k$th field, we compute: $c=(c-(c \mod p_{k+1}))+ f\cdot p_k+(c \mod p_k)$.

The first part, $(c-(c \mod p_{k+1}))$ contains all the fields “above” the $k$th, fields $k+1$ to $m$. The last part, $(c \mod p_k)$ contains the value of all fields before the $k$th. Finally, $f\cdot p_k$ is the new value shifted in the $k$th place.

*
* *

Let’s see what the code to do this looks like. I’ll use initializer lists once more to pass the $n$s to the function, mostly because it’s convenient, and you can do your own container-agnostic version from it quite easily.

The most difficult part is to compute the $p_k$. We basically compute the products of the $n$ up to the $(k-1)$th field. Extracting a code becomes:

int get( int c, int f, const std::initializer_list<int> & n)
{
int preds=1,ff=0;
auto pn=n.begin();

while (ff++<f) preds*=*pn++;

return (c/preds) % *pn;
}


Setting a new code asks for the computation of $p_k$ and $p_{k+1}=n_k\cdot{}p_k$. The code is:

void set( int & c, int f, int v, const std::initializer_list<int> & n)
{

int preds=1,ff=0,one_more;
auto pn=n.begin();

while (ff++<f) preds*=*pn++;

one_more=preds**pn;

c=(c-(c % one_more)) + (v*preds) + (c % preds);
}


In the code, c is the code for all fields, f is the field number, and v the new value. A much better version would use compile-time techniques, like tuple’s get and set functions to allow the compiler to optimize everything away.

Use would be something like this:

// fields: 11,3,4,5,12 values.
const std::initializer_list<int> params{11,3,4,5,12};

int c=3*11*3+2*11+7; // values: 7,2,3,0,0
std::cout << get(c,1,params) << std::endl;
set(c,2,1,params);

std::cout << get(c,2,params) << std::endl;
std::cout << get(c,3,params) << std::endl;



This outputs:

2
1
0


which is what’s expected.

*
* *

What are the expected savings? Well, very obviously, that depends on the number of values for each field. If they’re all powers of two, the method becomes a complicated way of doing classical shifts and masking; if they’re not, the savings can be important. Let’s take the field values in the example above.

In the classical (integer number of bits) bit-fields, encoding five fields with 11, 3, 4, 5, and 12 possible values each would as for 4, 2, 2, 3, and 4 bits (because, for example, $2^3\leqslant{}11<2^4$, therefore 4 bits). That's a total of 4+2+2+3+4=15 bits.

The largest possible value the (sub)bit-field can take is $11\times{3}\times{4}\times{5}\times{12}-1=7919$, that is, 7920 different combinations. We have $\log_{2}7920=12.951\ldots$. Since the language doesn’t allow us to use 12.951 bits, the best we can do is to store the value in a 13-bits bit-field. That 2 bits less than the naive encoding.

*
* *

While 2 bits doesn’t seem much, that leaves us two more bits to pack in the same (sub)bit-field, and we can use it to store something else. It doesn’t have to be an in-memory (sub)bit-field, we could also use it to store stuff in a file, in some bit-stream where alignment isn’t that important.

1I didn’t say arithmetic coding because that refers, of course, to a much more elaborate coding technique.

### 2 Responses to (Sub)bit-fields (Coding with fractions of bits, Part II)

1. Isaac Morton says:

Hey, just found your blog from the recent cpp talk you did on minimizing code footprint. I immediately became fascinated with the concept of sub-bitfields and went searching for more information. This blog is honestly a goldmine, Thanks!

• Steven Pigeon says:

Thanks!