March 21, 2017
The idea of reorganizing data before compression isn’t new. Almost twenty five years ago, Burrows and Wheeler proposed a block sorting transform to reorder data to help compression. The idea here is to try to create repetitions that can be exploited by a second compression engine.

But the Burrows-Wheeler transform isn’t the only possible one. There are a few other techniques to generate (reversible) permutations of the input.

Read the rest of this entry »

Leave a Comment » | algorithms, C-plus-plus, data compression, hacks | Tagged: Burrows, Burrows Wheeler Transform, compression, hash table, Linear probing, Quadratic probing, Transform, Wheeler | Permalink

Posted by Steven Pigeon

March 7, 2017
A rather long time ago, I wrote a blog entry on branchless equivalents of simple functions such as `sex`, `abs`, `min`, `max`. The **S**ing **EX**tension instruction propagates the sign bit in the upper bits, and is typically used in the promotion of, say, a 16 bits signed value into a 32 bits variable.

But this time, I needed something a bit different: I only wanted the sign-extended part. Could I do much better than last time? Turns out, the compiler has a mind of its own.

Read the rest of this entry »

Leave a Comment » | algorithms, bit twiddling, C, C-plus-plus, C99, CPU Architecture, hacks | Tagged: abs, g++, grumpy cat, max, min, sex, sign extension | Permalink

Posted by Steven Pigeon

January 24, 2017
While flipping the pages of a “Win this interview” book—just being curious, not looking for a new job—I saw this seemingly simple question: how would you compute the sum of a series of floats contained in a array? The book proceeded with the simple, obvious answer. But… is it that obvious?

Read the rest of this entry »

4 Comments | algorithms, C, C-plus-plus, C99, hacks | Tagged: float, IEEE 754, Interview, std::accumulate, std::sort | Permalink

Posted by Steven Pigeon

January 10, 2017
Something that used to bug me—used to, because I am so accustomed to work around it that I rarely notice the problem—is that in neither C nor C++ you can use strings (`const char *` or `std::string`) in switch/case statement. Indeed, the switch/case statement works only on integral values (an `enum`, an integral type such as `char` and `int`, or an object type with implicit cast to an integral type). But strings aren’t of integral types!

In pure C, we’re pretty much done for. The C preprocessor is too weak to help us built compile-time expression out of strings (or, more exactly, `const char *`), and there’sn’t much else in the language to help us. However, things are a bit different in C++.

Read the rest of this entry »

9 Comments | algorithms, bit twiddling, C, C-plus-plus, hacks | Tagged: C++11, compile-time, constexpr, hash, hash function, Switch/case | Permalink

Posted by Steven Pigeon

December 27, 2016
Sometime last week, a tweet from @nixCraft prompted the question, quite ironically, how do you get the maximum (largest positive) value for an integer?

Read the rest of this entry »

Leave a Comment » | C, C-plus-plus, C99, Portable Code, programming | Tagged: CHAR_BIT, intptr_t, INT_MAX, ptrdiff_t, size_t, stddef, stdint, uintptr_t | Permalink

Posted by Steven Pigeon

September 6, 2016
Quite a while ago I discussed using flat arrays and address calculations to store a tree in a simple array. The trick wasn’t new (I think it’s due to Williams, 1964 [1]), but is only really practical when we consider heaps, or otherwise very well balanced trees. If you have a misshapen tree, that trick doesn’t help you. It doesn’t help you either if you try to serialize a misshapen tree to disk.

But what if we do want to serialize arbitrarily-shaped trees to disk? Is it painful? Fortunately, no! Let’s see how.

Read the rest of this entry »

Leave a Comment » | algorithms, C-plus-plus, data structures, programming | Tagged: bit, C, Deserialization, Pointers, serialization, Standish, Tree | Permalink

Posted by Steven Pigeon

April 5, 2016
I am still experimenting with hash functions, and I was toying with the Zobrist hash function[1] that is best known for its use in chess engines. The hash function is conceptually simple: you need a large table of random numbers, indexed, in a chess application, by the position on the board of the piece and by the piece itself. To compute a hash for a whole board configuration, you simply xor all the random numbers together. The hard part is choosing the random numbers.

Read the rest of this entry »

Leave a Comment » | C-plus-plus, Cryptography, programming | Tagged: /dev/urandom, hash, hash function, pseudo-random number generator, random, Zobrist | Permalink

Posted by Steven Pigeon