03/03/2009
In a previous post, I’ve presented an ad hoc compression method known as digram (digraph, bigram, bigraph) coding that coded pairs of symbols on used codes from the original alphabet yielding a very simple encode/decode and also sufficiently good compression (in the vicinity of 1.5:1 or so). While clearly not on par with state-of-the-art codecs such as Bzip2 or PAQ8P, this simple codec can still buy you extra space, even when using a very slow/simple processor.
This week, I’ll introduce you to another simple algorithm, Move To Front coding.
Read the rest of this entry »
4 Comments |
algorithms, C, data compression, embedded programming | Tagged: binary encoding, C, coding, data compression, Data Structure, efficient code, Efficient data structures, programming, Reverse index |
Permalink
Posted by Steven Pigeon
17/02/2009
Ad hoc compression algorithms always sound somewhat flimsy, but they still have their usefulness whenever the data you have to compress exhibits very special properties or is extremely short. Universal compression algorithms need a somewhat lengthy input to adapt sufficiently to yield good compression. Moreover, you may not have the computing power (CPU speed, RAM and program space) to implement full-blown compressors. In this case also, an ad hoc algorithm may be useful.
Digram (bigram, digraph, bigraph) coding is one of those ad hoc algorithms. Digram coding will compress text (or any source, really) by finding pairs of symbols that appear often and assigning them codes corresponding to used symbols, if any.

Let us see in detail what digram compression is, and how effective it really is.
Read the rest of this entry »
2 Comments |
algorithms, C, data compression, hacks, programming | Tagged: ad hoc, average code length, bigram, bigraph, compression, Digram, digraph, efficient code, simple, text compression |
Permalink
Posted by Steven Pigeon
03/02/2009
Yesterday, at least at the time I am writing this entry, I spoke about the Zune and its time bug that caused so many of the devices to just freeze on boot. It its clear that this bug is the result of poor testing (in addition to a poor implementation of a conversion using loops rather than explicit divisions and moduli) and this week, I’d like to tell you about one of my tricks to debug code.
Read the rest of this entry »
2 Comments |
C, C99, hacks, Life in the workplace, programming | Tagged: #define, C Preprocessor, debug, design by contract, fence, GNU CPP, GNU CPP bug, GNU CPP bug no it's not a bug we won't fix it, macro, Magic Numbers, memory allocation |
Permalink
Posted by Steven Pigeon
27/01/2009
This week, two “quickies”: rounding up and down to the next power of two, and converting efficiently a value to exactly 0 or 1.
Read the rest of this entry »
Leave a Comment » |
algorithms, assembly language, bit twiddling, C, C99, embedded programming, hacks | Tagged: ANSI C, bitwise and, bitwise negation, bitwise operations, rounding down, rounding up, sex, truncating |
Permalink
Posted by Steven Pigeon
13/01/2009
The original Hungarian notation is due to Charles Simonyi who invented it sometimes while he was working at Xerox Palo Alto Research Center—the Xerox PARC that gave us the mouse and the first graphical user interfaces. The basic principle of Hungarian naming convention is to prefix the variables with one or many particles, encoding alternatively its type or its intend. This lets programmer write prgszNames as a variable name, which is perfectly legible to one well versed in Hungarian; however, but looks mostly like gibberish to just anyone else.
I recently changed my mind about the Hungarian naming convention. I don’t think it’s that stupid anymore.
Read the rest of this entry »
Leave a Comment » |
C, C99, Life in the workplace, Object Oriented Programming, programming | Tagged: C, Hungarian Notation, Microsoft, Naming Convention, PARC, POD, PODs, Xerox |
Permalink
Posted by Steven Pigeon
05/01/2009
The faulty code was leaked some time last week, and I’ve looked at it, and, well, it’s surprisingly clean (despite the obvious defect and how they should’ve found it before releasing it (see here)).
Have a look at the ConvertDays routine.
Read the rest of this entry »
3 Comments |
algorithms, C, C99, embedded programming, hacks, Life in the workplace, Operating System, programming | Tagged: Date Conversion, Microsoft, Source Leak, Unit Testing, Zune |
Permalink
Posted by Steven Pigeon
31/12/2008
The few Zune users where alienated today when they discovered that their Zunes just froze during boot. Apparently, a mysterious bug.
The cause of the bug was rapidly diagnosticated. According to Itsnotabigtruck on Zuneboards, the bug can be traced back to a defective date conversion routine. I reproduce here the code for you:
year = ORIGINYEAR; /* = 1980 */
while (days > 365)
{
if (IsLeapYear(year))
{
if (days > 366)
{
days -= 366;
year += 1;
}
}
else
{
days -= 365;
year += 1;
}
}
Can you see what’s just wrong about that code?
Read the rest of this entry »
10 Comments |
algorithms, C, embedded programming, Life in the workplace, programming | Tagged: Date Conversion, Laughingstock, Microsoft, Total Lack of Common Sense, Unit Testing, Zune |
Permalink
Posted by Steven Pigeon
30/12/2008
The cost of virtual functions is often invoked as a reason to C++’s poor performance compared to other languages, especially C. This is an enduring myth that, like most myths, have always bugged me. C++ myths are propagated by individuals that did not know C++ very well, tried it one weekend in 1996, used a bad compiler, knew nothing about optimization switches, and peremptorily declared C++ as fundamentally broken. Well, I must agree that C++ compilers in the mid-90s weren’t all that hot, but in the last fifteen years, a lot have been done. Compilers are now rather good at generating efficient C++ code.
However, the cost of calls, whether or not they are virtual, is not dominated by the the call itself (getting the address to jump to and jumping) but by everything else surrounding the call, like the stack setup and argument passing. Let us debunk that myth by looking at what types of calls are available in C and C++, how they translate to machine code, and see how faster or slower they are relative to each other.
Read the rest of this entry »
22 Comments |
assembly language, C, C99, database, hacks, Object Oriented Programming, programming | Tagged: C, calls, code generation, Compiler, micro-optimization, Object Oriented Programming, optimization |
Permalink
Posted by Steven Pigeon
02/12/2008
In C and C++, the behavior of integer promotions is not the same whether we use signed integers or unsigned integers. In bit twiddling hacks, it is not immediately apparent that it is a problem that may lead to unexpected result whenever code is ported to a different architecture. This is a recurring topic whenever I discuss portability with friends. Very often, they argue that if it works on Windows and 32 bits x86 Linux, that’s pretty much all there is to it.
Of course, I couldn’t disagree more. I have learned the hard way.
Read the rest of this entry »
2 Comments |
algorithms, bit twiddling, C, C99, hacks, programming | Tagged: C, C99, Constants, Safe Types |
Permalink
Posted by Steven Pigeon
25/11/2008
This week, we’ll be following last week’s post, where we looked at type-safe integer constants, with floating point constants, that is, float and double.
Read the rest of this entry »
Leave a Comment » |
bit twiddling, C, hacks, programming | Tagged: C, C99, double, float, floating point, floating point values |
Permalink
Posted by Steven Pigeon