Fibonacci rabbits as a rewrite system


In my discrete mathematics class, I often use the Fibonacci rabbits example: once to show how to solve a recurrence, once as a variant where some rabbits go away, and once more for rewrite systems.


What are rewrite systems? Not unlike context-free grammars, they provide rules that generate “words” in a “language”. It turns out that the Fibonacci rabbit population problem is very easy to model as a rewrite system.
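
As a rough illustration, here is a minimal sketch of one way to model the rabbits as a rewrite system. The symbols `y` (a young pair) and `m` (a mature pair) and the two rules are assumptions chosen for this example; the full entry may use a different notation. Each generation, every young pair matures and every mature pair survives and produces a young pair.

```python
# Assumed rule set (not taken from the post): 'y' = young pair, 'm' = mature pair.
RULES = {"y": "m", "m": "my"}


def rewrite(word, rules=RULES):
    """Apply the rules to every symbol of the word in parallel."""
    return "".join(rules[symbol] for symbol in word)


def generations(start="y", count=8):
    """Return the first `count` words, starting from `start`."""
    words = [start]
    for _ in range(count - 1):
        words.append(rewrite(words[-1]))
    return words


if __name__ == "__main__":
    for word in generations():
        print(len(word), word)
```

With these rules, the length of the word at each generation is the number of rabbit pairs, and the lengths follow the Fibonacci sequence.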


Medians (Part I)


In a previous installment, about filtering noise, we discussed how to use a moving average to even out irregularities in a sequence. Averaging over a window is tricky. First, the window size must make sense in terms of the speed at which the signal changes (in samples). Second, the average is (overly) sensitive to outliers.

One way to limit the influence of outliers when denoising is to use the median. However, computing the median is usually trickier than computing the average, or mean, and this first post (in a series of three, over the next few weeks) discusses how to compute the median efficiently using the selection algorithm.
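
To make the sensitivity to outliers concrete, here is a small sketch of a plain moving average. The function name `moving_average` and the sample values are made up for this example and are not taken from the earlier post.

```python
def moving_average(signal, window):
    """Return the mean of every full window of `window` consecutive samples."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(signal[i:i + window]) / window
        for i in range(len(signal) - window + 1)
    ]


if __name__ == "__main__":
    # A flat signal with a single outlier in the middle.
    print(moving_average([1, 1, 100, 1, 1], 3))
```

A single bad sample contaminates every window that contains it, so the outlier is smeared over `window` output values rather than removed.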
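
As an illustration of the idea, here is a minimal sketch of median-by-selection using a quickselect variant with a random pivot. This is not necessarily the version developed in the full post; the names `select` and `median`, and the choice of returning the lower median for even-length input, are assumptions made for this example.

```python
import random


def select(values, k):
    """Return the k-th smallest element (0-based) in expected linear time."""
    items = list(values)
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        smaller = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(smaller):
            # The answer lies among the elements below the pivot.
            items = smaller
        elif k < len(smaller) + len(equal):
            # The answer is the pivot itself.
            return pivot
        else:
            # Discard everything up to and including the pivot's copies.
            k -= len(smaller) + len(equal)
            items = [x for x in items if x > pivot]


def median(values):
    """Lower median for even-length input (an assumed convention)."""
    return select(values, (len(values) - 1) // 2)
```

Unlike sorting the whole window and reading the middle element, selection only recurses into the part that contains the sought rank, which is where the expected linear time comes from.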




QuickSort, due to Hoare, is one of the best comparison-based sorting algorithms around. There are a number of variations, but the vanilla QuickSort does a great job… most of the time.

QuickSort Animation (Source: Wikipedia)

Indeed, there are occasions where QuickSort leaves its expected O(n \lg n) run-time to reach O(n^2) run-time, its worst case. But what does it take to push QuickSort into its worst behaviour?
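
One well-known way to trigger the worst case is to feed already-sorted input to a QuickSort that always picks the first element as pivot. The sketch below counts comparisons for such a variant; the pivot rule and the name `quicksort_comparisons` are assumptions for this example, and the full entry may construct its bad inputs differently.

```python
def quicksort_comparisons(values):
    """Sort a copy of `values` with a first-element-pivot QuickSort.

    Returns (sorted_list, number_of_comparisons_against_a_pivot).
    """
    items = list(values)
    comparisons = 0
    stack = [(0, len(items) - 1)]  # explicit stack, so deep splits cannot overflow
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = items[lo]
        boundary = lo  # last index of the "smaller than pivot" zone
        for i in range(lo + 1, hi + 1):
            comparisons += 1
            if items[i] < pivot:
                boundary += 1
                items[boundary], items[i] = items[i], items[boundary]
        # Put the pivot between the two zones.
        items[lo], items[boundary] = items[boundary], items[lo]
        stack.append((lo, boundary - 1))
        stack.append((boundary + 1, hi))
    return items, comparisons


if __name__ == "__main__":
    _, count = quicksort_comparisons(range(100))
    print(count)
```

On sorted input of size n, every partition peels off only the pivot, so the comparison count is (n-1) + (n-2) + … + 1 = n(n-1)/2, which is quadratic.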


Suggested Reading: Algorithms for Memory Hierarchies


Ulrich Meyer, Peter Sanders, Jop Sibeyn (eds.) — Algorithms for Memory Hierarchies — Springer (Lecture Notes in Computer Science, LNCS # 2625), 2003, 428 pp. ISBN 978-3540-00883-5


This book is a collection of chapters on various topics pertaining to memory hierarchies and their algorithms, written by several different authors without any particular uniformity in tone or topics. That’s OK, because it gives the reader a broad view of algorithms for memory hierarchies.


Suggested Reading: Advanced Data Structures


Peter Brass — Advanced Data Structures — Cambridge University Press, 2008, 492 pp. ISBN 978-0521-88037-4


The first part of the book concentrates on search trees and variants, whether balanced trees, interval trees, or heaps. Other chapters are dedicated to connected components and similar algorithms, one to algorithms for strings, and one to hash tables. Appendices follow on computation, cache-oblivious algorithms, etc.


The Frivolous Theorem of Arithmetic


There’s a theorem that, although trivial in its formulation, is of paramount importance in many things, including data compression. I’m talking about the frivolous theorem of arithmetic, of course. The theorem takes many forms, one being:

Almost all natural numbers are very, very, very large.


In other words, there are a lot more big numbers than there are small numbers. Of course, this is trivially self-evident. But this trivial theorem can serve as a brutal reality check for many hypotheses. For example, one can use the frivolous theorem of arithmetic to disprove the existence of a lossless data compression method that compresses all inputs to smaller bit strings.
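
The counting behind that disproof can be checked directly. The sketch below (function names are made up for this example) counts the bit strings of length exactly n and the bit strings that are strictly shorter, including the empty string.

```python
def strings_of_length(n):
    """Number of distinct bit strings of length exactly n."""
    return 2 ** n


def strings_shorter_than(n):
    """Number of distinct bit strings of length 0, 1, ..., n - 1."""
    return sum(2 ** k for k in range(n))


if __name__ == "__main__":
    for n in (1, 8, 16):
        print(n, strings_of_length(n), strings_shorter_than(n))
```

There are 2^n inputs of length n but only 2^n - 1 strictly shorter outputs, so any scheme that shortens every input must map two different inputs to the same output, and then it cannot be decoded losslessly.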


Cargo Cult Programming (part 1)


Programmers aren’t always the very rational beings they like to believe they are. Very often, we close our eyes and make decisions based on what we think we know, and on what we have been told by more or less reliable sources. For example, we choose red-black trees rather than AVL trees because they are faster, without being able to justify in detail why it must be so. Programming using this kind of decision is what I call cargo cult programming.


Originally, I wanted to talk about red-black vs. AVL trees and how they compare, but I’d rather talk about the STL std::map, which is implemented using red-black trees in G++ 4.2, and std::unordered_map, a hash-table-based container introduced in TR1.


Checksums (part I)


I once worked in a company specializing in embedded electronics for industrial applications. In one particular project, the device communicated with the computer through an RS-422 cable and seemed to return weird data once in a while, causing unwanted behavior in the control computer, whose programming did not provide for this unexpected data. So I took it upon myself to test the communication channel, as it seemed that the on-board software was operating properly and did not contain serious bugs. I added a check-sum to the data packet, and it turned out that some packets did indeed arrive corrupted, despite the supposedly superior electrical characteristics of the RS-422 link.

After a few days’ work, I had implemented a communication protocol that could detect and repair certain errors, falling back to a retransmission request if the data was too damaged. I then started gathering statistics on the error rate, the number of retransmissions, etc., and the spurious behavior on the controller’s side went away. My (metaphorically) pointy-haired boss opposed the modification because “we didn’t have any of these damn transmission errors until you put your fancy code in there”. Of course, this was an epic facepalm moment. I tried to explain that the errors had always been there, except that now they were caught and repaired. Needless to say, it ended badly.


Notwithstanding this absurd episode, I kept using check-sums to validate data whenever no other layer of the protocol took care of transmission errors. So, this week, let us discuss check-sums and other error detection algorithms.
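
To fix ideas, here is a minimal sketch of about the simplest check-sum there is: the sum of the payload bytes modulo 256, appended to the packet. It is not the protocol described in the anecdote, and the function names (`additive_checksum`, `frame`, `verify`) are made up for this example.

```python
def additive_checksum(payload):
    """Sum of all payload bytes, reduced modulo 256."""
    return sum(payload) % 256


def frame(payload):
    """Return the payload with its one-byte check-sum appended."""
    return bytes(payload) + bytes([additive_checksum(payload)])


def verify(packet):
    """Return True if the trailing byte matches the check-sum of the rest."""
    if len(packet) < 1:
        return False
    payload, received = packet[:-1], packet[-1]
    return additive_checksum(payload) == received


if __name__ == "__main__":
    packet = frame(b"hello")
    damaged = bytes([packet[0] ^ 0x01]) + packet[1:]  # flip one bit
    print(verify(packet), verify(damaged))
```

Such a sum catches any single corrupted byte, but it cannot see bytes that were merely reordered, since addition does not depend on order. That weakness is one reason to look at stronger error detection codes.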


Powers of Ten (so to speak)


I am not sure if you are old enough to remember the 1977 IBM movie Powers of Ten (trippy version, without narration) [also at the IMDB and Wikipedia], but that’s a movie that sure put things in perspective. Thinking in terms of powers of ten helps me sort things out when I am considering a design problem. Thinking of the scale of a problem in terms of physical scale is a good way to assess its true importance for a project. Sometimes the problem is the one to solve; sometimes it is not. Just because a problem is fun, enticing, or challenging does not mean that it has to be solved optimally right away: in the correct context, considering its true scale, it may not be as important as first thought.


Maybe comparing problems’ scales to powers of ten in the physical realm helps you understand where to put your efforts. So here are the different scales and what I think they should contain:


Sorting Lists (part II)


Last week I showed you the radix sort on singly linked lists. This week, I will present a version of QuickSort modified to sort singly linked lists.
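
For readers who want a rough picture before the full entry, here is a minimal sketch of a QuickSort that works by relinking nodes rather than swapping array elements. The `Node` class, the helper names, and the choice of the first node as pivot are all assumptions for this example, not the version presented in the post.

```python
class Node:
    """A singly linked list cell."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def from_iterable(values):
    """Build a list from an iterable and return its head (or None)."""
    head = None
    for value in reversed(list(values)):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect the values of a linked list into a Python list."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out


def quicksort_list(head):
    """Sort a singly linked list by relinking its nodes; return the new head."""
    if head is None or head.next is None:
        return head
    pivot, node = head, head.next
    pivot.next = None
    smaller = larger = None
    while node is not None:
        following = node.next
        # Push the node onto the front of the appropriate sublist.
        if node.value < pivot.value:
            node.next, smaller = smaller, node
        else:
            node.next, larger = larger, node
        node = following
    smaller = quicksort_list(smaller)
    pivot.next = quicksort_list(larger)
    if smaller is None:
        return pivot
    tail = smaller
    while tail.next is not None:
        tail = tail.next
    tail.next = pivot  # splice: smaller -> pivot -> larger
    return smaller
```

Partitioning a list needs no random access: each node is simply pushed onto one of two sublists. This sketch is recursive and not stable, and a long already-sorted list would make the recursion as deep as the list is long.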
