The Frivolous Theorem of Arithmetic

24/11/2009

There’s a theorem that, although trivial to state, is of paramount importance in many areas, including data compression. I’m talking about the frivolous theorem of arithmetic, of course. The theorem takes many forms, one of which is:

Almost all natural numbers are very, very, very large.


In other words, there are a lot more big numbers than there are small numbers. Of course, this is trivially self-evident. But this trivial theorem can serve as a brutal reality check for many hypotheses. For example, one can use the frivolous theorem of arithmetic to disprove the existence of a lossless data compression method that compresses all inputs to smaller bit strings.



Of staircases and textbooks

27/10/2009

This week, I’m telling you about a little identity that crops up often in the study of algorithms and that isn’t found in formula compendia, at least none that I have. I’m talking about this function:

\displaystyle C_{n,a}=\sum_{i=1}^n i a^i

This is a variation on Gabriel’s Staircase function, but with a finite number of terms. Let us find a closed form without supposing that 0<a<1.



Cargo Cult Programming (part 1)

13/10/2009

Programmers aren’t always the rational beings they like to believe they are. Very often, we close our eyes and make decisions based on what we think we know, and on what we have been told by more or less reliable sources, such as picking red-black trees over AVL trees because “they are faster,” without quite being able to justify in detail why that must be so. Programming driven by this kind of decision is what I call cargo cult programming.


Originally, I wanted to talk about red-black vs. AVL trees and how they compare, but instead I’ll talk about the STL std::map, which is implemented using red-black trees in G++ 4.2, and std::unordered_map, a hash-table-based container introduced in TR1.



Generating Random Sequences (part I)

29/09/2009

Every once in a while, we need a random sequence. Whether you’re testing a data structure’s performance or running probabilistic unit tests, the rand primitive provided by your favorite programming language has several limitations. First, it’s been known for a while that most implementations of the C standard library’s rand() function are not very random, despite being “good enough” in a great many cases. Second, and more importantly, it doesn’t let you easily control the period, nor generate a permutation of 0\ldots n-1, for example.


There are many methods of generating pseudo-random number sequences. Not all exhibit the same properties and, accordingly, a method may be very useful in one case and perfectly useless in another. High-quality pseudo-random number generation is a notoriously hard endeavor, but there are a number of very simple algorithms that will get you out of trouble for certain specific tasks. Consider, for example, the case where the pseudo-random generator must produce each of the numbers 0\ldots{}n-1 exactly once, in a random order, of course, in exactly n draws.



Suggested Reading: The LaTeX Companion

04/09/2009

Frank Mittelbach, Michel Goossens, Johannes Braams, David Carlisle, Chris Rowley — The LaTeX Companion — 2nd ed, Addison Wesley, 2006, 1090 pp. ISBN 0-201-36299-6


I should have told you about this book a long time ago. The LaTeX Companion is the definitive guide to LaTeX, ideal for anyone using it on a daily basis (or almost, as I do) or anyone wanting to learn LaTeX. LaTeX is a complex and sophisticated mark-up language aimed at producing better typography for mathematics and scientific work, at which it totally succeeds. As with Linux, LaTeX (and TeX) comes in many distributions, some geared more toward the humanities, others toward science, and still others toward exquisite “art” typesetting.

A must read for graduate students.



Suggested Reading: The Drunkard’s Walk: How Randomness Rules our Lives

29/08/2009

Leonard Mlodinow — The Drunkard’s Walk: How Randomness Rules our Lives — Vintage, 2008, 252 pp. ISBN 978-0-307-27517-2


For those who are interested in (but not already familiar with) probability and statistics, I most certainly recommend this book. Mlodinow presents the basic concepts of probability and statistics through concrete everyday examples, and visually, whenever possible, rather than through classical mathematical notation. He also discusses the psychological factors that make us so bad at estimating probabilities and understanding statistics. The concepts he presents are deep, but the style is fluid and makes The Drunkard’s Walk an easy read.

[…] the human understanding, once it has adopted an opinion, collects any instances that confirm it, and though the contrary instances may be more numerous and more weighty, it either does not notice them or else rejects them, in order that this opinion will remain unshaken.

Francis Bacon, 1620


Filtering Noise (Part I)

25/08/2009

If you own a car, you have probably noticed that the speedometer needle’s position varies only relatively slowly, regardless of how the car actually accelerates or decelerates. Unless your speedometer is some variation on the eddy-current meter, chances are the noise from the speed sensor isn’t filtered in the analog domain but digitally, by the dashboard’s computer.


Let us have a look at how this filtering could be done.



Suggested Reading: One Hundred Essential Things You Didn’t Know You Didn’t Know: Maths Explains Your World

18/08/2009

John D. Barrow — One Hundred Essential Things You Didn’t Know You Didn’t Know: Maths Explains Your World — Norton, 2008, 284 pp. ISBN 978-0-393-07007-1


This isn’t really a math book; there’s hardly any actual math in it. It’s rather a book about math, or maybe more a book about math in our daily lives, more precisely about things in our daily lives that are indeed solved by math. The reader doesn’t need a very high level of mathematical knowledge to enjoy the read. While it contains several little gems1, it also contains a number of chapters (each of the 100 things you didn’t know you didn’t know gets a standalone chapter) that aren’t all that interesting. Still very much worth the read, though.


1 One I particularly like is the trick proposed by von Neumann to transform a biased coin into a fair coin. Say we have a coin that lands on heads with probability p and on tails with probability 1-p (both possibly quite far from \frac{1}{2}). Von Neumann observed that if you throw the coin twice, it lands twice on heads with probability p^2 and twice on tails with probability (1-p)^2, but heads followed by tails and tails followed by heads both have the same probability, p(1-p). The quantity p(1-p) isn’t \frac{1}{2}, but if each time we throw two consecutive heads or two consecutive tails we “forget” them, keeping only the draws with one head and one tail (HT or TH), we get an unbiased, or fair, coin. It suffices to map HT to heads and TH to tails (or vice versa) and we’re done.


Faster than Bresenham’s Algorithm?

28/07/2009

There are a number of graphics primitives you will use in all kinds of projects. If you’re lucky, you’re working on a “real” platform that comes with a large number of graphics libraries offering abstractions over the graphics primitives of the OS and, ultimately, of its hardware. However, it’s always good to know what’s involved in a particular graphics primitive and how to code it yourself should you need to, either because you don’t have a library to help you, or because including a library only to draw, say, a line within an image buffer would contaminate your project badly.


Lines are something we draw a lot. Perfectly horizontal or vertical lines have very simple algorithms. Arbitrary lines are a bit more complicated to get just right. This week, let us take a look at a few algorithms for drawing lines. First, we’ll discuss a naïve algorithm using floating point. We’ll then have a look at Bresenham’s algorithm, which uses only integer arithmetic. Finally, we’ll show that we can do better than Bresenham if we use fixed-point arithmetic.



Checksums (part I)

21/07/2009

I once worked at a company specializing in embedded electronics for industrial applications. In one particular project, the device communicated with the control computer over an RS-422 cable and seemed to return weird data once in a while, causing unwanted behavior in the computer, whose programming did not provide for this unexpected data. Since the on-board software seemed to be operating properly and did not contain serious bugs, I took it upon myself to test the communication channel. I added a check-sum to the data packet, and it turned out that some packets did indeed come in corrupted, despite the supposedly superior electrical characteristics of the RS-422 link.

After a few days’ work, I had implemented a communication protocol that could detect and repair certain errors, reverting to a retransmit request when the data was too damaged. I then started gathering statistics on the error rate, the number of retransmits, etc., and the spurious behavior on the controller’s side went away. My (metaphorically) pointy-haired boss opposed the modification because “we didn’t have any of these damn transmission errors until you put your fancy code in there.” Of course, this was an epic facepalm moment. I tried to explain that the errors had always been there, except that now they were caught and repaired. Needless to say, it ended badly.


Notwithstanding this absurd episode, I kept using check-sums to validate data whenever no other layer of the protocol took care of transmission errors. So, this week, let us discuss check-sums and other error-detection algorithms.
