## Shallow Constitude

March 19, 2013

In programming languages, there are constructs that are of little pragmatic importance (that is, they do not really affect how code behaves or what code is generated by the compiler) but are of great “social” importance as they instruct the programmer as to what contract the code complies to.

One of those constructs in C++ is the const (and other access modifiers) that explicitly states to the programmer that this function argument will be treated as read-only, and that it’s safe to pass your data to it, it won’t be modified. But is it all but security theater?

## Compressed Series (Part II)

March 12, 2013

Last week we looked at an alternative series to compute $e$, and this week we will have a look at the computation of $e^x$. The usual series we learn in calculus textbook is given by

$\displaystyle e^x=\sum_{n=0}^\infty \frac{x^n}{n!}$

We can factorize the expression as

## Float16

February 12, 2013

The possible strategies for data compression fall into two main categories: lossless and lossy compression. Lossless compression means that you retrieve exactly what went in after compression, while lossy means that some information was destroyed to get better compression, meaning that you do not retrieve the original data, but only a reasonable reconstruction (for various definitions of “reasonable”).

Destroying information is usually performed using transforms and quantization. Transforms map the original data onto a space were the unimportant variations are easily identified, and on which quantization can be applied without affecting the original signal too much. For quantization, the first approach is to simply reduce precision, somehow “rounding” the values onto a smaller set of admissible values. For decimal numbers, this operation is rounding (or truncating) to the $n$th digit (with $n$ smaller than the original precision). A much better approach is to minimize an explicit error function, choosing the smaller set of values in a way that minimizes the expected error (or maximum error, depending on how you formulate your problem).

## Fat, Slim Pointers

July 31, 2012

64 bits address space lets us access tons more memory than 32 bits, but with a catch: the pointers themselves are … well, yes, 64 bits. 8 bytes. Which eventually pile up to make a whole lot of memory devoted to pointers if you use pointer-rich data structures. Can we do something about this?

Well, in ye goode olde dayes of 16 bits/32 bits computing, we had some compilers that could deal with near and far pointers; the near, 16-bit pointers being relative to one of the segments, possibly the stack segment, and the far, 32-bits pointers being absolute or relative to a segment. This, of course, made programming pointlessly complicated as each pointer was to be used in its correct context to point to the right thing.

## Faster Collatz

May 1, 2012

Quite a while ago, I presented the Collatz conjecture and I was then interested in the graphical representation of the problem—and not really going anywhere with it.

In this entry, let us have a look at the implementation of the Collatz function.

## Trigonometric Tables Reconsidered

February 28, 2012

A couple of months ago (already!) 0xjfdube produced an excellent piece on table-based trigonometric function computations. The basic idea was that you can look-up the value of a trigonometric function rather than actually computing it, on the premise that computing such functions directly is inherently slow. Turns out that’s actually the case, even on fancy CPUs. He discusses the precision of the estimate based on the size of the table and shows that you needn’t a terrible number of entries to get decent precision. He muses over the possibility of using interpolation to augment precision, speculating that it might be slow anyway.

I started thinking about how to interpolate efficiently between table entries but then I realized that it’s not fundamentally more complicated to compute a polynomial approximation of the functions than to search the table then use polynomial interpolation between entries.

## (Random Musings) On Floats and Encodings

January 31, 2012

The float and double floating-point data types have been present for a long time in the C (and C++) standard. While neither the C nor C++ standards do not enforce it, virtually all implementations comply to the IEEE 754—or try very hard to. In fact, I do not know as of today of an implementation that uses something very different. But the IEEE 754-type floats are aging. GPU started to add extensions such as short floats for evident reasons. Should we start considering adding new types on both ends of the spectrum?

The next step up, the quadruple precision float, is already part of the standard, but, as far as I know, not implemented anywhere. Intel x86 does have something in between for its internal float format on 80 bits, the so-called extended precision, but it’s not really standard as it is not sanctioned by the IEEE standards, and, generally speaking, and surprisingly enough, not really supported well by the instruction set. It’s sometimes supported by the long double C type. But, anyway, what’s in a floating point number?

## Medians (Part III)

January 24, 2012

So in the two previous parts of this series, we have looked at the selection algorithm and at sorting networks for determining efficiently the (sample) median of a series of values.

In this last installment of the series, I consider an efficient (but approximate) algorithm based on heaps to compute the median.

## Medians (Part II)

January 10, 2012

In the previous post of this series, we left off where we were asking ourselves if there was a better way than the selection algorithm of finding the median.

Computing the median of three numbers is a simple as sorting the three numbers (an operation that can be done in constant time, after all, if comparing and swapping are constant time) and picking the middle. However, if the objects compared are “heavy”, comparing and (especially) moving them around may be expensive.

## Medians (Part I)

December 27, 2011

In a previous installment, about filtering noise, we discussed how to use a moving average to even out irregularities in a sequence. Averaging over a window is tricky. First, the window size must make sense in terms of the speed at which the signal changes (in samples), and the average is (overly) sensitive to outliers.

One way to limit the influence of the outliers for denoising is to use the median. However, computing the median is usually more tricky than computing the average, or mean, and this first post (in a series of three, in the next few weeks), discusses how to compute the median efficiently using the selection algorithm.