While flipping through the pages of a “Win this interview” book (just out of curiosity, not because I'm looking for a new job), I saw this seemingly simple question: how would you compute the sum of a series of floats contained in an array? The book proceeded with the simple, obvious answer. But… is it that obvious?
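To make the question concrete, here is a minimal sketch contrasting the "obvious" loop with Kahan (compensated) summation. The teaser does not say which alternative the post has in mind, so picking Kahan is my assumption, and the names `naive_sum` and `kahan_sum` are mine.

```cpp
#include <vector>

// The "obvious" answer: add left to right. Every addition rounds,
// and the rounding errors accumulate as the running sum grows.
float naive_sum(const std::vector<float>& values)
{
    float sum = 0.0f;
    for (float v : values)
        sum += v;
    return sum;
}

// Kahan (compensated) summation: a second variable carries an
// estimate of the low-order bits lost by each addition.
float kahan_sum(const std::vector<float>& values)
{
    float sum = 0.0f;
    float compensation = 0.0f; // error carried over from previous additions
    for (float v : values)
    {
        float y = v - compensation;   // re-inject what was lost so far
        float t = sum + y;            // low-order bits of y may be lost here
        compensation = (t - sum) - y; // recover (the negative of) the lost part
        sum = t;
    }
    return sum;
}
```

Summing a large array of identical small values, say a million copies of `0.1f`, is a quick way to see the two functions drift apart. Note that aggressive floating-point optimizations such as `-ffast-math` may algebraically simplify the compensation away.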
In a previous episode, we looked at how we could use random sampling to get a good estimate of π, and found that we couldn’t, really, unless using zillions of samples. The imprecision of the estimate had nothing to do with the “machine precision”, the precision at which the machine represents numbers. The method essentially counted (using size_t, a 64-bit unsigned integer, at least on my box) the number of (random) points inside the circle versus the total number of points drawn.
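As a reminder of what that method looks like, here is a rough sketch, not the code from the earlier post. The function name `estimate_pi`, the choice of `std::mt19937_64`, and the seed parameter are all assumptions of mine.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Draws `samples` points uniformly in the unit square and counts how
// many fall inside the quarter circle of radius 1. That fraction tends
// to pi/4. `samples` must be greater than zero.
double estimate_pi(std::size_t samples, std::uint64_t seed)
{
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    std::size_t inside = 0; // integer counter: no rounding involved here
    for (std::size_t i = 0; i < samples; ++i)
    {
        const double x = uniform(rng);
        const double y = uniform(rng);
        if (x * x + y * y <= 1.0)
            ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

The statistical error shrinks only as the inverse square root of the number of samples, which is why each additional correct digit costs about a hundred times more draws.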
Can we increase the precision of the estimate by using a better method? Maybe something like numerical integration?
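One possible reading of "numerical integration", and it is only my assumption about the scheme, is the midpoint rule applied to the integral of 4/(1+x²) over [0,1], which equals π. The function name `pi_midpoint` is hypothetical.

```cpp
#include <cstddef>

// Midpoint-rule approximation of the integral of 4 / (1 + x^2)
// over [0, 1], whose exact value is pi. `intervals` must be > 0.
double pi_midpoint(std::size_t intervals)
{
    const double h = 1.0 / static_cast<double>(intervals); // sub-interval width
    double sum = 0.0;
    for (std::size_t i = 0; i < intervals; ++i)
    {
        const double x = (static_cast<double>(i) + 0.5) * h; // midpoint
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * h;
}
```

For a smooth integrand like this one, the midpoint rule's error shrinks with the square of the interval width, so it converges much faster than random sampling for the same number of function evaluations.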
The float and double floating-point data types have been present for a long time in the C (and C++) standard. While neither the C nor the C++ standard enforces it, virtually all implementations comply with IEEE 754, or try very hard to. In fact, I do not know, as of today, of an implementation that uses something very different. But the IEEE 754-type floats are aging. GPUs started to add extensions such as short floats for evident reasons. Should we start considering adding new types on both ends of the spectrum?
The next step up, the quadruple precision float, is already part of the standard, but, as far as I know, not implemented anywhere. Intel x86 does have something in between, its 80-bit internal float format, the so-called extended precision, but it’s not really standard, as it is not sanctioned by the IEEE standards, and, surprisingly enough, it is not really well supported by the instruction set. It’s sometimes supported by the long double C type. But, anyway, what’s in a floating-point number?
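As a small illustration of what is in one, the sketch below splits a single-precision value into its three fields. It assumes that `float` is the 32-bit IEEE 754 binary32 format, and the names `float_fields` and `decompose` are mine.

```cpp
#include <cstdint>
#include <cstring>

// The three fields of an IEEE 754 binary32 value.
struct float_fields
{
    std::uint32_t sign;     // 1 bit
    std::uint32_t exponent; // 8 bits, stored with a bias of 127
    std::uint32_t fraction; // 23 bits, implicit leading 1 for normal numbers
};

float_fields decompose(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits); // well-defined way to read the bits
    return { bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu };
}
```

For instance, 1.0f decomposes into sign 0, stored exponent 127 (that is, 2⁰ once the bias is removed) and fraction 0.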
Experiments do not always work as planned. Sometimes you may invest a lot of time into a (sub)project only to get no results, or only moderately interesting ones. Such a (moderately) failed experiment is the topic of this week’s blog post.
Some time ago I wrote a CSV exporter for an application I was developing, and among the fields I needed to export were floating-point values. The application was developed under Visual Studio 2005 and I really didn’t like how VS2005’s printf function handled the formats for floats. To export values losslessly, that is, so that you could read back exactly what you wrote to file, I decided to use the "%e" format specifier for printf. It turned out that it was neither lossless nor minimal!
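One reason "%e" loses information is that, without an explicit precision, it prints only six digits after the decimal point. The sketch below shows one common way to get a lossless round trip for a double. It is not the solution from the post, and it assumes IEEE 754 doubles and a C library whose printf and strtod round correctly. The names `format_lossless` and `parse` are hypothetical.

```cpp
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Prints `value` with max_digits10 significant digits (17 for an
// IEEE 754 double), which is enough to read back the same bits.
std::string format_lossless(double value)
{
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*g",
                  std::numeric_limits<double>::max_digits10, value);
    return buffer;
}

// Reads the text back into a double.
double parse(const std::string& text)
{
    return std::strtod(text.c_str(), nullptr);
}
```

This is lossless but still not minimal: 0.1 comes out with all 17 digits even though "0.1" would read back to the same value. Where available, C++17's `std::to_chars` is designed to produce the shortest representation that round-trips.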