Failed Experiment

11/05/2010

Experiments do not always work as planned. Sometimes you invest a lot of time in a (sub)project only to get no results, or only moderately interesting ones. Such a (moderately) failed experiment is the topic of this week’s blog post.

Some time ago I wrote a CSV exporter for an application I was working on, and among the fields I needed to export were floating point values. The application was developed under Visual Studio 2005, and I really didn’t like how VS2005’s printf function handled the formats for floats. To export values losslessly, that is, so that you could read back exactly what you wrote to file, I decided to use the "%e" format specifier for printf. It turned out to be neither lossless nor minimal!
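
The round-trip problem is easy to reproduce. Here is a minimal sketch (my own illustration, not the code from the original post) showing that a double printed with printf’s default "%e" precision of six digits generally does not read back exactly, while asking for 17 significant digits does:

    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        const double x = 1.0 / 3.0;

        char lossy[64], exact[64];
        std::snprintf(lossy, sizeof(lossy), "%e", x);     // default: 6 digits after the point
        std::snprintf(exact, sizeof(exact), "%.16e", x);  // 17 significant digits in total

        std::printf("%s round-trips: %s\n", lossy,
                    std::strtod(lossy, nullptr) == x ? "yes" : "no");
        std::printf("%s round-trips: %s\n", exact,
                    std::strtod(exact, nullptr) == x ? "yes" : "no");
        return 0;
    }

Seventeen significant digits are always enough for an IEEE 754 double to survive the trip to text and back, but they are not always necessary, which is why a fixed-precision "%e" is not minimal either.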

Read the rest of this entry »


Suggested Reading: Algorithms for Memory Hierarchies

07/03/2010

Ulrich Meyer, Peter Sanders, Jop Sibeyn (eds.) — Algorithms for Memory Hierarchies — Springer (Lecture Notes in Computer Science, LNCS 2625), 2003, 428 pp. ISBN 978-3-540-00883-5

(Buy at Amazon.com)

This book is a collection of chapters on various topics pertaining to memory hierarchies and their algorithms, written by several different authors, without any particular uniformity in tone or topic; that’s fine, though, because it gives the reader a broad view of algorithms for memory hierarchies.

Read the rest of this entry »


Suggested Reading: Advanced Data Structures

07/03/2010

Peter Brass — Advanced Data Structures — Cambridge University Press, 2008, 492 pp. ISBN 978-0-521-88037-4

(Buy at Amazon.com)

The first part of the book concentrates on search trees and their variants, whether balanced trees, interval trees, or heaps. Further chapters are dedicated to connected components and related algorithms, one to algorithms for strings, and one to hash tables. Appendices on computation, cache-oblivious algorithms, and other topics follow.

Read the rest of this entry »


Is Python Slow?

10/11/2009

Python is a programming language that I learned somewhat recently (two or three years ago) and that I like very much. It is simple, to the point, and has several functional-style constructs I am already familiar with. But Python is slow compared to other programming languages, and it was unclear to me just how slow. It just felt slow.

So I decided to investigate by comparing implementations of a simple, compute-bound problem: the eight queens puzzle, generalized to any board dimension. This puzzle is most easily solved, as Dijkstra did, with a depth-first backtracking program that uses bitmaps to test rapidly whether a square is free of attack1. I implemented the same program in C++, Python, and Bash, and got help from friends for the C# and Java versions2. I then compared the resulting speeds.
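
For the curious, the heart of such a solver is quite small. Here is a rough C++ sketch of the bitmap backtracking idea (my own illustration, not one of the benchmarked programs, and the exact bitmap manipulations in the post may differ):

    #include <cstdint>
    #include <cstdio>

    // Counts the solutions of the n-queens puzzle, one row at a time.
    // cols, ldiag, rdiag are bitmaps of the columns and diagonals under attack.
    static std::uint64_t solve(int n, int row, std::uint32_t cols,
                               std::uint32_t ldiag, std::uint32_t rdiag)
    {
        if (row == n) return 1;                                // all rows placed

        std::uint64_t count = 0;
        std::uint32_t free_squares = ~(cols | ldiag | rdiag) & ((1u << n) - 1);

        while (free_squares)
        {
            std::uint32_t bit = free_squares & -free_squares;  // lowest attack-free square
            free_squares ^= bit;
            count += solve(n, row + 1,
                           cols | bit,
                           (ldiag | bit) << 1,                 // diagonals shift with each row
                           (rdiag | bit) >> 1);
        }
        return count;
    }

    int main()
    {
        for (int n = 4; n <= 12; n++)
            std::printf("%2d queens: %llu solutions\n", n,
                        (unsigned long long)solve(n, 0, 0, 0, 0));
        return 0;
    }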

Read the rest of this entry »


Cargo Cult Programming (part 1)

13/10/2009

Programmers aren’t always the very rational beings they like to believe they are. Very often, we close our eyes and make decisions based on what we think we know, and on what we have been told by more or less reliable sources: choosing red-black trees rather than AVL trees because they are faster, for example, without quite being able to justify in detail why that must be so. Programming based on this kind of decision is what I call cargo cult programming.

Originally, I wanted to talk about red-black vs. AVL trees and how they compare, but instead I’ll talk about the STL std::map, which is implemented using red-black trees in G++ 4.2, and std::unordered_map, a hash-table-based container introduced in TR1.
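
To make the comparison concrete, a benchmark only needs to fill both containers and time lookups. A small sketch of such a harness follows (using the C++11 names rather than the std::tr1:: ones from the post; the actual measurements depend on compiler, standard library, and key distribution):

    #include <chrono>
    #include <cstdio>
    #include <map>
    #include <random>
    #include <unordered_map>
    #include <vector>

    static volatile long sink;  // keeps the lookup loop from being optimized away

    template <typename Map>
    static double time_lookups(const Map & m, const std::vector<int> & keys)
    {
        const auto start = std::chrono::steady_clock::now();
        long hits = 0;
        for (int k : keys) hits += (m.find(k) != m.end());
        sink = hits;
        return std::chrono::duration<double>(std::chrono::steady_clock::now() - start).count();
    }

    int main()
    {
        std::mt19937 gen(42);
        std::uniform_int_distribution<int> dist(0, 1 << 20);

        std::map<int, int> tree;            // red-black tree: O(lg n) lookups
        std::unordered_map<int, int> hash;  // hash table: expected O(1) lookups
        std::vector<int> keys;

        for (int i = 0; i < 1000000; i++)
        {
            const int k = dist(gen);
            tree[k] = hash[k] = i;
            keys.push_back(k);
        }

        std::printf("std::map           : %.3f s\n", time_lookups(tree, keys));
        std::printf("std::unordered_map : %.3f s\n", time_lookups(hash, keys));
        return 0;
    }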

Read the rest of this entry »


Powers of Ten (so to speak)

29/06/2009

I am not sure whether you are old enough to remember the 1977 IBM movie Powers of Ten (trippy version, without narration) [also at IMDB and Wikipedia], but that’s a movie that sure puts things in perspective. Thinking in terms of powers of ten helps me sort things out when I am considering a design problem. Thinking of the scale of a problem in terms of physical scale is a good way to assess its true importance for a project. Sometimes the problem is the one to solve; sometimes it is not. Just because a problem is fun, enticing, or challenging doesn’t mean it has to be solved optimally right away, because in the correct context, considering its true scale, it may not be as important as first thought.

Maybe comparing problems’ scales to powers of ten in the physical realm helps you understand where to put your efforts. So here are the different scales and what I think they should contain:

Read the rest of this entry »


Sorting Lists (part II)

23/06/2009

Last week I showed you the radix sort on singly linked lists. This week, I will present a version of QuickSort modified to sort singly linked lists.
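
To give an idea of the shape of such an adaptation (a sketch of mine under simple assumptions, not the code from the post): pick a pivot, split the rest of the list into a "smaller" and a "not smaller" list in one forward pass, sort both recursively, then splice the pieces back together.

    #include <cstdio>

    struct node { int key; node * next; };

    // Quicksort on a singly linked list: the head is used as pivot, the rest of
    // the list is partitioned into two lists, each is sorted recursively, and
    // the three pieces are concatenated.
    static node * list_quicksort(node * head)
    {
        if (!head || !head->next) return head;

        node * pivot = head;
        node * less = nullptr, * more = nullptr;

        for (node * p = head->next; p; )
        {
            node * next = p->next;
            node ** dst = (p->key < pivot->key) ? &less : &more;
            p->next = *dst;
            *dst = p;
            p = next;
        }

        less = list_quicksort(less);
        pivot->next = list_quicksort(more);

        if (!less) return pivot;
        node * tail = less;                     // append pivot (and the rest) after "less"
        while (tail->next) tail = tail->next;
        tail->next = pivot;
        return less;
    }

    int main()
    {
        node nodes[5] = { {3, nullptr}, {1, nullptr}, {4, nullptr}, {1, nullptr}, {5, nullptr} };
        for (int i = 0; i < 4; i++) nodes[i].next = &nodes[i + 1];

        for (node * p = list_quicksort(nodes); p; p = p->next)
            std::printf("%d ", p->key);
        std::printf("\n");
        return 0;
    }

Taking the head as pivot keeps the sketch short but degenerates to quadratic time on an already sorted list; the version in the post may choose its pivot differently.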

Read the rest of this entry »


Sorting Linked Lists (part I)

16/06/2009

The sorting algorithms we were taught in class were typically simplified versions that assumed the data resided in memory, in contiguous locations, also known as an array. What we often have, however, are linked lists. Generalizing algorithms like QuickSort to lists is in fact not very hard if we modify the basic algorithm a little.

For lists with numerical keys (or values), there is a simpler algorithm that scans the list in only one direction, so that singly linked lists are a perfectly cromulent data structure: the radix sort.
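
As a taste of the idea, here is a small sketch of a least-significant-bit-first binary radix sort on a singly linked list of unsigned keys (my own illustration and assumptions, not necessarily the variant presented in the post):

    #include <climits>
    #include <cstdio>

    struct node { unsigned key; node * next; };

    // LSB-first binary radix sort: for each bit, split the list into a 0-list
    // and a 1-list in a single forward pass, then concatenate them. Each pass
    // is stable, so after the last pass the list is sorted.
    static node * list_radix_sort(node * head)
    {
        for (unsigned bit = 0; bit < sizeof(unsigned) * CHAR_BIT; bit++)
        {
            node * part[2] = { nullptr, nullptr };   // heads of the 0- and 1-lists
            node * tail[2] = { nullptr, nullptr };

            for (node * p = head; p; p = p->next)
            {
                const int b = (p->key >> bit) & 1u;
                if (tail[b]) tail[b]->next = p; else part[b] = p;
                tail[b] = p;
            }
            if (tail[1]) tail[1]->next = nullptr;
            if (tail[0]) tail[0]->next = part[1]; else part[0] = part[1];
            head = part[0];
        }
        return head;
    }

    int main()
    {
        node nodes[5] = { {170, nullptr}, {45, nullptr}, {75, nullptr}, {90, nullptr}, {2, nullptr} };
        for (int i = 0; i < 4; i++) nodes[i].next = &nodes[i + 1];

        for (node * p = list_radix_sort(nodes); p; p = p->next)
            std::printf("%u ", p->key);
        std::printf("\n");
        return 0;
    }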

Read the rest of this entry »


Not Losing the Perspective

19/05/2009

My first true computer, a TI 99/4a (which I also talk about in a previous entry), had 16 or 48 KB of RAM, depending on whether or not you had one of the memory expansion cartridges, and that limited quantity of memory severely curbed the complexity of the programs one could write on the machine. However, the limited complexity of the programs, the relative crudeness of the development environment (a BASIC shell), and the slow execution speeds weren’t very obvious to me back then. They were somewhat mitigated by the novelty of the computer itself as a machine, and by the perpetual, intense excitement of discovery. The arduous work of designing programs to save memory, to fit more graphics or more code, or even just to get them to work at all, was less about constraint than challenge.

The same kind of constraints—or challenge—followed me over the years as I moved on to different computers. Despite their being more powerful, both faster and with more memory, the challenge remained, because while the computers got better, so did I at programming. I kept asking more of the machine, writing increasingly complex programs needing more memory or more speed, often both. That meant better algorithms, better data structures, and better implementations1.

Read the rest of this entry »


Compact Tree Storage

07/04/2009

Implementing data structures in a way that uses memory efficiently should always be on your mind. I do not mean going overboard and micro-optimizing memory allocation right down to the bit. I mean organizing data structures in memory so that you can avoid pointers, use contiguous memory segments, etc. Normally, minimizing storage by avoiding extra pointers whenever possible will benefit your program in at least two ways.

First, the reduced memory requirement will make your data structure fit in cache more easily. Remember that while pointers are 4 bytes long in 32-bit programming, they are 8 bytes long in 64-bit environments. This yields better run-time performance because you maximize your chances of having the data you need in cache.

Second, contiguous memory layouts also allow for efficient scans of data structures. For example, if you have a classical binary tree, implemented using nodes that each hold two pointers, you have to use a tree traversal algorithm, possibly recursive, to enumerate the tree’s contents, which is quite cumbersome if you don’t really care about the order in which the nodes are visited.

It turns out that for a special class of trees, complete trees, there is a contiguous, and quite simple, layout.
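
The classical example is the implicit array layout used by binary heaps: store the nodes level by level, so that the children of the node at index i are found at indices 2i+1 and 2i+2. Here is a small sketch of that layout (a simple illustration under that assumption; the post may present a more general scheme):

    #include <cstdio>
    #include <vector>

    // Implicit storage for a complete binary tree: nodes are kept in level
    // order in a vector, child and parent links are computed from indices,
    // and no pointers are stored at all.
    struct implicit_tree
    {
        std::vector<int> nodes;  // level-order (breadth-first) storage

        static std::size_t left  (std::size_t i) { return 2 * i + 1; }
        static std::size_t right (std::size_t i) { return 2 * i + 2; }
        static std::size_t parent(std::size_t i) { return (i - 1) / 2; }

        bool has(std::size_t i) const { return i < nodes.size(); }
    };

    // In-order traversal using only index arithmetic; an unordered scan of the
    // tree's content is just a linear pass over the vector.
    static void in_order(const implicit_tree & t, std::size_t i)
    {
        if (!t.has(i)) return;
        in_order(t, implicit_tree::left(i));
        std::printf("%d ", t.nodes[i]);
        in_order(t, implicit_tree::right(i));
    }

    int main()
    {
        implicit_tree t;
        t.nodes = { 1, 2, 3, 4, 5, 6 };  // a complete tree rooted at 1
        in_order(t, 0);                  // prints 4 2 5 1 6 3
        std::printf("\n");
        return 0;
    }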

Read the rest of this entry »