11/05/2010
Experiments do not always work as planned. Sometimes you may invest a lot of time in a (sub)project only to get no results, or only moderately interesting ones. Such a (moderately) failed experiment is the topic of this week’s blog post.

Some time ago I wrote a CSV exporter for an application I was working on and, amongst the fields I needed to export, were floating-point values. The application was developed under Visual Studio 2005, and I really didn’t like how VS2005’s printf function handled the formats for floats. To export values losslessly, that is, so that you could read back exactly what you wrote to the file, I decided to use the "%e" format specifier for printf. It turned out to be neither lossless nor minimal!
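The culprit: "%e" prints only six digits after the decimal point by default (seven significant digits), while an IEEE-754 double needs up to 17 significant digits to round-trip exactly. Here is a minimal sketch (not the original exporter’s code) contrasting the default "%e" with "%.17g", which does round-trip:

    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        const double x = 1.0 / 3.0;
        char buf[64];

        // "%e" defaults to 6 digits after the point: lossy for doubles.
        std::snprintf(buf, sizeof buf, "%e", x);
        std::printf("%%e    : %-22s round-trips: %s\n", buf,
                    std::strtod(buf, nullptr) == x ? "yes" : "no");

        // 17 significant digits round-trip any IEEE-754 double.
        std::snprintf(buf, sizeof buf, "%.17g", x);
        std::printf("%%.17g : %-22s round-trips: %s\n", buf,
                    std::strtod(buf, nullptr) == x ? "yes" : "no");

        return 0;
    }

Note that "%.17g" is lossless but still not minimal: it can print more digits than a given value actually needs.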
07/03/2010
Ulrich Meyer, Peter Sanders, Jop Sibeyn (eds.) — Algorithms for Memory Hierarchies — Springer (Lecture Notes in Computer Science, LNCS #2625), 2003, 428 pp. ISBN 978-3540-00883-5

This book is a collection of chapters on various topics pertaining to memory hierarchies and their algorithms, written by several different authors without any particular uniformity in tone or topic; but that’s OK, because it allows the reader to get a broad view of algorithms for memory hierarchies.
07/03/2010
Peter Brass — Advanced Data Structures — Cambridge University Press, 2008, 492 pp. ISBN 978-0521-88037-4

The first part of the book concentrates on search trees and their variants, whether balanced trees, interval trees, or heaps. Further chapters are dedicated to connected components and similar algorithms, to algorithms for strings, and to hash tables. Appendices follow on models of computation, cache-oblivious algorithms, and related topics.
10/11/2009
Python is a programming language that I learnt somewhat recently (two or three years ago) and that I like very much. It is simple, to the point, and has several functional-style constructs that I am already familiar with. But Python is slow compared to other programming languages, and it was unclear to me just how slow. It just felt slow.

So I decided to investigate by comparing implementations of a simple, compute-bound problem: the eight queens puzzle, generalized to boards of any size. This puzzle is most easily solved using, as Dijkstra did, a depth-first backtracking program, with bitmaps to test rapidly whether or not a square is free of attack. I implemented the same program in C++, Python, and Bash, and got help from friends for the C# and Java versions. I then compared the resulting speeds.
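As a sketch of the bitmap technique (this is not the benchmark program itself), here is a C++ version of the backtracking solver: three bit masks record the columns and the two diagonals under attack, and the lowest set bit of the mask of free squares gives the next placement to try.

    #include <cstdio>

    // Count solutions by depth-first backtracking, one row at a time.
    // cols, ldiag, and rdiag are bitmaps of the columns and diagonals
    // already under attack from queens placed in previous rows.
    unsigned long solve(int n, int row,
                        unsigned cols, unsigned ldiag, unsigned rdiag)
    {
        if (row == n) return 1;                // all queens placed

        unsigned long count = 0;
        unsigned avail = ~(cols | ldiag | rdiag) & ((1u << n) - 1);

        while (avail)
        {
            unsigned bit = avail & -avail;     // lowest unattacked square
            avail -= bit;
            count += solve(n, row + 1,
                           cols | bit,         // this column is now taken...
                           (ldiag | bit) << 1, // ...and so are both diagonals,
                           (rdiag | bit) >> 1);// shifted to align with the next row
        }
        return count;
    }

    int main()
    {
        for (int n = 4; n <= 12; ++n)
            std::printf("%2d-queens: %lu solutions\n", n, solve(n, 0, 0, 0, 0));
        return 0;
    }

For n = 8, this prints the expected 92 solutions.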
13/10/2009
Programmers aren’t always the very rational beings they like to believe they are. Very often, we close our eyes and make decisions based on what we think we know, and on what we have been told by more or less reliable sources. For example, we pick red-black trees rather than AVL trees because “they are faster,” without being able to quite justify why that must be so. Programming driven by this kind of decision is what I call cargo cult programming.

Originally, I wanted to talk about red-black vs. AVL trees and how they compare, but I will rather talk about the STL std::map, which is implemented using red-black trees in G++ 4.2, and std::unordered_map, a hash-table-based container introduced in TR1.
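As a rough illustration of how such a comparison can be set up (this is not the benchmark from the post, and it uses the C++11 name std::unordered_map rather than TR1’s std::tr1::unordered_map), the sketch below times insertions and lookups of the same random keys in both containers:

    #include <chrono>
    #include <cstdio>
    #include <map>
    #include <random>
    #include <unordered_map>
    #include <vector>

    // Time insertion plus lookup of the same keys in a given map type.
    template <typename Map>
    double time_map(const std::vector<int> &keys)
    {
        static volatile long long sink;        // defeats dead-code elimination
        const auto start = std::chrono::steady_clock::now();

        Map m;
        for (int k : keys) m[k] = k;           // insertions
        long long sum = 0;
        for (int k : keys) sum += m.find(k)->second;  // lookups

        const auto stop = std::chrono::steady_clock::now();
        sink = sum;
        return std::chrono::duration<double>(stop - start).count();
    }

    int main()
    {
        std::mt19937 gen(42);                  // fixed seed for repeatability
        std::uniform_int_distribution<int> dist(0, 1 << 30);
        std::vector<int> keys(1000000);
        for (int &k : keys) k = dist(gen);

        std::printf("std::map           : %.3fs\n",
                    time_map<std::map<int, int>>(keys));
        std::printf("std::unordered_map : %.3fs\n",
                    time_map<std::unordered_map<int, int>>(keys));
        return 0;
    }

The tree gives O(lg n) ordered operations; the hash table gives expected O(1) lookups but no ordering, which is exactly the trade-off worth measuring rather than assuming.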
29/06/2009
I am not sure if you are old enough to remember the 1977 IBM movie Powers of Ten (trippy version, without narration) [also at the IMDB and Wikipedia], but that’s a movie that sure put things in perspective. Thinking in terms of powers of ten helps me sort things out when I am considering a design problem, and thinking of the scale of a problem in terms of physical scale is a good way to assess its true importance for a project. Sometimes the problem is the one to solve; sometimes it is not. Just because a problem is fun, enticing, or challenging does not mean it must be solved optimally right away: in the proper context, considering its true scale, it may not be as important as first thought.

Maybe comparing a problem’s scale to powers of ten in the physical realm helps in understanding where to put your efforts. So here are the different scales, and what I think each should contain.
23/06/2009
Last week I showed you the radix sort on simply linked lists. This week, I will present a version of QuickSort modified to sort simply linked lists.
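The key observation is that partitioning only ever walks the list forward. A minimal sketch (not the post’s implementation): detach the head as pivot, distribute the remaining nodes into a “less” list and a “not-less” list, sort both recursively, and splice the pieces back together.

    #include <cstdio>

    struct node { int key; node *next; };

    // Append list b after list a; either may be empty.
    node *concat(node *a, node *b)
    {
        if (!a) return b;
        node *t = a;
        while (t->next) t = t->next;
        t->next = b;
        return a;
    }

    node *quicksort(node *head)
    {
        if (!head || !head->next) return head;

        node *pivot = head;                    // first node as pivot (simple, not optimal)
        node *less = nullptr, *more = nullptr;

        for (node *p = head->next, *n; p; p = n)
        {
            n = p->next;
            if (p->key < pivot->key) { p->next = less; less = p; }
            else                     { p->next = more; more = p; }
        }
        pivot->next = quicksort(more);         // pivot leads the sorted "not-less" part
        return concat(quicksort(less), pivot);
    }

    int main()
    {
        // build the list 3 -> 1 -> 4 -> 1 -> 5 -> 9 -> 2 -> 6
        int v[] = {3, 1, 4, 1, 5, 9, 2, 6};
        node *head = nullptr;
        for (int i = 7; i >= 0; --i) head = new node{v[i], head};

        for (node *p = quicksort(head); p; p = p->next) std::printf("%d ", p->key);
        std::printf("\n");                     // demo only: nodes are never freed
        return 0;
    }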
16/06/2009
The sorting algorithms we were taught in class were typically simplified versions that assume the data remains in memory, in contiguous locations, that is, in an array. What we often have, however, are linked lists. Generalizing algorithms like QuickSort to lists is in fact not very hard if we modify the basic algorithm a little.
For lists with numerical keys (or values), there is an even simpler algorithm that scans the list in only one direction, making simply linked lists a perfectly cromulent data structure: the radix sort.
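A minimal sketch of the idea (mine, not the post’s code): a binary LSD radix sort that, for each bit from least to most significant, stably distributes the nodes into a 0-bucket and a 1-bucket and then reconcatenates them. Every pass scans the list strictly forward, which is all a simply linked list can offer.

    #include <climits>
    #include <cstdio>

    struct node { unsigned key; node *next; };

    node *radix_sort(node *head)
    {
        for (unsigned bit = 0; bit < sizeof(unsigned) * CHAR_BIT; ++bit)
        {
            node *bucket[2] = {nullptr, nullptr};  // heads of the 0- and 1-lists
            node *tail[2]   = {nullptr, nullptr};

            for (node *p = head, *n; p; p = n)     // stable distribution pass
            {
                n = p->next;
                p->next = nullptr;
                int b = (p->key >> bit) & 1u;
                if (tail[b]) tail[b]->next = p; else bucket[b] = p;
                tail[b] = p;
            }
            // reconcatenate: all the 0s, then all the 1s
            if (tail[0]) { tail[0]->next = bucket[1]; head = bucket[0]; }
            else         head = bucket[1];
        }
        return head;
    }

    int main()
    {
        unsigned v[] = {170, 45, 75, 90, 2, 802, 24, 66};
        node *head = nullptr;
        for (int i = 7; i >= 0; --i) head = new node{v[i], head};

        for (node *p = radix_sort(head); p; p = p->next) std::printf("%u ", p->key);
        std::printf("\n");
        return 0;
    }

Stability of each pass is what makes the bit-by-bit approach correct: earlier (less significant) orderings survive the later passes.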
19/05/2009
My first true computer, a TI 99/4a (which I also talked about in a previous entry), had 16 or 48 KB of RAM, depending on whether or not you had one of the memory expansion cartridges, and that limited quantity of memory severely curbed the complexity of the programs one could write on the machine. However, the limited complexity of the programs, the relative crudeness of the development environment (a BASIC shell), and the slow execution speeds weren’t very obvious to me back then. They were somewhat mitigated by the novelty of the computer itself as a machine, and by the perpetual, intense excitement of discovery. The arduous design of programs to save memory, to fit in more graphics or more code, or even to get our programs to work at all, was less about constraint than about challenge.
The same kind of constraints, or challenges, followed me over the years as I moved on to different computers. Despite their being more powerful, both faster and sporting more memory, the challenge remained, because while the computers got better, so did I at programming. I kept asking more of the machine, writing increasingly complex programs needing either more memory or more speed, and often both. That meant better algorithms, better data structures, and better implementations.
07/04/2009
Implementing data structures in a way that uses memory efficiently should always be on your mind. I do not mean going overboard and micro-optimizing memory allocation right down to the bit; I mean organizing data structures in memory so that you can avoid pointers, use contiguous memory segments, and so on. Normally, minimizing storage by avoiding extra pointers whenever possible will benefit your program in at least two ways.
First, the reduced memory requirement will make your data structure fit in cache more easily. Remember that while pointers are 4 bytes long in a 32-bit environment, they are 8 bytes long in a 64-bit one. This yields better run-time performance because you maximize your chances of having the data you need in cache.
Second, contiguous memory layouts also allow for efficient scans of data structures. For example, if you have a classical binary tree implemented using nodes with two pointers each, you will have to use a tree traversal algorithm, possibly recursive, to enumerate the tree’s contents, which is quite cumbersome if you don’t really care about the order in which the nodes are visited.
It turns out that for a special class of trees, complete trees, there is a contiguous, and quite simple, layout.
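This is the layout made famous by Williams’ 1964 heap sort: store the nodes in level order in an array, so that the children of the node at index i sit at indices 2i+1 and 2i+2, and its parent at (i-1)/2. A small sketch of my own, using 0-based indexing:

    #include <cstdio>
    #include <vector>

    // Contiguous layout for a complete binary tree: no pointers at all.
    struct flat_tree
    {
        std::vector<int> nodes;                    // level-order storage

        static size_t left(size_t i)   { return 2 * i + 1; }
        static size_t right(size_t i)  { return 2 * i + 2; }
        static size_t parent(size_t i) { return (i - 1) / 2; }
    };

    int main()
    {
        // the complete tree   1
        //                    / \
        //                   2   3
        //                  / \  /
        //                 4  5 6
        flat_tree t;
        t.nodes = {1, 2, 3, 4, 5, 6};

        // order-free enumeration is just a scan of the array
        for (int x : t.nodes) std::printf("%d ", x);
        std::printf("\n");

        // navigation without pointers
        std::printf("left child of root : %d\n", t.nodes[flat_tree::left(0)]);
        std::printf("parent of index 4  : %d\n", t.nodes[flat_tree::parent(4)]);
        return 0;
    }

Besides saving two pointers per node, the arithmetic addressing keeps the whole structure in one cache-friendly block, which is why the binary heap uses exactly this layout.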