July 31, 2012
64 bits address space lets us access tons more memory than 32 bits, but with a catch: the pointers themselves are … well, yes, 64 bits. 8 bytes. Which eventually pile up to make a whole lot of memory devoted to pointers if you use pointer-rich data structures. Can we do something about this?

Well, in ye goode olde dayes of 16 bits/32 bits computing, we had some compilers that could deal with near and far pointers; the near, 16-bit pointers being relative to one of the segments, possibly the stack segment, and the far, 32-bits pointers being absolute or relative to a segment. This, of course, made programming pointlessly complicated as each pointer was to be used in its correct context to point to the right thing.
Read the rest of this entry »
Leave a Comment » |
bit twiddling, C, C-plus-plus, hacks | Tagged: 32 bits, 64 bits, AMD64, Far Pointer, Near Pointer, optimization, Pointer, Pointer Arithmetic |
Permalink
Posted by Steven Pigeon
February 14, 2012
We do not know closed form solutions for all optimization problems, even when they are somewhat innoccent-looking. One of the many possible methods in such as case is to use (stochastic) gradient descent to iteratively refine the solution to the problem. This involves the computation of… yes, the gradient.

In its simplest form, the gradient descent algorithm computes the gradient of an objective function relative to the parameters, evaluated on all training examples, and uses that gradient to adjust the model’s parameters. The gradient of a (not necessarily objective) function
has the general form
Read the rest of this entry »
Leave a Comment » |
algorithms, machine learning, Mathematics | Tagged: Calculus, Derivative, gradient descent, Intregral, Leibniz, objective function, optimization, stochastic gradient descent |
Permalink
Posted by Steven Pigeon
November 15, 2011
Today I am going to talk about my day job a bit. Contrary to previous jobs, a good part (but not all) of what I do now is either public domain or open-source. Two the projects I joined recently are Theano and PyLearn.

Theano is a mathematical expression compiler that maps expressions described in Python to machine-efficient code, either targeting the CPU or the GPU. PyLearn is a work in progress that aims to provide a comprehensive machine-learning framework for Theano.
Read the rest of this entry »
Leave a Comment » |
machine learning, Mathematics, programming, Python, Science | Tagged: ATLAS, CUDA, optimization, PyLearn, Theano |
Permalink
Posted by Steven Pigeon
September 13, 2011
I still can’t figure out exactly which operations are expensive in Python. My C/C++ can’t help me much because it seems that things aren’t implemented like I’d've expected—like lists that aren’t lists, but array lists, leading to
for operations you would otherwise expect to be
.

But a friend of mine—Olivier—showed me a simple, basic, yet rather effective tool to profile Python programs (I’m not sure if I should say script or not).
Read the rest of this entry »
6 Comments |
Bash (Shell), Life in the workplace, programming, Python | Tagged: easy_install, Kcachegrind, optimization, Profiling, SquareMap, Trace, Valgrind |
Permalink
Posted by Steven Pigeon
January 18, 2011
There are times when part of the message, the gist, must be communicated to the reader in an out-of-band fashion, so to speak. One way of doing this is to use an epigraph to open a chapter or section, carefully chosen to convey the intended message but in the voice of another author (self-epigraphs are of very bad taste in my opinion).

is the preferred document preparation system of computer scientists, physicists, and mathematicians and if you intend to follow a career into the academia, it’s pretty much unavoidable. One day, you’ll have to learn
. The thing is,
is pretty much like C++: it can do just about anything, but it’s not going to help you do it. You have to rely on the innumerable packages or, if you really can’t find what you need, you can code it yourself. Let us have a look on how to code an epigraph macro in
.
Read the rest of this entry »
5 Comments |
hacks, LaTeX, Life in the workplace, Science, Zen | Tagged: Epigraph, Knuth, LaTeX, optimization |
Permalink
Posted by Steven Pigeon
December 4, 2010
Randy Allen, Ken Kennedy — Optimizing Compilers for Modern Architectures — Morgan Kaufmann, 2002, 790 pp. ISBN 1-55860-286-0

(Buy at Amazon.com)
The book presents all the high-performance and vectorizing optimizations a compiler should be able to perform on source code while using trade-offs from the underlying architecture (with considerations such as the memory hierarchy and the instruction set) and the semantics of the language.
Read the rest of this entry »
Leave a Comment » |
C, programming, Suggested Reading | Tagged: C, Compiler, Fortran, optimization, pseudo-code, pseudocode |
Permalink
Posted by Steven Pigeon
June 29, 2009
I am not sure if you are old enough to remember the 1977 IBM movie Powers of Ten (trippy version, without narration) [also at the IMDB and wikipedia], but that’s a movie that sure put things in perspective. Thinking in terms of powers of ten helps me sort things out when I am considering a design problem. Thinking of the scale of a problem in terms of physical scale is a good way to assess its true importance for a project. Sometimes the problem is the one to solve, sometimes, it is not. It’s not because a problem is fun, enticing, or challenging, that it has to be solved optimally right away because, in the correct context, considering its true scale, it may not be as important as first thought.

Maybe comparing problems’ scales to powers of ten in the physical realm helps understanding where to put your efforts. So here are the different scales and what I think they should contain:
Read the rest of this entry »
1 Comment |
algorithms, assembly language, bit twiddling, CPU Architecture, data structures, Design, hacks, Instruction Sets, Life in the workplace, Object Oriented Programming, Operating System, Portable Code, programming, theoretical computer science, Zen | Tagged: 1977, atomic, bit twiddling, branch prediction, C Standard Library, class, classes, coding, compatibility, CPU, ecosystem, global, graphical user interface, GUI, IBM, instruction, instruction set, interoperability, macroscopic, mesoscopic, methods, micro-code, micro-instruction, micro-optimization, microscopic, molecular, networking, Object Oriented Programming, Operating System, optimization, out of order execution, POD, powers of ten, premature optimization, registers, speculative execution, string, subatomic, system |
Permalink
Posted by Steven Pigeon
May 19, 2009
My first true computer, a TI 99/4a (which I talk about also in a previous entry), had 16 or 48 KB of RAM, depending whether or not you had one of the memory expansion cartridges, and that limited quantity of memory severely curbed the complexity of the programs one could write on the machine. However, the limited complexity of the programs, the relative crudeness of the development environment (a BASIC shell) and the slow execution speeds weren’t very obvious to me back then. They were somewhat mitigated by the novelty of the computer itself as a machine, and by the perpetual intense excitement of discovery. The arduous design of programs to save memory, fit more graphics or more code, or even getting our programs to work at all was less about constraints than challenge.
The same kind of constraints—or challenge—followed me over the years as I moved on to different computers. Despite their being more powerful, both faster and sporting more memory, the challenge remained there because while the computers got better, so did I at programming. I kept asking more out of the machine, writing increasingly complex programs needing either more memory or more speed, often both. That meant better algorithms, better data structures, and better implementations1.
Read the rest of this entry »
Leave a Comment » |
algorithms, bit twiddling, data structures, embedded programming, hacks, Life in the workplace, programming, Zen | Tagged: bad design, computer, Daisy Owl, frugal programming, laziness, memory usage, novelty, optimization, TI-99/4A |
Permalink
Posted by Steven Pigeon
May 9, 2009
Riccardo Poli, William B. Langdon, Nicholas F. McPhee — A Field Guide to Genetic Programming — Lulu, 2008, 240 pp. ISBN 978-1-4092-0073-4

(Buy at Amazon.com)
This is not an ordinary textbook as it does not follow the expected pattern but is rather an extensive survey of the field of genetic programming. Each chapter introduces a major concept or issue in genetic programming and covers the subject in a rather authoritative way, supported by copious documentation—the last 57 pages of the books are occupied by the bibliography.
Read the rest of this entry »
Leave a Comment » |
algorithms, Artificial Intelligence, machine learning, Mathematics, Suggested Reading | Tagged: cross-over, Darwin, elitism, genetic programming, heuristics, mutation, optimization, pareto, pareto optimality, selection, survey |
Permalink
Posted by Steven Pigeon
April 7, 2009
Implementing data structures in a way that uses efficiently memory should always be on your mind. I do not mean going overboard and micro-optimizing memory allocation right down to the bit. I mean organize data structures in memory so that you can avoid pointers, so that you can use contiguous memory segments, etc. Normally, minimizing storage by avoiding extra pointers when possible will benefit your program in at least two ways.
First, the reduced memory requirement will make your data structure fit in cache more easily. Remember that if pointers are 4 bytes long in 32 bits programming, they are 8 bytes long in 64 bits environments. This yields better run time performance because you maximize your chances of having the data you need in cache.
Second, contiguous memory layouts also allow for efficient scans of data structures. For example, if you have a classical binary tree, implemented using nodes having each two pointers, you will have to use a tree traversal algorithm, possibly recursive, to enumerate the tree’s content. If you don’t really care about the order in which the nodes are visited, what’s quite cumbersome.
It turns out that for special classes of trees, complete trees, there is a contiguous, and quite simple, layout.
Read the rest of this entry »
16 Comments |
algorithms, assembly language, data structures, Mathematics, programming | Tagged: 1964, ACM, address calculation, alternative addressing, CACM, Communications of the ACM, complete binary trees, Contiguous memory layout, efficient code, heap, heap sort, Memory layout, micro-optimization, octrees, optimization, quadtrees, Williams |
Permalink
Posted by Steven Pigeon