One thing I really like about Python is how it makes manipulating lists quite easily, especially via slices and list comprehensions. List comprehensions, especially in generator notation form, are an easy way to create lists with specific content. Other functional programming languages (such as Mathematica, amongst others) have taught me to use “map” to transform lists. But is it always the clearest and fastest way?
The basic list comprehension to generate a list of integers looks like:
[i for i in xrange(1000)]
where xrange(n) is an iterator that does not create a list as would range(n). (The range(n) expression creates an explicit list [0,1,...,n] and therefore using potentially tons of memory, while the xrange(n) merely create an “iterable” (something of a pair composed of an iterator and an iterator limit).) So the above creates a list. Let’s make it large:
fat_list=[i for i in xrange(1000000)]
Let us say that, later on, we need to apply an operation on it, for example negate its elements. The most pythonèsque way to this is:
a_list=[-x for x in fat_list]
Note that fat_list might have been modified at some point in the past, but it still has 1000000 elements. Hmm, so what if we used map? With a lambda, it becomes:
or maybe an external function?
def invert(x): return -x ... c_list=map(invert,fat_list)
or a call to invert from a list comprehension?
d_list=[ invert(x) for x in fat_list]
Which one is the most legible? Probably the [-x for x in fat_list] version. But which is the fastest? Let us run each test 100 times to drown-out OS noise. With Python 2.7.1, on my AMD box, we get:
“Create” represents the time needed to just create the list, in seconds. “Negate” is the simple list comprehension version. “Lambda” is the map version using a lambda. “Map+Call” is the map calling the invert function. Lastly, “[Call]” is the version of the list comprehension that calls invert.
If we factor out list creation time, we get:
The simple list comprehension is much faster than the other variants, without dispute. The lambda version and the map calling invert are much slower, but compared to each other, with the calling version just marginally slower. But the big surprise comes from the list comprehension calling invert: it is much slower.
I am not sure if the results are following my intuition or go against it. The simplest expression is the fasted, as expected, but in some measure, you’d expect the map and list comprehension method calling the function would map to essentially the same interpreter code, but for some reason, they do not. The list comprehension calling a function is approximately 25% slower.
I would like to say that the morale of the story is that you should always prefer legibility over execution speed, but my experience tells me that it’s a delicate trade-off. In this quite simple experiment, the naïve list comprehension beats the other methods flat; but it is also clear that if you’re to apply a (non-lambda) function to a list, map is better than a list comprehension (and for some reason, I’d rather have this function called apply than map).