[This is a piece I initially wrote while at the LISA at U de M, for the newbie coders in the lab.]
One of the major challenges in writing (somewhat) large-scale Python programs is to keep memory usage at a minimum. However, managing memory in Python is easy—if you just don’t care. Python allocates memory transparently, manages objects using a reference count system, and frees memory when an object’s reference count falls to zero. In theory, it’s swell. In practice, you need to know a few things about Python memory management to get a memory-efficient program running. One of the things you should know, or at least get a good feel about, is the sizes of basic Python objects. Another thing is how Python manages its memory internally.
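As a quick illustration of the reference-counting behaviour, here is a minimal sketch (note that sys.getrefcount reports one extra reference, held temporarily by its own argument):

import sys

a = []                      # a fresh list object, referenced by the name 'a'
print sys.getrefcount(a)    # 2: the name 'a' plus getrefcount's temporary argument
b = a                       # a second name bound to the same object
print sys.getrefcount(a)    # 3
del b                       # the object survives: 'a' still refers to it
del a                       # the count falls to zero and the memory is freed at once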
So let us begin with the size of basic objects. In Python, there are not a lot of primitive data types: there are ints, longs (an unlimited precision version of int), floats (which are doubles), tuples, strings, lists, dictionaries, and classes.
What is the size of int? A programmer with a C or C++ background will probably guess that the size of a machine-specific int is something like 32 bits, maybe 64; and that therefore it occupies at most 8 bytes. But is that so?
Let us first write a function that shows the sizes of objects (recursively if necessary):
import sys

def show_sizeof(x, level=0):
    # print the class and (shallow) size of x, indented one level per depth of recursion
    print "\t" * level, x.__class__, sys.getsizeof(x), x

    # recurse into containers: dict-like objects through their items(), other iterables directly
    if hasattr(x, '__iter__'):
        if hasattr(x, 'items'):
            for xx in x.items():
                show_sizeof(xx, level + 1)
        else:
            for xx in x:
                show_sizeof(xx, level + 1)
We can now use the function to inspect the sizes of the different basic data types:
show_sizeof(None)
show_sizeof(3)
show_sizeof(2**63)
show_sizeof(102947298469128649161972364837164)
show_sizeof(918659326943756134897561304875610348756384756193485761304875613948576297485698417)
If you have a 32-bits 2.7.x Python, you’ll see:
<type 'NoneType'> 8 None
<type 'int'> 12 3
<type 'long'> 22 9223372036854775808
<type 'long'> 28 102947298469128649161972364837164
<type 'long'> 48 918659326943756134897561304875610348756384756193485761304875613948576297485698417
and if you have a 64-bits 2.7.x Python, you’ll see:
<type 'NoneType'> 16 None
<type 'int'> 24 3
<type 'long'> 36 9223372036854775808
<type 'long'> 40 102947298469128649161972364837164
<type 'long'> 60 918659326943756134897561304875610348756384756193485761304875613948576297485698417
Let us focus on the 64-bits version (mainly because that’s what we need the most often in our case). None takes 16 bytes. An int takes 24 bytes, three times as much memory as a C int64_t, despite being some kind of “machine-friendly” integer. Long integers (unbounded precision), used to represent integers larger than 2^63 - 1, have a minimum size of 36 bytes. The size then grows linearly in the logarithm of the integer represented.
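A quick way to see this growth (the exact figures depend on your build, since CPython 2.7 stores longs in 15- or 30-bit digits):

import sys

# the reported size grows by a fixed number of bytes each time the value
# needs another internal digit, i.e. linearly in the logarithm of the value
for exponent in (64, 128, 256, 512, 1024):
    print exponent, sys.getsizeof(2 ** exponent)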
Python’s floats are implementation-specific but seem to be C doubles. However, they do not eat up only 8 bytes:
show_sizeof(3.14159265358979323846264338327950288)
Outputs
<type 'float'> 16 3.14159265359
on a 32-bits platform and
<type 'float'> 24 3.14159265359
on a 64-bits platform. That’s again three times the size a C programmer would expect. Now, what about strings?
show_sizeof("") show_sizeof("My hovercraft is full of eels")
outputs, on a 32-bits platform:
<type 'str'> 21
<type 'str'> 50 My hovercraft is full of eels
and, on a 64-bits platform:
<type 'str'> 37
<type 'str'> 66 My hovercraft is full of eels
An empty string costs 37 bytes in a 64-bits environment! The memory used by a string then grows linearly with the length of the (useful) string.
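A small check of that claim (on a 64-bits CPython 2.7, each extra character should add one byte on top of the 37-byte header):

import sys

for length in (0, 1, 10, 100, 1000):
    print length, sys.getsizeof("x" * length)   # expect 37 + length on a 64-bits build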
*
* *
Other commonly used structures, namely tuples, lists, and dictionaries, are worth examining. Lists (which are implemented as array lists, not as linked lists, with everything that entails) are arrays of references to Python objects, which allows them to be heterogeneous. Let us look at their sizes:
show_sizeof([])
show_sizeof([4, "toaster", 230.1])
outputs
<type 'list'> 32 []
<type 'list'> 44 [4, 'toaster', 230.1]
on a 32-bits platform and
<type 'list'> 72 []
<type 'list'> 96 [4, 'toaster', 230.1]
on a 64-bits platform. An empty list eats up 72 bytes. The size of an empty, 64-bits C++ std::list() is only 16 bytes, 4 to 5 times less. What about tuples? And dictionaries?
show_sizeof({})
show_sizeof({'a': 213, 'b': 2131})
outputs, on a 32-bits box
<type 'dict'> 136 {}
<type 'dict'> 136 {'a': 213, 'b': 2131}
    <type 'tuple'> 32 ('a', 213)
        <type 'str'> 22 a
        <type 'int'> 12 213
    <type 'tuple'> 32 ('b', 2131)
        <type 'str'> 22 b
        <type 'int'> 12 2131
and
<type 'dict'> 280 {}
<type 'dict'> 280 {'a': 213, 'b': 2131}
    <type 'tuple'> 72 ('a', 213)
        <type 'str'> 38 a
        <type 'int'> 24 213
    <type 'tuple'> 72 ('b', 2131)
        <type 'str'> 38 b
        <type 'int'> 24 2131
for a 64-bits box.
This last example is particularly interesting because it “doesn’t add up.” If we look at individual tuples, they take 72 bytes (while their components take 38+24=62 bytes, leaving 10 bytes for the tuple itself), but the dictionary takes 280 bytes (rather than a strict minimum of 144=72×2 bytes). The dictionary is supposed to be an efficient data structure for search, and the two likely implementations will use more space than strictly necessary. If it’s some kind of tree, then we should pay the cost of internal nodes that contain a key and two pointers to children nodes; if it’s a hash table, then we must have some room with free entries to ensure good performance.
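A small experiment makes this over-allocation visible: the reported size of a dict jumps in steps as its internal table is resized to keep enough free slots (the exact numbers vary between builds):

import sys

d = {}
last = sys.getsizeof(d)
for i in range(100):
    d[i] = None
    size = sys.getsizeof(d)
    if size != last:                 # print only when the table was resized
        print "%3d entries -> %5d bytes" % (len(d), size)
        last = size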
The (somewhat) equivalent std::map C++ structure takes 48 bytes when created (that is, empty). An empty C++ string takes 8 bytes (the allocated size then grows linearly with the length of the string). An integer takes 32 bits.
*
* *
Why does all this matter? It seems that whether an empty string takes 8 bytes or 37 doesn’t change anything much. That’s true. That’s true until you need to scale. Then, you need to be really careful about how many objects you create to limit the quantity of memory your program uses. It is a problem in real-life applications.
However, to devise a really good strategy about memory management, we mustn’t only consider the sizes of objects, but also how many of them are created, and in which order. This turns out to be very important for Python programs. One key element to understand is how Python allocates its memory internally, which we will discuss next week.
To be continued…
You do realize sys.getsizeof doesn’t descend into what the object contains, right? When you say “If we look at individual tuples, they take 72 bytes (while their components take 38+24=62 bytes, leaving 10 bytes for the tuple itself)”, you shouldn’t expect them to add up. The 72 bytes is the size of a tuple holding two python objects. The 38 and 24 are the size of the two objects it contains.
Try this example:
In [2]: t = ("The quick brown fox jumps over the lazy dog", 2)
In [3]: sys.getsizeof(t)
Out[3]: 72
In [4]: sys.getsizeof(t[0])
Out[4]: 80
In [5]: sys.getsizeof(t[1])
Out[5]: 24
By your logic, would you say python is “leaving -32 bytes for the tuple itself”? Of course not.
The same logic holds for the dictionary. Even an empty dictionary takes 280 bytes. Only the number of entries matters, not their sizes.
It seems that memory used for the index structures of a dictionary in Python 2.7 is not made available again if you delete most of the records (key/value pairs), except when you call the clear() method on the dict or remove the last reference to the dictionary itself.
I had a program looping 100000 times, creating a dictionary each time, adding 1000 key/value pairs to it, and then removing about 975 of them in an inner loop.
The memory usage reached about 5GB at the end of the outer loop, but only about 500MB if I ended the outer loop with a three-step procedure that copied the dict into a temporary dict, did clear() on the inner dict, and then copied the still valid key/value pairs back from the copy.
The outer loop appended the inner dicts to a list, resulting in a list with 100000 dictionaries.
I think that it is not OK/optimal to consume so much more memory in total by not trying to automatically release some memory from dict index structures.
Maybe this is a default behaviour that I can configure away, to make the interpreter a little more memory-efficient and a little less CPU-efficient?
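A minimal sketch of the rebuild trick described above (names are illustrative, not from the original program): clear() drops the oversized internal table, and update() reinserts the surviving pairs into a freshly sized one.

def compact(d):
    # rebuild the dictionary in place so its internal table shrinks
    # back to fit the entries that are still valid
    survivors = dict(d)     # temporary copy of the remaining key/value pairs
    d.clear()               # releases the oversized index structure
    d.update(survivors)     # reinsert into a freshly sized table
    return d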
Evan, you should mention that array.array can be used to store integers and floats with minimal overhead, packed in memory just like C arrays, using just the necessary bytes per item. Anyone handling millions of numbers in Python should be using array.array or — even better — NumPy.
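A rough comparison of the two approaches, assuming a 64-bits CPython 2.7 (keep in mind that sys.getsizeof on a list counts only its pointer array, not the int objects it refers to):

import sys
from array import array

numbers = range(100000)              # Python 2: a plain list of 100000 ints

as_list = list(numbers)
as_array = array('i', numbers)       # one packed buffer of 4-byte C ints

print "list object         :", sys.getsizeof(as_list)               # the pointer array only
print "  + the int objects :", sum(sys.getsizeof(n) for n in as_list)
print "array.array payload :", as_array.itemsize * len(as_array)    # 4 bytes per item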