In the course of normal operations Python creates and destroys a lot of small tuples, so Python keeps an internal cache of small tuples to cut down on memory allocation and deallocation churn. For the same reason, small integers from -5 to 256 are interned (made into singletons).
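You can observe the small-integer cache from C with a minimal embedding sketch like the one below; it only uses the public C API (`Py_Initialize`, `PyLong_FromLong`), has to be compiled and linked against the Python development headers, and the exact cached range is a CPython implementation detail rather than a language guarantee:

```c
/* Hedged sketch: a tiny embedding program, not CPython source. It uses the
 * public C API to show that two PyLong objects created from the same small
 * value are the same object, while large values usually are not. This is a
 * CPython implementation detail, not a language guarantee. */
#define PY_SSIZE_T_CLEAN
#include <Python.h>

int main(void)
{
    Py_Initialize();

    PyObject *a = PyLong_FromLong(100);     /* inside the cached range */
    PyObject *b = PyLong_FromLong(100);
    printf("100 shared:    %s\n", a == b ? "yes" : "no");  /* yes on CPython */

    PyObject *c = PyLong_FromLong(100000);  /* far outside the cached range */
    PyObject *d = PyLong_FromLong(100000);
    printf("100000 shared: %s\n", c == d ? "yes" : "no");  /* typically no */

    Py_DECREF(a); Py_DECREF(b); Py_DECREF(c); Py_DECREF(d);
    Py_FinalizeEx();
    return 0;
}
```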
The `PyTuple_MAXSAVESIZE` definition controls the maximum size of tuples that qualify for this optimization, and the `PyTuple_MAXFREELIST` definition controls how many of these tuples Python keeps around in memory. When a tuple of length < `PyTuple_MAXSAVESIZE` is discarded, it is added to the free list if there is still room for one (this happens in `tupledealloc`), to be re-used when Python creates a new small tuple (in `PyTuple_New`).
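The effect is easy to see from C as well; the following embedding sketch discards a small tuple and then asks for another one of the same length. The address comparison is purely illustrative: on a typical CPython build the second tuple usually comes straight off the free list, but nothing guarantees it:

```c
/* Hedged sketch: another embedding program, not CPython source. It discards
 * a length-2 tuple and immediately creates a new one of the same length; on
 * a typical CPython build the new tuple is popped off the free list, so the
 * two addresses match. Do not rely on this -- it is an optimization detail. */
#define PY_SSIZE_T_CLEAN
#include <Python.h>

int main(void)
{
    Py_Initialize();

    PyObject *t = PyTuple_New(2);                /* fresh length-2 tuple */
    PyTuple_SET_ITEM(t, 0, PyLong_FromLong(1));  /* SET_ITEM steals the refs */
    PyTuple_SET_ITEM(t, 1, PyLong_FromLong(2));
    void *old_addr = (void *)t;
    Py_DECREF(t);                                /* refcount hits 0 -> tupledealloc */

    PyObject *u = PyTuple_New(2);                /* usually comes from the free list */
    printf("address reused: %s\n", (void *)u == old_addr ? "yes" : "no");
    Py_DECREF(u);

    Py_FinalizeEx();
    return 0;
}
```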
Python is being a little clever about how it stores these: for each tuple length > 0, it reuses the first element slot of each cached tuple to chain up to `PyTuple_MAXFREELIST` tuples together into a linked list. So each entry in the `free_list` array is a linked list of Python tuple objects, and all tuples in such a linked list are of the same size. The only exception is the empty tuple (length 0); only one of these is ever needed, so it is a singleton.
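To make that chaining concrete, here is a self-contained toy model of the same idea. The names (`free_list`, `numfree`, the two constants) mirror the CPython ones only for readability; the struct and functions are invented for illustration, and the real tuple code additionally deals with reference counting, GC tracking and the empty-tuple singleton:

```c
/* Hedged sketch: a toy model of per-length free lists chained through the
 * first item slot of each dead "tuple". Not CPython's actual code. */
#include <stdio.h>
#include <stdlib.h>

#define MAXSAVESIZE 20      /* cache "tuples" of length 1..19             */
#define MAXFREELIST 2000    /* keep at most this many per length          */

typedef struct toy_tuple {
    size_t len;
    void *items[1];         /* variable-length item array, C89 style      */
} toy_tuple;

static toy_tuple *free_list[MAXSAVESIZE];
static int numfree[MAXSAVESIZE];

/* Rough analogue of PyTuple_New: pop from the per-length list if possible. */
static toy_tuple *tuple_new(size_t len)
{
    toy_tuple *t;
    if (len > 0 && len < MAXSAVESIZE && free_list[len] != NULL) {
        t = free_list[len];
        free_list[len] = (toy_tuple *)t->items[0];   /* unlink from the chain */
        numfree[len]--;
    }
    else {
        t = malloc(sizeof(toy_tuple) + (len ? len - 1 : 0) * sizeof(void *));
        if (t == NULL)
            return NULL;
    }
    t->len = len;
    return t;
}

/* Rough analogue of tupledealloc: push onto the per-length list if room. */
static void tuple_dealloc(toy_tuple *t)
{
    size_t len = t->len;
    if (len > 0 && len < MAXSAVESIZE && numfree[len] < MAXFREELIST) {
        t->items[0] = free_list[len];                /* chain through item 0 */
        free_list[len] = t;
        numfree[len]++;
        return;
    }
    free(t);
}

int main(void)
{
    toy_tuple *a = tuple_new(3);
    tuple_dealloc(a);                 /* goes onto free_list[3] */
    toy_tuple *b = tuple_new(3);      /* same memory comes straight back */
    printf("reused: %s\n", a == b ? "yes" : "no");
    free(b);                          /* toy cleanup only */
    return 0;
}
```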
So, yes, for tuples longer than `PyTuple_MAXSAVESIZE` Python always has to allocate memory separately for a new C structure, and that could affect performance if you create and discard such tuples a lot.
If you want to understand Python C internals, I do recommend you study the Python C API; it’ll make it easier to understand the various structures Python uses to define objects, functions and methods in C.