Your list
of tuple
s adds an extra layer. You have 3 layers of items:
- The outer list of length 1 million, so 1 million pointers
- 1 million 2-slot tuples, so 2 million pointers
- 2 million references to 1 million integer values
- 1 million 2-slot tuples, so 2 million pointers
while your dict
only holds:
- The dict (including 1 million cached hashes) with 2 million pointers + extra space to grow the table
- 2 million references to 1 million integer values
It’s those 1 million tuples plus the list to hold the references to them that take up more memory than the 1 million cached hashes. There are some 50% more pointers involved here, easily accounting for the 50% more memory use you see.
There is another downside to your list of tuples: lookup time. To find a matching key in the dict, there is a O(1) complexity cost. To do the same in the list of tuples, you have to potentially scan the whole list for a O(n) cost. Don’t use a list of tuples if you need to map keys to values.