Python memory consumption: dict VS list of tuples

Your list of tuples adds an extra layer. You have 3 layers of items:

  • The outer list of length 1 million, so 1 million pointers
    • 1 million 2-slot tuples, so 2 million pointers
      • 2 million references to 1 million integer values

while your dict only holds:

  • The dict (including 1 million cached hashes) with 2 million pointers + extra space to grow the table
    • 2 million references to 1 million integer values

It’s those 1 million tuples plus the list to hold the references to them that take up more memory than the 1 million cached hashes. There are some 50% more pointers involved here, easily accounting for the 50% more memory use you see.

There is another downside to your list of tuples: lookup time. To find a matching key in the dict, there is a O(1) complexity cost. To do the same in the list of tuples, you have to potentially scan the whole list for a O(n) cost. Don’t use a list of tuples if you need to map keys to values.

Leave a Comment

tech