why does it get 12 more bytes for 2⁶³ compared to 2⁶³ − 1, and not just one?
On an LP64 system¹, a Python 2 int consists of exactly three pointer-sized pieces:
- type pointer
- reference count
- actual value, a C long int
That’s 24 bytes in total. On the other hand, a Python long consists of:
- type pointer
- reference count
- digit count, a pointer-sized integer
- inline array of value digits, each holding 30 bits of value, but stored in 32-bit units (one of the unused bits gets used for efficient carry/borrow during addition and subtraction)
2**63 requires 64 bits to store, so it fits in three 30-bit digits. Since each digit is stored in 4 bytes, the whole Python long takes 24 + 3*4 = 36 bytes.
In other words, the difference comes from the long having to store the size of the number separately (8 additional bytes) and from it being slightly less space-efficient at storing the value (12 bytes for the three digits of 2**63). Counting the size field, the value 2**63 stored in a long occupies 20 bytes; comparing that to the 8 bytes occupied by the value of the simple int yields the observed 12-byte difference.
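The accounting above can be sketched numerically. This is only an illustration of the arithmetic, assuming an LP64 platform (8-byte pointers) and CPython's default 30-bit digits stored in 4-byte units; `long_size` is a hypothetical helper written for this sketch, not a CPython API:

```python
import ctypes

PTR = ctypes.sizeof(ctypes.c_void_p)  # 8 on LP64

INT_SIZE = 3 * PTR     # type pointer + refcount + C long value = 24 bytes
LONG_HEADER = 3 * PTR  # type pointer + refcount + digit count = 24 bytes

def long_size(value, digit_bits=30, digit_bytes=4):
    """Bytes a Python 2 long would occupy for a positive value (sketch)."""
    ndigits = -(-value.bit_length() // digit_bits)  # ceiling division
    return LONG_HEADER + ndigits * digit_bytes

print(INT_SIZE)          # 24 on LP64
print(long_size(2**63))  # 24 + 3*4 = 36 on LP64
```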
It is worth noting that Python 3 has only one integer type, called int, which is variable-width and implemented the same way as the Python 2 long.
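This can be observed directly in Python 3 with sys.getsizeof: the reported size grows as more 30-bit digits are needed. The exact byte counts vary between CPython versions and platforms, so only the growth pattern is meaningful here:

```python
import sys

# Each extra 30-bit digit adds a few bytes; the exact totals depend on
# the CPython version and platform.
for bits in (1, 30, 60, 90, 120):
    n = 1 << bits  # a number with bit_length() == bits + 1
    print(n.bit_length(), sys.getsizeof(n))
```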
¹ 64-bit Windows differs in that it retains a 32-bit long int, presumably for source compatibility with a large body of older code that used char, short, and long as “convenient” aliases for 8-, 16-, and 32-bit values, which happened to work on both 16- and 32-bit systems. To get an actual 64-bit type on x86-64 Windows, one must use __int64 or (on newer compiler versions) long long or int64_t. Since Python 2 internally depends on the Python int fitting into a C long in various places, sys.maxint remains 2**31 - 1 even on 64-bit Windows. This quirk is also fixed in Python 3, which has no concept of maxint.
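The LP64-versus-LLP64 difference can be checked from Python itself with ctypes; a small sketch (the widths noted in the comments assume mainstream platforms):

```python
import ctypes

# C long: 8 bytes on LP64 Unix-like systems, but 4 bytes on 64-bit
# Windows (LLP64).
print(ctypes.sizeof(ctypes.c_long))

# C long long: 8 bytes on all mainstream platforms, including Windows.
print(ctypes.sizeof(ctypes.c_longlong))
```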