Why is the size 127 (prime) better than 128 for a hash-table?

All numbers (when hashed) are still going to be the p lowest-order bits of k for 127 too.

That is wrong (or I misunderstood the claim). k % 127 depends on all bits of k, whereas k % 128 depends only on the 7 lowest-order bits.
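To make that concrete, here is a minimal sketch. The helper names `bucket_128` and `bucket_127` are made up for illustration, and the keys are assumed to be non-negative integers. It takes keys that share the same 7 low-order bits but differ in their high bits and shows where each table size puts them.

```python
def bucket_128(k: int) -> int:
    # For non-negative k this is the same as k & 0x7F: only the 7 low bits survive.
    return k % 128

def bucket_127(k: int) -> int:
    # 128 leaves remainder 1 when divided by 127, so high bits still shift the result.
    return k % 127

# Keys whose 7 low-order bits are all equal to 5, with different high bits.
keys = [5 + 128 * i for i in range(6)]  # 5, 133, 261, 389, 517, 645

print([bucket_128(k) for k in keys])  # [5, 5, 5, 5, 5, 5]
print([bucket_127(k) for k in keys])  # [5, 6, 7, 8, 9, 10]
```

With size 128 every one of these keys collides in bucket 5; with size 127 they spread out, because the high bits take part in the remainder.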


EDIT:

Suppose your keys are perfectly uniformly distributed between 1 and 10,000. Then k % 127 and k % 128 will both turn this into an excellent smaller distribution. With either size, every bucket will contain 78 or 79 items (10,000 / 128 ≈ 78.1 and 10,000 / 127 ≈ 78.7).
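A quick way to check the uniform case is to count the bucket sizes directly. This is only a sketch; `bucket_sizes` is a hypothetical helper, and the keys are simply the integers 1 through 10,000.

```python
from collections import Counter

def bucket_sizes(keys, table_size):
    # Number of keys that land in each bucket, in bucket order.
    counts = Counter(k % table_size for k in keys)
    return [counts.get(b, 0) for b in range(table_size)]

uniform_keys = range(1, 10_001)

for size in (127, 128):
    sizes = bucket_sizes(uniform_keys, size)
    print(size, min(sizes), max(sizes))
# 127 78 79
# 128 78 79
```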

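Now suppose your keys between 1 and 10,000 are biased, because the multiples {x, 2x, 3x, ..} of some x occur more often. If x shares a factor with the table size, those multiples pile up in a fraction of the buckets. A prime size shares no factor with x, so it will give a much, much better distribution, as explained in this answer. (Unless x is a multiple of that prime size.)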
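The biased case can be sketched the same way. The choice of x = 64 is just an assumed example of a stride that shares a factor with 128, and x = 127 is the unlucky case for the prime size; `multiples` and `used_buckets` are hypothetical helpers.

```python
def multiples(x, limit=10_000):
    # The biased keys: x, 2x, 3x, ... up to limit.
    return range(x, limit + 1, x)

def used_buckets(keys, table_size):
    # How many distinct buckets the keys actually occupy.
    return len({k % table_size for k in keys})

for x in (64, 127):
    keys = multiples(x)
    print(x, len(keys), used_buckets(keys, 128), used_buckets(keys, 127))
# 64 156 2 127
# 127 78 78 1
```

With x = 64, the 156 keys squeeze into only 2 of the 128 buckets but cover all 127 buckets of the prime-sized table. With x = 127 the situation flips: every key lands in bucket 0 of the prime-sized table.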

Thus, cutting off the high bits (using a size of 128) is no problem whatsoever if the distribution in the lower bits is good enough. But with real data and real, badly designed hash functions, you will need those high bits.
