The problem is that people are really sloppy with terminology. There are 3 important but distinct classes here:
O(1) worst-case
This is simple – all operations take no more than a constant amount of time in the worst case, and therefore in all cases. Accessing an element of an array is O(1)
worst-case.
O(1) amortized worst-case
Amortized means that not every operation is O(1)
in the worst case, but for any sequence of N operations, the total cost of the sequence is no O(N)
in the worst case. This means that even though we can’t bound the cost of any single operation by a constant, there will always be enough “quick” operations to make up for the “slow” operations such that the running time of the sequence of operations is linear in the number of operations.
For example, the standard Dynamic Array which doubles its capacity when it fills up requires O(1)
amortized time to insert an element at the end, even though some insertions require O(N)
time – there are always enough O(1)
insertions that inserting N items always takes O(N)
time total.
O(1) average-case
This one is the trickiest. There are two possible definitions of average-case: one for randomized algorithms with fixed inputs, and one for deterministic algorithms with randomized inputs.
For randomized algorithms with fixed inputs, we can calculate the average-case running time for any given input by analyzing the algorithm and determining the probability distribution of all possible running times and taking the average over that distribution (depending on the algorithm, this may or may not be possible due to the Halting Problem).
In the other case, we need a probability distribution over the inputs. For example, if we were to measure a sorting algorithm, one such probability distribution would be the distribution that has all N! possible permutations of the input equally likely. Then, the average-case running time is the average running time over all possible inputs, weighted by the probability of each input.
Since the subject of this question is hash tables, which are deterministic, I’m going to focus on the second definition of average-case. Now, we can’t always determine the probability distribution of the inputs because, well, we could be hashing just about anything, and those items could be coming from a user typing them in or from a file system. Therefore, when talking about hash tables, most people just assume that the inputs are well-behaved and the hash function is well behaved such that the hash value of any input is essentially randomly distributed uniformly over the range of possible hash values.
Take a moment and let that last point sink in – the O(1)
average-case performance for hash tables comes from assuming all hash values are uniformly distributed. If this assumption is violated (which it usually isn’t, but it certainly can and does happen), the running time is no longer O(1)
on average.
See also Denial of Service by Algorithmic Complexity. In this paper, the authors discuss how they exploited some weaknesses in the default hash functions used by two versions of Perl to generate large numbers of strings with hash collisions. Armed with this list of strings, they generated a denial-of-service attack on some webservers by feeding them these strings that resulted in the worst-case O(N)
behavior in the hash tables used by the webservers.