Data Structures – Randomized Queues

Question

In your array implementation, your Query 1.1 seems to be the best way to do things. The only other way to remove a random element would be to move everything up to fill its spot. So if you had [1,2,3,4,5] and you removed 2, your code would move items 3, 4, and 5 up and you’d decrease the count. That will take, on average n/2 item moves for every removal. So removal is O(n). Bad.

If you won’t be adding and removing items while iterating, then just use a Fisher-Yates shuffle on the existing array, and start returning items from front to back. There’s no reason to make a copy. It really depends on your usage pattern. If you envision adding and removing items from the queue while you’re iterating, then things get wonky if you don’t make a copy.

With the linked list approach, the random dequeue operation is difficult to implement efficiently because in order to get to a random item, you have to traverse the list from the front. So if you have 100 items in the queue and you want to remove the 85th item, you’ll have to start at the front and follow 85 links before you get to the one you want to remove. Since you’re using a double-linked list, you could potentially cut that time in half by counting backwards from the end when the item to be removed is beyond the halfway point, but it’s still horribly inefficient when the number of items in your queue is large. Imagine removing the 500,000th item from a queue of one million items.

For the random iterator, you can shuffle the linked list in-place before you start iterating. That takes O(n log n) time, but just O(1) extra space. Again, you have the problem of iterating at the same time you’re adding or removing. If you want that ability, then you need to make a copy.

Leave a Comment Cancel reply