Smallest number that cannot be formed from sum of numbers from array

There’s a beautiful algorithm for solving this problem in time O(n + Sort), where Sort is the amount of time required to sort the input array.

The idea behind the algorithm is to sort the array and then ask the following question: what is the smallest positive integer you cannot make using the first k elements of the array? You then scan forward through the array from left to right, updating your answer to this question, until you find the smallest number you can’t make.

Here’s how it works. Initially, the smallest number you can’t make is 1. Then, going from left to right, do the following:

  • If the current number is bigger than the smallest number you can’t make so far, then you know the smallest number you can’t make – it’s the one you’ve got recorded, and you’re done.
  • Otherwise, the current number is less than or equal to the smallest number you can’t make. The claim is that you can indeed make this number. Right now, you know the smallest number you can’t make with the first k elements of the array (call it candidate) and are now looking at value A[k]. The number candidate - A[k] therefore must be some number that you can indeed make with the first k elements of the array, since otherwise candidate - A[k] would be a smaller number than the smallest number you allegedly can’t make with the first k numbers in the array. Moreover, you can make any number in the range candidate to candidate + A[k], inclusive, because you can start with any number in the range from 1 to A[k], inclusive, and then add candidate - 1 to it. Therefore, set candidate to candidate + A[k] and increment k.

In pseudocode:

Sort(A)
candidate = 1
for i from 1 to length(A):
   if A[i] > candidate: return candidate
   else: candidate = candidate + A[i]
return candidate

Here’s a test run on [4, 13, 2, 1, 3]. Sort the array to get [1, 2, 3, 4, 13]. Then, set candidate to 1. We then do the following:

  • A[1] = 1, candidate = 1:
    • A[1] ≤ candidate, so set candidate = candidate + A[1] = 2
  • A[2] = 2, candidate = 2:
    • A[2] ≤ candidate, so set candidate = candidate + A[2] = 4
  • A[3] = 3, candidate = 4:
    • A[3] ≤ candidate, so set candidate = candidate + A[3] = 7
  • A[4] = 4, candidate = 7:
    • A[4] ≤ candidate, so set candidate = candidate + A[4] = 11
  • A[5] = 13, candidate = 11:
    • A[5] > candidate, so return candidate (11).

So the answer is 11.

The runtime here is O(n + Sort) because outside of sorting, the runtime is O(n). You can clearly sort in O(n log n) time using heapsort, and if you know some upper bound on the numbers you can sort in time O(n log U) (where U is the maximum possible number) by using radix sort. If U is a fixed constant, (say, 109), then radix sort runs in time O(n) and this entire algorithm then runs in time O(n) as well.

Hope this helps!

Leave a Comment

tech