Smallest missing integer algorithm that runs in O(n)?

Question

What algorithm might find a missing integer in O(n) time, from an array?

Say we have an array A with elements in a value-range {1,2,3...2n}. Half the elements are missing so length of A = n.

E.g:

A = [1,2,5,3,10] , n=5

Output = 4

No, i study physics. This is spare time. I found the question from a homework set online tho... — hanko
– hanko, Commented Feb 22, 2017 at 22:26
Possible duplicate of Missing integer variation - O(n) solution needed — Gassa
– Gassa, Commented Feb 22, 2017 at 22:26
@hanko How much extra storage is permitted : e.g. it is simple to have an array of flags - as shown in the answer below. But it is not obvious how to do with this only O(1) extra storage. — WestCoastProjects
– WestCoastProjects, Commented Feb 22, 2017 at 22:30
I have a solution that genuinely uses O(1) extra storage and O(N) time, without stealing bits from the array. It's a variation of quickselect. — Paul Hankin
– Paul Hankin, Commented Feb 23, 2017 at 1:27

Ted Hopp · Accepted Answer · 2017-02-23 01:05:29Z

8

The smallest missing integer must be in the range [1, ..., n+1]. So create an array of flags, all initially false, indicating the presence of that integer. Then an algorithm is:

Scan the input array, setting flags to true as you encounter values in the range. This operation is O(n). (That is, set flag[A[i]] to true for each position i in the input array, provided A[i] <= n.)
Scan the flag array for the first false flag. This operation is also O(n). The index of the first false flag is the smallest missing integer.

EDIT: O(n) time algorithm with O(1) extra space:

If A is writable and there are some extra bits available in the elements of A, then a constant-extra-space algorithm is possible. For instance, if the elements of A are signed values, and since all the numbers are positive, we can use the sign bit of the numbers in the original array as the flags, rather than creating a new flag array. So the algorithm would be:

For each position i of the original array, if abs(A[i]) < n+1, make the value at A[abs(A[i])] negative. (This assumes array indexes are based at 1. Adjust in the obvious way if you are using 0-based arrays.) Don't just negate the value, in case there are duplicate values in A.
Find the index of the first element of A that is positive. That index is the smallest missing number in A. If all positions are negative, then A must be a permutation of {1, ..., n} and hence the smallest missing number is n+1.

If the elements are unsigned, but can hold values as high as 4 n + 1, then in step 1, instead of making the element negative, add 2 n + 1 (provided the element is <= 2 n) and use (A[i] mod (2n+1)) instead of abs(A[i]). Then in step 2, find the first element < 2 n + 1 instead of the first positive element. Other such tricks are possible as well.

edited Feb 23, 2017 at 1:05

answered Feb 22, 2017 at 22:26

Ted Hopp

235k48 gold badges413 silver badges533 bronze badges

Sign up to request clarification or add additional context in comments.

13 Comments

WestCoastProjects Over a year ago

That is an obvious answer but requires O(n) extra storage. Any way to do this with o(1) extra storage (i'm not sure).

hanko Over a year ago

Would i not need to sort the array first with this method?

ruakh Over a year ago

@hanko: No, you wouldn't. What makes you think that you would?

Ted Hopp Over a year ago

@javadba - Not that I can think of, off the top of my head.

Ted Hopp Over a year ago

@AnT - Well, you can use other tricks, such as adding 2n+1 to a value in the original array in lieu of setting a flag (and using A[i] mod 2n+1 instead of abs(A[i])). Since n is variable, there's likely to be extra bits somewhere for any particular n. If not, then of course the O(1)-space algorithm can't be implemented.

|

Paul Hankin · Accepted Answer · 2017-02-23 16:56:59Z

5

You can do this in O(1) additional space, assuming that the only valid operations on the array is to read elements, and to swap pairs of elements.

First note that the specification of the problem excludes the possibility of the array containing duplicates: it contains half of the numbers from 1 to 2N.

We perform a quick-select type algorithm. Start with m=1, M=2N+1, and pivot the array on (m + M)/2. If the size of the left part of the array (elements <= (m+M)/2) is less than (m + M)/2 - m + 1, then the first missing number must be there. Otherwise, it must be in the right part of the array. Repeat on the left or right side accordingly until you find the missing number.

The size of the slice of the array under consideration halves each time and pivoting an array of size n can be done in O(n) time and O(1) space. So overall, the time complexity is 2N + N + N/2 + ... + 1 <= 4N = O(N).

edited Feb 23, 2017 at 16:56

answered Feb 23, 2017 at 1:18

Paul Hankin

58.8k11 gold badges100 silver badges126 bronze badges

6 Comments

WestCoastProjects Over a year ago

NIce answer: use the characteristics of the problem to find what appears to be an optimal solution.

Ted Hopp Over a year ago

Nice approach, but I think the algorithm needs some tweaking. Suppose n = 4 and A is [1, 2, 3, 4]. Then on the first iteration, m = 1, M = 9, and the first pivot will be on 5. This will produce the partition [1, 2, 3, 4], []. Since the left part has 4 elements, which is less than (m + M)/2 - m + 1 = 5, the first missing element is supposedly in [1, 2, 3, 4]. But it's not.

Paul Hankin Over a year ago

@TedHopp The left half should contain 5 elements (elements <= 5), but it only contains 4. You're right the left half doesn't actually contain the missing element, but it would do if it weren't missing!

Ted Hopp Over a year ago

Yeah, I guess on the first iteration, one will have to select the left part of necessity. The first pivot value is n+1 and there are only n elements in the array. Because of this, though, I don't think the complexity analysis formula is quite right (at least for the coefficient; it's still O(n)).

hanko Over a year ago

Damn, this is great. Thank you.

|

Ivan Gritsenko · Accepted Answer · 2017-02-23 17:04:52Z

An implementation of Paul Hankin's idea in C++

#include <iostream>
using namespace std;
const int MAX = 1000;
int a[MAX];
int n;

void swap(int &a, int &b) {
    int tmp = a;
    a = b;
    b = tmp;
}

// Rearranges elements of a[l..r] in such a way that first come elements 
// lower or equal to M, next come elements greater than M. Elements in each group
// come in no particular order.
// Returns an index of the first element among a[l..r] which is greater than M.
int rearrange(int l, int r, int M) {
    int i = l, j = r;
    while (i <= j)
        if (a[i] <= M) i++;
        else swap(a[i], a[j--]);
    return i;
}

int main() {
    cin >> n;
    for (int i = 0; i < n; i++) cin >> a[i];

    int L = 1, R = 2 * n;
    int l = 0, r = n - 1;
    while (L < R) {
        int M = (L + R) / 2; // pivot element
        int m = rearrange(l, r, M); 
        if (m - l == M - L + 1) 
            l = m, L = M + 1;
        else
            r = m - 1, R = M;
    }

    cout << L;
    return 0;
}

Collectives™ on Stack Overflow

Smallest missing integer algorithm that runs in O(n)?

3 Answers 3

13 Comments

6 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

13 Comments

6 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related