PermCheck at app.codility.com-programmers in C Plus Plus Explained - Check whether array A is a permutation
Foreword: Category of Article - Technology, Computers and Software, Algorithm
By: Chrysanthus Date Published: 24 Jan 2024
Task
Detected time complexity : O(N+N)
Lesson 4 at app.codility.com/programmers
The solution and its explanation are given below.
Problem
A non-empty vector A consisting of N integers is given.
A permutation is a sequence containing each element from 1 to N once, and only once.
For example, vector A such that:
A[0] = 4
A[1] = 1
A[2] = 3
A[3] = 2
is a permutation, but vector A such that:
A[0] = 4
A[1] = 1
A[2] = 3
is not a permutation, because value 2 is missing.
The goal is to check whether vector A is a permutation.
Write a function:
int solution(vector<int> &A);
that, given a vector A, returns 1 if vector A is a permutation and 0 if it is not.
For example, given vector A such that:
A[0] = 4
A[1] = 1
A[2] = 3
A[3] = 2
the function should return 1.
Given vector A such that:
A[0] = 4
A[1] = 1
A[2] = 3
the function should return 0.
Write an efficient algorithm for the following assumptions:
- N is an integer within the range [1..100,000];
- each element of vector A is an integer within the range [1..1,000,000,000].
Note:
For the permutation above, N = 4. This means that all the numbers in the vector are 1, 2, 3, 4, but in any order. The lesson for this problem is Counting Elements. The candidate should expect that the solution to this problem, needs a count vector. With this, the highest possible value in a given vector should be N. For a permutation, the number of occurrence of each value in the given vector has to be 1 (verified in the count vector).
In the non-permutation table in the question, it is true that 2 is missing, but since N, the number of elements is 3, it means that 4 is bigger than N = 3, and that is an accompanied error. Some other number, even bigger than 4 could have been in the place of 4.
Another possibility of error is that, an integer less than N can occur more than once, with some numbers missing, as in the following list:
1, 1, 1, 4
Here, 1 has occurred three times with 2 and 3 missing. The highest expected number, 4 is still there.
Strategy
The count tables for the first two vectors above are as follows:
Count | 0 | 1 | 1 | 1 | 1 |
Index | 0 | 1 | 2 | 3 | 4 |
Count | 0 | 1 | 0 | 1 | 1 |
Index | 0 | 1 | 2 | 3 | 4 |
Remember that index 0 and its value of 0 is in the two tables for convenience. Note that in the non-permutation table, the value (number of occurrence) of the index 2, is 0.
So, to know if a vector is not a permutation, check if there is an integer which is higher than N; check if any integer between 1 and N (inclusive) is missing; and check if any integer between 1 and N (inclusive) occurs more than once.
In order to know if any integer is missing and if any integer occurs more than once, a count vector is needed. Both of these requirements are just checked, whether all the values in the count vector are 1.
Checking if an integer is greater than N should be done as the given vector is scanned to produce the count vector. The given vector is scanned in N time to produce the count vector of m + 1 cells (including index 0). Another complete N scanning should not waste operations (time complexity) just to know if there is a number in the list that is greater than N, in another N time.
The given vector is scanned in N time and the count vector is scanned in M time, to make sure all the values are 1. However, since the size (length) of the count vector in this case is M = N plus 1 (for the convenience 0), the time complexity for the above algorithm may be given as O(N+N) and not O(N+M), because M is approximately N in this case. The O(N) time to create the count vector, initializing everything (element) to zero, is ignored here.
O(N+N) Time Solution
#include <iostream>
#include <vector>
using namespace std;
int solution(vector<int> &A) {
int N = A.size();
vector<int> count(N+1, 0);
for (int i=0; i<N; i++) {
if (A[i] > N)
return 0;
count[A[i]]++; //i.e. count[A[i]] = count[A[i]] + 1;
}
for (int i=1; i<=N; i++) {
if (count[i] != 1)
return 0;
}
return 1;
}
int main()
{
vector<int> P{4, 1, 3, 2};
int ret1 = solution(P);
cout << ret1 << endl;
vector<int> Q{4, 1, 3};
int ret2 = solution(Q);
cout << ret2 << endl;
return 0;
}
At app.codility.com/programmers/, the result is given as: O(N) or O(N * log(N)) detected time complexity. The O(N) time to create the count vector, initializing everything (element) to zero, is ignored there, as it has been ignored here (in this tutorial).
Conclusion
It may also happen that one or more of the integers in the given vector, is greater or much greater than N. In this case, finding its index in the count vector is impossible, because the maximum index in the count vector is M, which in this problem, is N. Any number greater than N can be spotted, while scanning the given vector to create the count vector.
It may also happen that there are N elements between 1 and N in the given vector, but some elements repeat. This is taken care of, by the fact that the solution function checks, if each value from 1 to N occurs only once, in the count vector.
The reader should register at app.codility.com/programmers, so that he/she should be testing his/her own code.
The reader may think that the above problem could have been solved by sorting the list in at least N.log(N) time, and then checking to see that all the elements from 1 to N are in the list in N time. This means O(N.log(N) + N) time. However, the best time complexity is O(N+N). N.log(N) is bigger than O(N). So O(N.log(N) + N) takes longer than O(N+N), and its solution should not be chosen.