Couples and Single

A vector of integers is given. All values are in the range [1..100,000,000] and come in couples, except one that is single. Return it.

A C++11 GoogleTest test case to clarify the problem:
TEST(CouplesAndSingle, Simple)
{
    ASSERT_EQ(42, solution({ 1, 42, 1 }));
}
My solution uses a hash table where I store each new value I see in the input. If I see it a second time, I remove it from the table. In the end, I should have just one element left in the hash table: the single value. It is not required, but I decided to return 0 in case of unexpected input.

Here is how I have implemented it:
int solution(const std::vector<int>& input)
{
    std::unordered_set<int> buffer; // 1
    for(int value : input) // 2
    {
        if(value < 1 || value > 100000000) // 3
            return 0;

        auto it = buffer.find(value); // 4
        if(it == buffer.end())
            buffer.insert(value);
        else
            buffer.erase(it);
    }
    return (buffer.size() != 1) ? 0 : *buffer.begin(); // 5
}
1. The STL unordered_set implements a hash table in C++11.
2. For each element in input.
3. Paranoid check.
4. If the current value is not in buffer, insert it. Otherwise remove it.
5. The single value is the first (and unique) element in buffer.
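As said above, unexpected input gets a zero. A couple of extra checks of mine, not part of the original post, could look like this:
TEST(CouplesAndSingle, Unexpected)
{
    EXPECT_EQ(0, solution({ 1, 0, 1 }));      // 0 is out of the allowed range
    EXPECT_EQ(0, solution({ 1, 42, 1, 42 })); // no single value at all
}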

Go to the full post

Codility Peaks

Given a vector of integers, we want to split it into the maximum number of same-sized blocks, such that each block contains at least one peak (aka local maximum).

The problem, part of the Codility Prime and composite numbers section, is not complicated; however, I needed to spend some time thinking it over to understand what its point really was. As usual, I found it helpful to write test cases, here are a few of them:
TEST(Peaks, Given) // 1
{
    std::vector<int> input { 1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2 };
    ASSERT_EQ(3, solution(input));
}

TEST(Peaks, ThreeSpikes) // 2
{
    std::vector<int> input { 0, 1, 0, 0, 1, 0, 0, 1, 0 };
    ASSERT_EQ(3, solution(input));
}

TEST(Peaks, ElevenSaw) // 3
{
    std::vector<int> input { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0 };
    EXPECT_EQ(1, solution(input));
}

TEST(Peaks, TwelveSaw) // 4
{
    std::vector<int> input { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
    EXPECT_EQ(4, solution(input));
}

TEST(Peaks, NoPeak) // 5
{
    std::vector<int> input(10);
    std::iota(input.begin(), input.end(), 1);
    ASSERT_EQ(0, solution(input));
}

TEST(Peaks, MaxPlain) // 6
{
    std::vector<int> input(100000);
    for(unsigned i = 1; i < input.size(); i += 2)
        input[i] = 1;
    ASSERT_EQ(25000, solution(input));
}
1. The Codility-provided test case. Here the smallest block size we can choose so that every same-sized block contains a peak is four, leading to a solution of 3.
2. Very plain case. Three blocks sized three are the optimal solution.
3. The input sequence is sized 11, a prime number, so we can't split it into equal blocks of any useful size; we just have to check whether there is at least one peak, to decide whether to return 0 or 1.
4. Twelve can be split into blocks sized 2, 3, 4, or 6; here blocks of size three work, giving 4 as a result.
5. I have used iota() to generate a monotonically increasing sequence. No peak in it.
6. This test case is to check the algorithm performance. It should be easy to see that the minimum acceptable block size is four (two won't work because of the last interval).

Here is my C++11 solution:
int solution(std::vector<int>& input)
{
    if(input.size() < 3) // 1
        return 0;

    std::vector<unsigned> peaks(input.size() + 1); // 2
    for(unsigned i = 1; i < input.size() - 1; ++i)
    {
        if(input[i] > input[i - 1] && input[i] > input[i + 1])
            peaks[i+1] = 1;
    }
    std::partial_sum(peaks.begin(), peaks.end(), peaks.begin());

    for(unsigned size = 2; size <= input.size() / 2; ++size) // 3
    {
        if(input.size() % size) // 4
            continue;

        bool hasPeaks = true;
        for(unsigned end = size; end < peaks.size(); end += size) // 5
        {
            if(peaks[end - size] == peaks[end])
            {
                hasPeaks = false;
                break;
            }
        }
        if(hasPeaks) // 6
            return input.size() / size;
    }

    return peaks.back() ? 1 : 0; // 7
}
1. Get rid of trivial cases.
2. I am going to use partial sums to check if there is any peak in a given interval (some more detail on this STL function and its mathematical meaning in a previous post). To simplify the code, I add a spurious element at the beginning, set to zero, and shift all the partial sums one position to the right.
3. Check all the possible block sizes, from 2 up to half the size of the input sequence. It makes no sense to check size-one blocks, since the first and last elements can never be peaks, so a sequence made exclusively of peaks is impossible.
4. We want all-equal blocks, so we accept only sizes that are divisors of the input sequence size.
5. Check all the blocks to ensure each of them contains at least one peak.
6. If the current block size is good, return the number of blocks in the input sequence.
7. We couldn't find any way of splitting the sequence; return 1 if there is at least one peak, 0 otherwise.

Go to the full post

Codility MinPerimeterRectangle

Given an integer that represents the area of a rectangle, return the minimal perimeter of a rectangle having that area and integer sides.

You can find this problem, and submit your solution, in the Codility Prime and composite numbers section.

It is easy to see that if a rectangle has area 30, its integer sides could only be (1, 30), (2, 15), (3, 10), or (5, 6). The last pair gives the minimal perimeter.

Here is my C++ solution:
int solution(int input)
{
    assert(input > 0 && input < 1000000001);

    double root; // 1
    if(std::modf(std::sqrt(input), &root) == 0.0)
        return static_cast<int>(root * 4);

    int perimeter = 2 * (input + 1); // 2
    for(int i = 2; i <= static_cast<int>(root); ++i)
    {
        if(input % i)
            continue;

        int candidate = 2 * ((input / i) + i); // 3
        perimeter = std::min(perimeter, candidate);
    }

    return perimeter;
}
1. Remember that among all rectangles having the same area, the square is the one with the minimal perimeter. So, if the input area is actually a perfect square, getting the solution is immediate. I use the standard C mathematical function modf() to check if the square root of the input has a fractional part.
2. Let's start from the solution where the shortest side is 1, and then try all the other candidates up to the square root of the input.
3. Calculate this candidate solution and compare it against the one currently saved.
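No tests are shown in the post for this one; a quick check along the lines of the area-30 example above could be (my sketch, not the original code):
TEST(MinPerimeterRectangle, Thirty)
{
    ASSERT_EQ(22, solution(30)); // sides (5, 6): 2 * (5 + 6)
}

TEST(MinPerimeterRectangle, Square)
{
    ASSERT_EQ(24, solution(36)); // a perfect square, sides (6, 6)
}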

Go to the full post

Codility CountFactors

Return the number of factors for a given positive integer.

You can find this problem in the Codility Prime and composite numbers section. In the PDF associated with this block of exercises it is explained why we don't need to check all the possible factors: we can stop at the square root of the examined number.

Still, the solution there is written in Python. If you want to use a strongly typed language, you'd better pay attention to large ints, because overflow is a risk we have to take care of explicitly.

That's why, besides the test case provided with the problem, I also added an explicit check on the biggest integer that could be passed as input:
TEST(CountFactors, Given)
{
    ASSERT_EQ(8, solution(24));
}

TEST(CountFactors, Simple)
{
    EXPECT_EQ(1, solution(1));
    EXPECT_EQ(2, solution(2));
    EXPECT_EQ(2, solution(3));
    EXPECT_EQ(3, solution(4));
    EXPECT_EQ(2, solution(5));
}

TEST(CountFactors, Max)
{
    ASSERT_EQ(2, solution(std::numeric_limits<int>::max()));
}
Here is my C++ solution. Since this is a problem almost completely based on number properties, it is very simple to port this piece of code to other languages:
int solution(int input)
{
    assert(input > 0);
    if(input == 1) // 1
        return 1;

    int result = 2; // 2
    unsigned i = 2; // 3
    for(; i * i < static_cast<unsigned>(input); ++i)
        if(input % i == 0)
            result += 2;
    if(i * i == static_cast<unsigned>(input))
        ++result;

    return result;
}
1. Trivial case.
2. The algorithm is symmetric: any factor we find has a counterpart that has to be counted too. Here we count 1, which implies input itself as well.
3. Let's loop on all the other possible factors, starting from 2, up to the square root of the input value. As hinted above, I have to pay attention to possible overflow; this is why I am using an unsigned int counter instead of a plain int, as the sketch below illustrates.
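To make the overflow point concrete, here is a tiny standalone check of mine, not from the original post: squaring the counter near the square root of the largest allowed input no longer fits in a signed int, while it stays well within the unsigned range.
#include <climits>
#include <iostream>

int main()
{
    unsigned i = 46341;           // smallest value whose square exceeds INT_MAX
    std::cout << i * i << '\n';   // 2147488281, well defined as unsigned arithmetic
    std::cout << INT_MAX << '\n'; // 2147483647
    // With a plain int counter, 46341 * 46341 would be signed overflow:
    // undefined behavior, not just a wrong value.
    return 0;
}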

Go to the full post

Codility MaxSliceSum

Check the passed integer vector for the maximum sum of any possible non-empty slice in it. Be prepared for possibly negative results, too.

This is a slight variation on the well-known Kadane's algorithm; you can find the problem in the Codility Maximum slice section.

To stress the difference from the standard behavior, I have written a couple of test cases that I added to the one provided in the problem definition itself:
TEST(MaxSliceSum, Given)
{
    std::vector<int> input { 3, 2, -6, 4, 0 }; // 1
    ASSERT_EQ(5, solution(input));
}

TEST(MaxSliceSum, Minimal)
{
    std::vector<int> input { -1000000 }; // 2
    ASSERT_EQ(-1000000, solution(input));
}

TEST(MaxSliceSum, Minimal2)
{
    std::vector<int> input { -1000000, -1000000 };
    ASSERT_EQ(-1000000, solution(input));
}
1. As in the normal case, the winning slice is the one including the first two elements.
2. At least one element must be included. Here the input collection contains just one element, so the winning slice contains it.

And here is my C++11 solution:
int solution(std::vector<int>& input)
{
    assert(!input.empty()); // 1

    int result = input.front(); // 2
    int candidate = input.front();
    for(auto it = input.begin() + 1; it != input.end(); ++it) // 3
    {
        candidate = std::max(*it, candidate + *it); // 4
        result = std::max(result, candidate);
    }

    return result;
}
1. Not required by Codility, just to stress the fact that the input collection is assumed not to be empty.
2. The first candidate is a slice that includes only the first element.
3. Let's loop on all the other elements.
4. As in the standard Kadane's algorithm, but here the new candidate is the larger between the current element alone and the sum of it and the previous candidate.

Go to the full post

Codility MaxDoubleSliceSum (Kadane)

Given a vector of integers, return the maximal sum of any double slice.

See the previous post for details and test cases. There I solved the problem using an intuitive but performance-wise terrible algorithm. Here I give a linear solution based on the Kadane's algorithm.

Codility itself gives us a big hint by placing this problem in its Maximum slice section. The question is, how do we adapt Kadane's algorithm here, where we have a hole in the interval?

The solution requires us to change perspective. Instead of looking at the central element as an enemy, think of it as the pivot of the solution. This leads almost automatically to a clean, symmetric solution, where two sub-intervals go from each end of the vector toward the central pivot.

Looking at the code that I have written should make things clearer:
int solution(std::vector<int>& input)
{
    if(input.size() < 4)
        return 0;

    std::vector<int> left; // 1
    left.reserve(input.size() - 2);
    left.push_back(0); // 2
    for(auto it = input.begin() + 1; it < input.end() - 2; ++it) // 3
    {
        left.push_back(std::max(0, left.back() + *it));
    }

    std::vector<int> right; // 4
    right.reserve(input.size() - 2);
    right.push_back(0);
    for(auto it = input.rbegin() + 1; it < input.rend() - 2; ++it)
    {
        right.push_back(std::max(0, right.back() + *it));
    }

    int result = 0; // 5
    auto ir = right.rbegin();
    for(auto il = left.begin(); il != left.end(); ++il, ++ir)
    {
        int tentative = *il + *ir;
        result = std::max(result, tentative);
    }
    return result;
}
1. In the left vector I am going to store all the maximum subarray values from the first interesting position up to the last one. Each value represents the best result for the left slice, up to the current pivot position.
2. If the pivot is in the first valid position, there wouldn't be anything to its left, so the first element in the left vector would be a zero.
3. All the other left elements are calculated following the Kadane's algorithm.
4. The right vector is exactly the same as the left one, just mirrored with respect to the pivot. If the pivot were in the rightmost valid position, there would be nothing in the right slice, and a zero in the right vector. Then it is just a matter of applying Kadane moving from right to left on the input collection.
5. Now we have to combine the two partial results to get the expected solution. The left vector contains the maximum subarray values up to the pivot, the right one from the other end down to it. So we add the first element in left to the last element in right, and move step by step to check all the possible pivot positions, as in the walkthrough below.
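For the Given test case { 3, 2, 6, -1, 4, 5, -1, 2 }, my own quick walkthrough of the code above gives left = { 0, 2, 8, 7, 11, 16 } and right = { 0, 0, 5, 9, 8, 14 }. Pairing left front-to-back with right back-to-front produces the sums 14, 10, 17, 12, 11, 16, and the maximum, 17, is exactly what the test expects.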

Go to the full post

Codility MaxDoubleSliceSum (naive)

Given a vector of integers, return the maximal sum of any double slice.

The definition of a double slice is given in this Codility problem, which you can find in their Maximum slice section, dedicated to exercises that can be solved by Kadane's algorithm.

In a few words, a double slice is defined by an ordered triplet of indexes of a vector. The elements we have to consider are the ones strictly contained in the two sub-intervals. A few examples in the form of C++11 GoogleTest test cases should clarify the concept:
TEST(MaxDoubleSliceSum, Empty)
{
    std::vector<int> input { 3, 2, 6 }; // 1
    ASSERT_EQ(0, solution(input));
}

TEST(MaxDoubleSliceSum, Minimal)
{
    std::vector<int> input { 3, 2, 6, 1 }; // 2
    ASSERT_EQ(6, solution(input));
}

TEST(MaxDoubleSliceSum, Given)
{
    std::vector<int> input { 3, 2, 6, -1, 4, 5, -1, 2 }; // 3
    ASSERT_EQ(17, solution(input));
}
1. We can think of only one triplet for this test case, and it contains no elements.
2. Two possible interesting triplets: (0, 1, 3) and (0, 2, 3). The first one has an empty left sub-interval and a right one containing just the element '6'. The second one has only the left sub-interval non-empty, with the element '2' inside. The maximal sum is given by the first case.
3. This is the test case provided by Codility. The triplet (0, 3, 6) leads to the best solution.

A naive solution

In this post I provide a natural implementation that ends up being far too costly to be acceptable on a reasonably large input. I simply check all the possible triplets:
int simpleSolution(std::vector<int>& input)
{
    if(input.size() < 4) // 1
        return 0;

    int result = 0;
    for(unsigned beg = 1; beg < input.size() - 2; ++beg) // 2
    {
        for(unsigned end = beg + 2; end < input.size(); ++end) // 3
        {
            int max = std::accumulate(input.begin() + beg, input.begin() + end, 0); // 4
            int tentative = 0;
            for(unsigned i = beg; i < end; ++i) // 5
            {
                tentative = max - input[i];
                result = std::max(result, tentative); // 6
            }
        }
    }

    return result;
}
1. As we have seen in the first test case, if the container has fewer than four elements, there is no interesting double slice.
2. The first element that can be part of a double slice is the second element in the vector. Go from it up to the last useful position.
3. The possible "end" position (one past the last included element) for a triplet is in this range.
4. Sum all the elements in the [beg, end) interval.
5. Tentatively remove a single element from the interval, to represent the middle element of the triplet.
6. If a better possible solution is found, save it.

This solution works fine against the test cases proposed above. But think what would happen on a larger input:
TEST(MaxDoubleSliceSum, Hundreds)
{
    std::vector<int> input(601);
    std::iota(input.begin(), input.end(), -300);
    ASSERT_EQ(44850, solution(input));
}
A few hundred elements should be enough to perceive a substantial performance degradation.

Removing the innermost loop (5) doesn't seem complicated, but it won't help much: we would still have an O(n**2) algorithm to deal with, something like the sketch below.
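This is my own take on that simplification, not code from the original post: the best element to drop from the [beg, end) interval is its minimum, which can be tracked while the interval grows.
int simplerSolution(std::vector<int>& input)
{
    if(input.size() < 4) // same trivial-case guard as in point 1 above
        return 0;

    int result = 0;
    for(unsigned beg = 1; beg < input.size() - 2; ++beg) // same range as in point 2
    {
        int sum = input[beg];    // running sum of the [beg, end) interval
        int lowest = input[beg]; // its minimum, the best candidate for removal
        for(unsigned end = beg + 2; end < input.size(); ++end) // same range as in point 3
        {
            sum += input[end - 1];
            lowest = std::min(lowest, input[end - 1]);
            result = std::max(result, sum - lowest); // drop the minimum, keep the best
        }
    }
    return result;
}
Still quadratic, as said, but at least the accumulate-and-rescan work of points 4 and 5 is gone.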

We need to take a completely different approach, and Kadane is there to point us in the right direction. See the next post for details.

Go to the full post

Codility MaxProfit

A vector of integers represents the price evolution of a stock share over a period. We should return the highest profit we could have had by buying and then selling that share.

If you have read the post about the maximum slice problem, you should see how this Codility problem could be solved in linear time, using what is commonly known as Kadane's algorithm.

Here I implemented the given test case in C++11 for GoogleTest:
TEST(MaxProfit, Given)
{
    std::vector<int> input { 23171, 21011, 21123, 21366, 21013, 21367 };
    ASSERT_EQ(356, solution(input));
}
Buying on day 0 and selling on day 1 would lead to a loss, while 1 -> 2 would give a gain of 21123 - 21011 = 112.
The best result we could get is 21367 − 21011 = 356, buying on day 1 and selling on day 5, as the test above expects.

Following the max slice algorithm, I have implemented a solution in this way:
int solution(std::vector<int>& input)
{
    if(input.size() < 2) // 1
        return 0;

    int gain = 0;
    int buy = input.front();

    for(auto it = input.begin() + 1; it != input.end(); ++it) // 2
    {
        int sell = std::max(0, *it - buy); // 3
        gain = std::max(sell, gain); // 4
        buy = std::min(buy, *it); // 5
    }

    return gain;
}
1. Trivial cases, I can't get any real gain.
2. Loop on all the input elements, starting from the second one.
3. It is worth selling today only if we would get a positive outcome.
4. If selling today leads to the current best result, store this gain.
5. This is the key point of the algorithm. If the current stock price is lower than the previously stored one, we can forget the old one and start using the new one as the buying price.
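An extra check of mine, not in the original post, to stress point 3: when the price only falls, the best strategy is not to trade at all, so the expected gain is zero.
TEST(MaxProfit, Decreasing)
{
    std::vector<int> input { 5, 4, 3, 2, 1 };
    ASSERT_EQ(0, solution(input));
}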

Go to the full post

Maximum slice problem

On Codility there is a section dedicated to problems about determining the maximum slice in a vector. The generic problem, discussed in a PDF, is this:

Given a vector of integers, find the slice with the largest sum, and return that value.

An example is provided, that I converted on the spot to a C++11 GoogleTest test case:
TEST(MaxSlice, Given)
{
    std::vector<int> input { 5, -7, 3, 5, -2, 4, -1 };
    EXPECT_EQ(10, solution(input));
}
The biggest slice is the one ranging from input[2] to input[5], resulting in a sum of 10.

A quadratic solution

It shouldn't be complicated to think of a solution asymptotically quadratic in time. Loop on all the elements in the sequence. For each of them, loop on all the subsequences starting there, up to the end of the input, calculating their sums. The biggest such sum is our result.

Here is a possible implementation of this algorithm:
int solution(const std::vector<int>& input)
{
    int result = 0;
    for(unsigned i = 0; i < input.size(); ++i) // 1
    {
        int tentative = 0; // 2
        for(unsigned j = i; j < input.size(); ++j) // 3
        {
            tentative += input[j];
            result = std::max(result, tentative); // 4
        }
    }
    return result;
}
1. Check all the subsequences, using i as their beginning.
2. Store here the sum for the current subsequence.
3. Try all the possible subsequences starting from i.
4. Save the current sum if it is the biggest one found so far.

A linear solution

The previous solution works fine, but it mindlessly repeats adding the same elements over and over. We can save time by performing a single linear scan of our data, observing that when adding an element gives a negative result, it is better to drop that subsequence and start again from the next element:
int solution(const std::vector<int>& input)
{
    int result = 0;
    int candidate = 0;
    for(int cur : input)
    {
        candidate = std::max(0, candidate + cur); // 1
        result = std::max(result, candidate); // 2
    }

    return result;
}
1. Keep extending the candidate as long as adding the current element gives a positive value; otherwise, reset the sequence. In the worst case (all negative input elements) we would never accept any value, and we would return the total of the empty subsequence, namely zero.
2. If the current candidate is a better solution, save it.
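A small extra test of mine, not in the original post, for the behavior described in point 1: with only negative elements the empty slice wins, and the function returns zero.
TEST(MaxSlice, AllNegative)
{
    std::vector<int> input { -2, -5, -1 };
    EXPECT_EQ(0, solution(input));
}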

This problem is better known as the maximum subarray problem, and the linear solution above was first found by Jay Kadane. That's why it is called Kadane's algorithm.

Go to the full post

Codility EquiLeader

Given a vector of integers, check if there is a leader (aka dominator) in it. If so, define an equi-leader as an index that splits the vector into two subsequences, each of them having that same leader. Write a function that returns the number of equi-leaders in the passed vector.

You can find this problem, and test your solution in your preferred programming language, in the Codility Leader section.

Actually, if you have already solved the other problem in the Codility Leader section, named Dominator, you are already very close to the solution of this one.

To better explain the point of the problem, Codility provides a test case; here it is, in a C++11 GoogleTest version:
TEST(EquiLeader, Given)
{
    std::vector<int> input { 4, 3, 4, 4, 4, 2 };
    ASSERT_EQ(2, solution(input));
}
The number four dominates the input sequence. If we split it on the first element (index 0) we get two subsequences where four is again the leader. The same happens if we take the third element (index 2) as the pivot. These are the only splits that work on this vector, so we want our function to return 2.

I have split my C++11 solution into two parts. First, I wrote a utility function that gets a candidate leader for the entire sequence:
int getCandidateLeader(const std::vector<int>& input)
{
    int count = 0;
    int candidate;
    for(int cur : input) // 1
    {
        if(count == 0)
        {
            count = 1;
            candidate = cur;
        }
        else
        { // 2
            count += (candidate == cur) ? 1 : -1;
        }
    }

    return count ? candidate : std::numeric_limits<int>::max(); // 3
}
1. I am still not very used to the new C++11 range-based for-loop. This is a place where it helps to make the code more readable. I am looping on the entire vector, and I don't care about the index of any of its elements, I am just interested in their values.
2. This is the central idea of the algorithm. Increasing the counter when the current element is the same as the previously detected candidate, or throwing away both elements when they differ, lets us get a candidate leader in linear time.
3. If no good candidate has been detected, I return a non-acceptable value.

And this is the solution to the actual problem:
int solution(std::vector<int>& input)
{
    int leader = getCandidateLeader(input);
    if(leader == std::numeric_limits<int>::max()) // 1
        return 0;

    const unsigned count = std::count(input.begin(), input.end(), leader);
    if(count <= input.size() / 2) // 2
        return 0;

    int leaders = 0; // 3
    unsigned leftLeaders = 0; // 4
    for(unsigned i = 0; i < input.size(); ++i)
    {
        if(input[i] == leader) // 5
            leftLeaders++;
        unsigned rightLeaders = (count - leftLeaders); // 6
        if(leftLeaders > (i + 1) / 2 && rightLeaders > (input.size() - i - 1) / 2)
            ++leaders; // 7
    }

    return leaders;
}
1. No candidate leader found, no need to go on.
2. The candidate leader is not dominant.
3. Keep track of the number of equi-leaders found so far.
4. Here I store the number of leader occurrences in the current left subsequence. At the first iteration the left subsequence contains just the leftmost element; it grows until it covers the whole sequence.
5. If the current element is a leader, increase the left counter.
6. Get the number of leader occurrences in the right subsequence by subtraction.
7. If the leader is dominant in both the left and the right subsequence, increase the equi-leader counter.
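Walking the Given test case { 4, 3, 4, 4, 4, 2 } through this loop (my own check, not from the original post): only the splits after index 0 (left count 1 of 1 elements, right count 3 of 5) and after index 2 (left 2 of 3, right 2 of 3) satisfy both conditions at point 7, so the function returns 2, as expected.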

Go to the full post

Codility Dominator

Given a vector of integers, check if there is a dominator in it, meaning a value that occurs in more than half of its elements.

I have already written a couple of posts about this Codility problem. In the first one I solve it using a map as a buffer, in the second one I use the cheaper golden leader algorithm.

My porting to C++11 didn't change much of the original code. For instance, where I had to define a type synonym, I used the "using" keyword instead of a "typedef". A rough sketch of how that could look is below.
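Since the post doesn't repeat the code, here is a minimal sketch of mine, not the original solution, of how the golden leader approach could look in C++11, with a "using" alias where a typedef used to be. Codility's Dominator asks for an index of the dominator, or -1 when there is none.
#include <vector>

using Input = std::vector<int>; // C++11 alias instead of a typedef

int solution(Input& input)
{
    // golden leader scan: find the only possible candidate for dominator
    int count = 0;
    int candidate = 0;
    unsigned index = 0;
    for(unsigned i = 0; i < input.size(); ++i)
    {
        if(count == 0)
        {
            candidate = input[i];
            index = i;
            count = 1;
        }
        else
        {
            count += (input[i] == candidate) ? 1 : -1;
        }
    }

    if(count == 0) // no candidate survived the scan
        return -1;

    // confirm the candidate occurs in more than half of the elements
    unsigned occurrences = 0;
    for(int value : input)
        if(value == candidate)
            ++occurrences;

    return occurrences * 2 > input.size() ? static_cast<int>(index) : -1;
}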

Go to the full post

Codility Fish

We are given two same-sized integer vectors, the first containing the fish sizes, all of them different, the second their direction, upstream or downstream.
The river is so narrow that fish going in opposite directions have no choice but to collide, and the smaller one succumbs, eaten by the other.
We want to know how many fish will survive in the end.

I have already written a couple of posts on this problem. One where I show a naive solution with N squared time complexity, and another one where I implement a linear solution.

I worked on it once again, now that we can use C++11 to solve Codility problems. However, I didn't get anything different from that time. I just wrote the test cases in a cleaner way, using the new syntax that allows creating a vector from an initializer list, something like the sketch below.
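Just to show the syntax I am talking about, the Codility-provided Fish example could be written like this (my sketch, assuming the usual two-vector solution() signature, not the code from the original post):
TEST(Fish, Given)
{
    std::vector<int> sizes { 4, 3, 2, 1, 5 };      // fish sizes, all different
    std::vector<int> directions { 0, 1, 0, 0, 0 }; // 0 upstream, 1 downstream
    ASSERT_EQ(2, solution(sizes, directions));     // only the first and the last fish survive
}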

Nothing new to say about it, I am sorry to say.

Go to the full post