Couples and Single

A vector of integers is given. All values are in the range [1..100,000,000] and come in couples, except one that is single. Return it.

A C++11 GoogleTest test case to clarify the problem:
TEST(CouplesAndSingle, Simple)
{
    ASSERT_EQ(42, solution({ 1, 42, 1 }));
}
My solution uses a hash table where I store each new value I see in input. If I see a value a second time, I remove it from the table. In the end, the hash table should contain just one element, the single value. It is not required, but I decided to return 0 in case of unexpected input.

Here is how I have implemented it:
int solution(const std::vector<int>& input)
{
    std::unordered_set<int> buffer; // 1
    for(int value : input) // 2
    {
        if(value < 1 || value > 100000000) // 3
            return 0;

        auto it = buffer.find(value); // 4
        if(it == buffer.end())
            buffer.insert(value);
        else
            buffer.erase(it);
    }
    return (buffer.size() != 1) ? 0 : *buffer.begin(); // 5
}
1. The C++11 STL unordered_set implements a hash table.
2. For each element in input.
3. Paranoid check.
4. If the current value is not in buffer, insert it. Otherwise remove it.
5. The single value is the first (and unique) element in buffer.


Codility Peaks

Given a vector of integers, we want to split it into the maximum number of same-sized blocks, such that each block contains at least one peak (aka local maximum).

The problem, part of the Codility Prime and composite numbers section, is not complicated, however I needed to spend some time thinking it over to understand what its point really was. As usual, I found it helpful to write test cases, here are a few of them:
TEST(Peaks, Given) // 1
{
    std::vector<int> input { 1, 2, 3, 4, 3, 4, 1, 2, 3, 4, 6, 2 };
    ASSERT_EQ(3, solution(input));
}

TEST(Peaks, ThreeSpikes) // 2
{
    std::vector<int> input { 0, 1, 0, 0, 1, 0, 0, 1, 0 };
    ASSERT_EQ(3, solution(input));
}

TEST(Peaks, ElevenSaw) // 3
{
    std::vector<int> input { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0 };
    EXPECT_EQ(1, solution(input));
}

TEST(Peaks, TwelveSaw) // 4
{
    std::vector<int> input { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
    EXPECT_EQ(4, solution(input));
}

TEST(Peaks, NoPeak) // 5
{
    std::vector<int> input(10);
    std::iota(input.begin(), input.end(), 1);
    ASSERT_EQ(0, solution(input));
}

TEST(Peaks, MaxPlain) // 6
{
    std::vector<int> input(100000);
    for(unsigned i = 1; i < input.size(); i += 2)
        input[i] = 1;
    ASSERT_EQ(25000, solution(input));
}
1. Codility provided test case. The smallest size we can choose to have same-sized subsequences, each of them containing a peak, is four. That leads to a solution of 3.
2. Very plain case. Three blocks sized three are the optimal solution.
3. The input sequence is sized 11, a prime number, so we can't split it into smaller same-sized blocks. We just have to check if there is at least one peak, to decide whether to return 0 or 1.
4. Twelve could be split in units sized 2, 3, 4, or 6.
5. I have used iota() to generate a monotone function. No peak in it.
6. This test case is to check the algorithm performance. It should be easy to see that the minimum acceptable block size is four (two won't work because of the last interval).

Here is my C++11 solution:
int solution(std::vector<int>& input)
{
    if(input.size() < 3) // 1
        return 0;

    std::vector<unsigned> peaks(input.size() + 1); // 2
    for(std::size_t i = 1; i < input.size() - 1; ++i)
    {
        if(input[i] > input[i - 1] && input[i] > input[i + 1])
            peaks[i+1] = 1;
    }
    std::partial_sum(peaks.begin(), peaks.end(), peaks.begin());

    for(std::size_t size = 3; size <= input.size() / 2; ++size) // 3
    {
        if(input.size() % size) // 4
            continue;

        bool hasPeaks = true;
        for(std::size_t end = size; end < peaks.size(); end += size) // 5
        {
            if(peaks[end - size] == peaks[end])
            {
                hasPeaks = false;
                break;
            }
        }
        if(hasPeaks) // 6
            return input.size() / size;
    }

    return peaks.back() ? 1 : 0; // 7
}
1. Get rid of trivial cases.
2. I am going to use partial sums to check if there is any peak in a given interval (some more detail on this STL function and its mathematical meaning in a previous post). To simplify the code, I add a spurious element, set to zero, at the beginning, and shift all the partial sums one position to the right.
3. Check all the possible block sizes, from 3 up to half the size of the input sequence. It makes no sense to check blocks sized less than three, since it is not possible for every block to contain a local maximum in that case - see comments for more details.
4. We want all-equal blocks, so we accept only sizes that are divisor of the input sequence size.
5. Check all the subsequence to ensure they contain at least a peak.
6. If the current block size is good, return the number of them in the input sequence.
7. We couldn't find any way of splitting the sequence, return 1 if there is at least a peak, 0 otherwise.


Codility MinPerimeterRectangle

Given an integer that represents the area of a rectangle, return the minimal perimeter of a rectangle having that area and integer sides.

You could find and submit your solution to this problem in the Codility Prime and composite numbers section.

It is easy to see that if a rectangle has area 30, its integer sides could only be (1, 30), (2, 15), (3, 10), and (5, 6). The latter gives the minimal perimeter.

My original C++ solution was clumsier than the one you are seeing now. I was checking the input for squares and, worse, I started looping from the wrong side. Thanks to the anonymous suggestion below, I have cleaned up my code this way:
int solution(int input)
{
    assert(input > 0 && input < 1000000001); // 1

    for(int i = static_cast<int>(std::sqrt(input)); i > 0; --i) // 2
    {
        if(input % i == 0) // 3
            return 2 * ((input / i) + i);
    }

    return 0; // 4
}
1. Enforce the problem requirements in the code through an assert.
2. Remembering that, among all rectangles having the same area, the square is the one with minimal perimeter, I start checking from the closest candidate to the optimum down to the most elongated rectangle.
3. If the current candidate divides the area with no remainder, it is the best solution. Calculate the perimeter and return it.
4. We should never reach this line. However, C++ compilers usually complain about a missing return value if they don't see it.


Codility CountFactors

Return the number of factors for a given positive integer.

You can find this problem in the Codility Prime and composite numbers section. The pdf associated with this block of exercises explains why we shouldn't check all the possible factors: we can stop at the square root of the examined number.

Still, the solution there is written in Python. If you use a strongly typed language, you'd better pay attention to large ints, because overflow is a risk we have to take care of directly.

That's why, besides the test cases provided with the problem, I added an explicit check on the biggest integer that could be passed in input:
TEST(CountFactors, Given)
{
    ASSERT_EQ(8, solution(24));
}

TEST(CountFactors, Simple)
{
    EXPECT_EQ(1, solution(1));
    EXPECT_EQ(2, solution(2));
    EXPECT_EQ(2, solution(3));
    EXPECT_EQ(3, solution(4));
    EXPECT_EQ(2, solution(5));
}

TEST(CountFactors, Max)
{
    ASSERT_EQ(2, solution(std::numeric_limits<int>::max()));
}
Here is my C++ solution. Since this is a problem almost completely based on numbers properties, it is very simple to port this piece of code to other languages:
int solution(int input)
{
    assert(input > 0);
    if(input == 1) // 1
        return 1;

    int result = 2; // 2
    unsigned i = 2; // 3
    for(; i * i < static_cast<unsigned>(input); ++i)
        if(input % i == 0)
            result += 2;
    if(i * i == static_cast<unsigned>(input))
        ++result;

    return result;
}
1. Trivial case.
2. The algorithm is designed to be symmetric: any factor we find has a counterpart that has to be counted too. Here we are considering 1, which implies also input itself.
3. Let's loop on all the other possible factors, starting from 2, up to the square root of the input value. As hinted above, I should pay attention to possible overflow; this is why I am using unsigned int instead of plain int.


Codility MaxSliceSum

Check the passed integer vector for the maximum sum of any possible non-empty slice in it. The result could also be negative.

This is a slight variation on the well known Kadane's algorithm; you can find this problem in the Maximum slice Codility section.

To stress the difference from the standard behavior, I have written a couple of test cases that I added to the one provided in the problem definition itself:
TEST(MaxSliceSum, Given)
{
    std::vector<int> input { 3, 2, -6, 4, 0 }; // 1
    ASSERT_EQ(5, solution(input));
}

TEST(MaxSliceSum, Minimal)
{
    std::vector<int> input { -1000000 }; // 2
    ASSERT_EQ(-1000000, solution(input));
}

TEST(MaxSliceSum, Minimal2)
{
    std::vector<int> input { -1000000, -1000000 };
    ASSERT_EQ(-1000000, solution(input));
}
1. As in the normal case, the winning slice is the one including the first two elements.
2. At least one element has to be considered. Here the input collection contains just one element, so the winning slice contains it.

And here is my C++11 solution:
int solution(std::vector<int>& input)
{
    assert(!input.empty()); // 1

    int result = input.front(); // 2
    int candidate = input.front();
    for(auto it = input.begin() + 1; it != input.end(); ++it) // 3
    {
        candidate = std::max(*it, candidate + *it); // 4
        result = std::max(result, candidate);
    }

    return result;
}
1. Not required by codility, just to stress the fact that it is assumed the input collection is not empty.
2. The first candidate is a slice that includes only the first element.
3. Let's loop on all the other elements.
4. As in the standard Kadane's algorithm, but here the new candidate comes from the comparison between the current element and the sum of it and the previous candidate.


Codility MaxDoubleSliceSum (Kadane)

Given a vector of integers, return the maximal sum of any double slice.

See the previous post for details and test cases. There I solved the problem using an intuitive but performance-wise terrible algorithm. Here I give a linear solution based on Kadane's algorithm.

Codility itself gives us a big hint by placing this problem in its Maximum slice section. The question is, how do we adapt Kadane's algorithm here, where we have a hole in the interval?

The solution requires us to change perspective. Instead of looking at the central element as an enemy, think of it as the pivot of the solution. This leads almost automatically to a clean symmetric solution, where we have two sub-intervals that go from either limit of the vector to the central pivot.

Looking at the code that I have written should make things clearer:
int solution(std::vector<int>& input)
{
    if(input.size() < 4)
        return 0;

    std::vector<int> left; // 1
    left.reserve(input.size() - 2);
    left.push_back(0); // 2
    for(auto it = input.begin() + 1; it < input.end() - 2; ++it) // 3
    {
        left.push_back(std::max(0, left.back() + *it));
    }

    std::vector<int> right; // 4
    right.reserve(input.size() - 2);
    right.push_back(0);
    for(auto it = input.rbegin() + 1; it < input.rend() - 2; ++it)
    {
        right.push_back(std::max(0, right.back() + *it));
    }

    int result = 0; // 5
    auto ir = right.rbegin();
    for(auto il = left.begin(); il != left.end(); ++il, ++ir)
    {
        int tentative = *il + *ir;
        result = std::max(result, tentative);
    }
    return result;
}
1. In the left vector I am going to store all the maximum subarray values from the first interesting position up to the last one. Each value represents the best result for the left slice, up to the current pivot position.
2. If the pivot is in the first valid position, there wouldn't be anything to its left, so the first element in the left vector would be a zero.
3. All the other left elements are calculated following Kadane's algorithm.
4. The right vector is the exact mirror of the left one through the pivot. If the pivot were in the rightmost valid position, there would be nothing in the right slice, and a zero in the right vector. Then it is just a matter of applying Kadane moving from right to left on the input collection.
5. Now we have to combine the two partial results to get the expected solution. The left vector contains the maximum subarray values up to the pivot, the right one down to it. So what we need to do is add the first element in left to the last element in right, and move step by step to check all the possible solutions.


Codility MaxDoubleSliceSum (naive)

Given a vector of integers, return the maximal sum of any double slice.

The definition of a double slice is in this Codility problem, which you can find in their Maximum slice section, dedicated to exercises that can be solved by Kadane's algorithm.

In a few words, a double slice is defined by an ordered triplet of indexes of a vector. The elements we have to consider are the ones strictly contained in the two sub-intervals. A few examples in the form of C++11 GoogleTest test cases should clarify the concept:
TEST(MaxDoubleSliceSum, Empty)
{
    std::vector<int> input { 3, 2, 6 }; // 1
    ASSERT_EQ(0, solution(input));
}

TEST(MaxDoubleSliceSum, Minimal)
{
    std::vector<int> input { 3, 2, 6, 1 }; // 2
    ASSERT_EQ(6, solution(input));
}

TEST(MaxDoubleSliceSum, Given)
{
    std::vector<int> input { 3, 2, 6, -1, 4, 5, -1, 2 }; // 3
    ASSERT_EQ(17, solution(input));
}
1. We can think of only one triplet for this test case, and it contains no element.
2. Two possible interesting triplets: (0, 1, 3) and (0, 2, 3). The first one has an empty left sub-interval and a right one containing just the element '6'. The second one has only the left sub-interval non-empty, with the element '2' inside. The maximal sum is given by the first case.
3. This is the test case provided by codility. The triplet (0, 3, 6) leads to the best solution.

A naive solution

In this post I provide a natural implementation that ends up being far too costly to be acceptable on a reasonably large input. I simply check all the possible triplets:
int solution(std::vector<int>& input)
{
    if(input.size() < 4) // 1
        return 0;

    int result = 0;
    for(unsigned beg = 1; beg < input.size() - 2; ++beg) // 2
    {
        for(unsigned end = beg + 2; end < input.size(); ++end) // 3
        {
            int max = std::accumulate(input.begin() + beg, input.begin() + end, 0); // 4
            int tentative = 0;
            for(unsigned i = beg; i < end; ++i) // 5
            {
                tentative = max - input[i];
                result = std::max(result, tentative); // 6
            }
        }
    }

    return result;
}
1. As we have seen in the first test case, if the container has less than four elements, there is no interesting double slice.
2. The first element in a triplet could be the second element in the vector. Go from this element up to the last useful one.
3. The possible "end" element (last + 1) for a triplet is in this range.
4. Sum all the elements in the beg..end interval.
5. Tentatively remove a single element from the interval, representing the median element of the triplet.
6. If a better possible solution is found, save it.

This solution works fine against the test cases proposed above. But think what would happen on a larger input:
TEST(MaxDoubleSliceSum, Hundreds)
{
    std::vector<int> input(601);
    std::iota(input.begin(), input.end(), -300);
    ASSERT_EQ(44850, solution(input));
}
A few hundred elements should be enough to perceive a substantial performance degradation.

Removing the innermost loop (5) doesn't seem complicated; still, it won't help much. We would still have an O(n**2) algorithm to deal with.
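For the record, here is a sketch of how that innermost loop could be dropped, keeping a running sum and a running minimum while the slice grows. The function name is mine and, as said, the algorithm stays quadratic:

```cpp
#include <algorithm>
#include <vector>

int quadraticSolution(std::vector<int>& input)
{
    if(input.size() < 4)
        return 0;

    int result = 0;
    for(unsigned beg = 1; beg < input.size() - 2; ++beg)
    {
        int sum = input[beg];    // running sum of the [beg, end) interval
        int lowest = input[beg]; // its minimum, the element to drop
        for(unsigned end = beg + 2; end < input.size(); ++end)
        {
            sum += input[end - 1];
            lowest = std::min(lowest, input[end - 1]);
            result = std::max(result, sum - lowest);
        }
    }
    return result;
}
```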

We need to take a completely different way. And Kadane is there to point us in the right direction. See the next post for details.


Codility MaxProfit

A vector of integers represents the price evolution of a stock share in a period. We should return the highest profit we could have made buying and then selling that share.

If you have read the post about the maximum slice problem, you should see how this Codility problem could be solved in linear time, using what is commonly known as Kadane's algorithm.

Here I implemented the given test case in C++11 for GoogleTest:
TEST(MaxProfit, Given)
{
    std::vector<int> input { 23171, 21011, 21123, 21366, 21013, 21367 };
    ASSERT_EQ(356, solution(input));
}
Buying on day 0 and selling on day 1 would lead to a loss, while 1 -> 2 would give a gain of 21123 - 21011 = 112.
The best result that we could get is 21367 − 21011 = 356, buying on day 1 and selling on day 5.

Following the max slice algorithm, I have implemented a solution in this way:
int solution(std::vector<int>& input)
{
    if(input.size() < 2) // 1
        return 0;

    int gain = 0;
    int buy = input.front();

    for(auto it = input.begin() + 1; it != input.end(); ++it) // 2
    {
        int sell = std::max(0, *it - buy); // 3
        gain = std::max(sell, gain); // 4
        buy = std::min(buy, *it); // 5
    }

    return gain;
}
1. Trivial cases, I can't get any real gain.
2. Loop on all the input elements, starting from the second one.
3. It is worth selling today only if we would get a positive outcome.
4. If selling today leads to the current best result, store this gain.
5. This is the key point of the algorithm. If the current stock price is lower than the previously stored one, we can get rid of that one and start using the new one.


Maximum slice problem

On Codility there is a section dedicated to problems about determining the maximum slice in a vector. The generic problem, discussed in a pdf, is this:

Given a vector of integers, find the slice with the largest sum, and return that value.

An example is provided, that I converted on the spot to a C++11 GoogleTest test case:
TEST(MaxSlice, Given)
{
    std::vector<int> input { 5, -7, 3, 5, -2, 4, -1 };
    EXPECT_EQ(10, solution(input));
}
The biggest slice is the one ranging from input[2] to input[5], resulting in a sum of 10.

A quadratic solution

It shouldn't be complicated to think of a solution asymptotically quadratic in time. Loop on all the elements in the sequence; for each of them, loop on all the subsequences starting there up to the input end, calculating their sums. The biggest one of them is our result.

Here is a possible implementation of this algorithm:
int solution(const std::vector<int>& input)
{
    int result = 0;
    for(unsigned i = 0; i < input.size(); ++i) // 1
    {
        int tentative = 0; // 2
        for(unsigned j = i; j < input.size(); ++j) // 3
        {
            tentative += input[j];
            result = std::max(result, tentative); // 4
        }
    }
    return result;
}
1. Check all the subsequences, defining i as their beginning.
2. Store here the sum for the current subsequence.
3. Try all the possible subsequences starting from i.
4. Save the current subsequence if it is the current biggest one.

A linear solution

The previous solution works fine, but it mindlessly repeats adding the same elements on and on. We can save time by performing a single linear scan of our data, observing that when adding an element gives a negative result, it is better to drop that subsequence and start afresh from the next element:
int solution(const std::vector<int>& input)
{
    int result = 0;
    int candidate = 0;
    for(int cur : input)
    {
        candidate = std::max(0, candidate + cur); // 1
        result = std::max(result, candidate); // 2
    }

    return result;
}
1. Keep the candidate as long as adding the current element gives a positive value. Otherwise, reset the sequence. In the worst case, all negative input elements, we would never accept any value, and we would return the total relative to the empty subsequence, namely zero.
2. If the current candidate is a better solution, save it.

This problem is better known as the maximum subarray problem, and the linear solution above was first found by Jay Kadane. That's why it is called Kadane's algorithm.


Codility EquiLeader

Given a vector of integers, check if there is a leader (aka dominator) in it. If so, define an equi-leader as an index that splits the vector in two subsequences, each of them having the same leader as the full vector. Write a function that returns the number of equi-leaders in the passed vector.

You can find this problem, and test your solution (possibly in your preferred programming language), in the Leader codility section.

Actually, if you have already solved the other problem in the codility Leader section, named Dominator, you are already very close to the solution of this one.

To better explain what the point of the problem is, codility provides a test case, here it is, in a C++11 GoogleTest version:
TEST(EquiLeader, Given)
{
    std::vector<int> input { 4, 3, 4, 4, 4, 2 };
    ASSERT_EQ(2, solution(input));
}
The number four dominates the input sequence. If we split it on the first element (index 0) we get two subsequences where four is again the leader. The same happens if we take the third element (index 2) as the pivot. These are the only splits that work on this vector, so we want our function to return 2.

I have split my C++11 solution in two parts. Firstly I wrote a utility function that gets a candidate leader for the entire sequence:
int getCandidateLeader(const std::vector<int>& input)
{
    int count = 0;
    int candidate;
    for(int cur : input) // 1
    {
        if(count == 0)
        {
            count = 1;
            candidate = cur;
        }
        else
        { // 2
            count += (candidate == cur) ? 1 : -1;
        }
    }

    return count ? candidate : std::numeric_limits<int>::max(); // 3
}
1. I am still not very used to the new C++11 range-based for-loop. This is a place where it helps make the code more readable. I am looping on the entire vector, and I don't care about the index of any of its elements, I am just interested in their values.
2. This is the central idea of the algorithm. Increasing the counter when the current element is the same as the previously detected candidate, or throwing away both elements when they differ, lets us find a candidate leader in linear time.
3. If no good candidate has been detected, I return a non-acceptable value.

And this is the solution to the actual problem:
int solution(std::vector<int>& input)
{
    int leader = getCandidateLeader(input);
    if(leader == std::numeric_limits<int>::max()) // 1
        return 0;

    const unsigned count = std::count(input.begin(), input.end(), leader);
    if(count * 2 <= input.size()) // 2
        return 0;

    int leaders = 0; // 3
    unsigned leftLeaders = 0; // 4
    for(unsigned i = 0; i < input.size(); ++i)
    {
        if(input[i] == leader) // 5
            leftLeaders++;
        unsigned rightLeaders = (count - leftLeaders); // 6
        if(leftLeaders > (i + 1) / 2 && rightLeaders > (input.size() - i - 1) / 2)
            ++leaders; // 7
    }

    return leaders;
}
1. No candidate leader found, no need to go on.
2. The candidate leader is not dominant, it does not occur more than half the times.
3. Keep memory of the current number of equi-leaders.
4. Here I store the number of leader occurrences in the current left subsequence. At the first iteration it covers just the leftmost element; it grows until it covers the whole sequence.
5. If the current element is a leader, increase the left counter.
6. Get the number of leaders in the right subsequence by subtraction.
7. If both the left and right leader counts are dominant in their subsequences, increase the equi-leader counter.


Codility Dominator

Given a vector of integers, check if there is a dominator, meaning a value that occurs more than half the times, in it.

I have already written a couple of posts about this Codility problem. In the first one I solved it using a map as a buffer, in the second one I used the cheaper golden leader algorithm.

My porting to C++11 didn't change much of the original code. For instance, having to define a synonym for a type, instead of typedef-ing it, I used the "using" keyword.
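Since no code survived in this post, here is a sketch of the golden leader idea in C++11. The name and the interface are mine: it returns the dominator value, or -1 when there is none (assuming -1 is not a legal input value), while the actual Codility problem asks for an index of it:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

int dominatorValue(const std::vector<int>& input)
{
    // golden leader: pair up any two different values and discard both;
    // a dominator, if it exists, survives as the final candidate
    int count = 0;
    int candidate = 0;
    for(int cur : input)
    {
        if(count == 0)
        {
            candidate = cur;
            count = 1;
        }
        else
        {
            count += (candidate == cur) ? 1 : -1;
        }
    }

    // the candidate still has to be confirmed by an explicit count
    if(count > 0 && std::count(input.begin(), input.end(), candidate) * 2
            > static_cast<std::ptrdiff_t>(input.size()))
        return candidate;
    return -1; // no dominator
}
```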


Codility Fish

We are given in input two same-sized integer vectors, the first containing the fish sizes, all of them different, the second their directions, upstream or downstream.
The river is so narrow that fish going in opposite directions have no choice but to collide, and the smaller one will succumb, eaten by the other.
We want to know how many fish will survive in the end.

I have already written a couple of posts on this problem: one where I show a naive solution, with N squared time complexity, and another one where I implement a linear solution.

I worked on it once again, now that we can use C++11 to solve codility problems. However, I didn't get anything different from that time. I just wrote the test cases in a cleaner way, using the new syntax that allows creating a vector from an initialization list.

Nothing new to say about it, I am sorry to say.
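Still, to keep this page self-contained, here is a sketch of that linear solution, with names of my own choosing (assuming, as in the Codility statement, that direction 1 means downstream and 0 upstream):

```cpp
#include <cstddef>
#include <stack>
#include <vector>

int countSurvivors(const std::vector<int>& fish, const std::vector<int>& dir)
{
    int alive = 0;              // upstream fish that got past every downstream one
    std::stack<int> downstream; // sizes of downstream fish still alive

    for(std::size_t i = 0; i < fish.size(); ++i)
    {
        if(dir[i] == 1) // downstream: it may meet upstream fish later
        {
            downstream.push(fish[i]);
            continue;
        }

        // upstream: it eats every smaller downstream fish on its way
        while(!downstream.empty() && downstream.top() < fish[i])
            downstream.pop();
        if(downstream.empty())
            ++alive;
    }

    return alive + static_cast<int>(downstream.size());
}
```

On the Codility given data, fish sized { 4, 3, 2, 1, 5 } with directions { 0, 1, 0, 0, 0 }, it counts 2 survivors.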


Codility StoneWall

We are given a sequence of integers representing the heights of a wall that we have to build. We should return the minimum number of rectangular blocks needed to build it.

A good hint is that this problem has been included in the Stacks and Queues codility section. If you want more clues, you could have a look at the codility blog, where this problem is discussed and a solution is given.

Actually, I found the discussion there more confusing than helpful. Still, I got a couple of elements to think on. I pondered on them, and in the end I came up with a sort of pseudo-code.

I use a stack to keep track of the current status of the wall; each element in it represents a block I am using to reach the current level.
Let's say that I am at the i-th step. The stack reflects the status of the previous step.
Maybe I don't have to do anything: the current level is the same. OK, just move to the next step.
Maybe I have to increase the wall height. That's easy, just put a new block of the right size on top of the stack.
Maybe I have to decrease the wall height. That's a bit more complicated. I should remove blocks until I get down to the right height, or just below it. Then I add a new block, if required.
And, well, that's it.

Then I wrote a few test cases, C++ for GoogleTest:
TEST(StoneWall, Given) // 1
{
    std::vector<int> input { 8, 8, 5, 7, 9, 8, 7, 4, 8 };
    ASSERT_EQ(7, solution(input));
}

TEST(StoneWall, Simple) // 2
{
    std::vector<int> input { 1, 2, 1 };
    ASSERT_EQ(2, solution(input));
}

TEST(StoneWall, Bad) // 3
{
    std::vector<int> input { 0 };
    ASSERT_EQ(-1, solution(input));
}
1. Provided by codility.
2. A sort of minimal pyramid.
3. As usual, Codility does not care about bad input data. I decided to add a basic check to refuse non-positive input. A more robust solution would use assertions (if the data check were a Somebody Else's Problem) or exceptions. Here I decided to keep it simple and just return minus one.

And here is my solution:
int solution(std::vector<int>& input)
{
    if(std::find_if_not(input.begin(), input.end(), [](int cur){ return cur > 0; }) != input.end()) // 1
        return -1;

    int blocks = 0;
    std::stack<int> buffer;

    std::for_each(input.begin(), input.end(), [&](int cur) // 2
    {
        while(!buffer.empty() && cur < buffer.top()) // 3
        {
            buffer.pop();
        }

        if(buffer.empty() || cur > buffer.top()) // 4
        {
            buffer.push(cur);
            ++blocks;
        }
    });

    return blocks;
}
1. The STL find_if_not() is used to ensure all the elements in input are positive. A paranoid check not required by codility.
2. Loop on all the elements. Notice that the lambda captures by reference the external variables so that it can modify them in its body.
3. If the previous wall height is taller than the currently requested one, remove block(s) from the stack.
4. If needed, add a new block to the stack to match the current request, and increase the block counter.


Codility Nesting

Given a string that contains only open and close round brackets, check if they are all properly nested.

This "painless" codility problem, in the Stacks and Queues section, is very close to its brother Brackets that we have just seen. Only, it is even simpler than that.

Well, it is painlessly easy if we take the right path, and consider it, as suggested by the section in which it is included, as a stack-related problem. If you miss this detail, you could find yourself in a bit more complicated territory. Have a look at this other post if you want to read some more stuff in that direction.

Otherwise, you could have a look at my C++11 solution:
int solution(std::string& input)
{
    int count = 0; // 1
    for(auto it = input.begin(); it != input.end(); ++it) // 2
    {
        if(*it == '(')
            ++count;
        if(*it == ')') // 3
        {
            if(!count)
                return 0;
            --count;
        }
    }
    return count == 0 ? 1 : 0; // 4
}
1. In this simple case, instead of using a proper stack, we can emulate it with an integer counter. Have a look at Brackets to see a slightly different version of this problem where a real stack makes more sense.
2. Loop on all the input elements; when we find an open bracket, the counter is increased, as if we pushed it on a stack.
3. When a close bracket is found, we should pop the stack (i.e. decrease the counter). Still, we have to be careful and ensure that the stack is not empty, since that would mean no matching open bracket was previously seen.
4. I could have relied on the C++ automatic conversion from boolean to integer, and avoided the ternary operator. In any case, the meaning of this line is: only if the stack is empty (actually, the counter is zero) does the function return success.


Codility Brackets

Given a string that contains only open and close brackets (round, squared, curly), check if they are all properly nested.

This is a "painless" problem in the Stacks and Queues practicing area on the codility web site.

When I compared my current solution to the previous one, developed when codility didn't yet support C++11, I found only minimal differences.

First I wrote a few test cases; a couple of them are given by codility in the problem definition, I merely implemented them for GoogleTest:
TEST(Brackets, Given1)
{
    std::string input { "{[()()]}" };
    ASSERT_EQ(1, solution(input));
}

TEST(Brackets, Given2)
{
    std::string input { "([)()]" };
    ASSERT_EQ(0, solution(input));
}

TEST(Brackets, Empty)
{
    std::string input { "" };
    ASSERT_EQ(1, solution(input));
}
And this is my latest implementation:
int solution(std::string& input)
{
    if(input.size() % 2) // 1
        return 0;

    std::stack<char> buffer; // 2

    for(auto it = input.begin(); it != input.end(); ++it) // 3
    {
        switch(*it) // 4
        {
        case '(': case '[': case '{':
            buffer.push(*it);
            continue;
        }

        if(buffer.empty()) // 5
            return 0;

        char prev = buffer.top(); // 6
        if((*it == ')' && prev == '(') || (*it == ']' && prev == '[') || (*it == '}' && prev == '{'))
            buffer.pop();
        else
            return 0;
    }

    return buffer.empty() ? 1 : 0; // 7
}
1. An odd number of brackets implies bad nesting.
2. I am going to push on a stack all the open brackets, popping them as soon as a close one is found.
3. Loop on all the input elements.
4. If the current character is an open bracket, in any supported flavor, push it on the stack.
5. Otherwise, if the buffer is empty the current element is a bad one. It could be an unexpected close bracket or a dirty character. In any case I return failure.
6. Check the current element on stack top. If it is a matching bracket, pop it and check the next input value.
7. After checking all the elements, we return success only if no open bracket is still lingering in the stack.

Go to the full post

Codility Triangle

Given a vector of integers, detect if at least one triplet of elements could represent the lengths of the edges of a triangle.

You could find this "painless" problem in the Sorting practicing area on the codility web site.

I have already solved it in a previous post, when codility referred to a C++98 compiler. Here I refactor the solution for the current C++11 codility programming environment. Please look at the previous post for more chatting on it.

We know that in a triangle the sum of any two edges is greater than the third one; this leads to the condition we have to verify. Our function gets in input an STL vector of int, and returns 1 for success and 0 for failure.

A couple of test cases for this problem, C++11 for GoogleTest:
TEST(Triangle, Given1)
{
    std::vector<int> input { 10, 2, 5, 1, 8, 20 };
    ASSERT_EQ(1, solution(input));
}

TEST(Triangle, Given2)
{
    std::vector<int> input { 10, 50, 5, 1 };
    ASSERT_EQ(0, solution(input));
}
I have implemented my solution in this way:
int solution(std::vector<int>& input)
{
    if(input.size() < 3) // 1
    {
        return 0;
    }

    std::sort(input.begin(), input.end()); // 2

    for(unsigned i = 2; i < input.size(); ++i) // 3
    {
        if(input[i] < 0) // 4
        {
            i += 2;
            continue;
        }

        if(static_cast<unsigned>(input[i - 2] + input[i - 1]) > static_cast<unsigned>(input[i])) // 5
        {
            return 1;
        }
    }

    return 0;
}
1. Paranoid check, not required by codility.
2. Instead of checking all the possible triplets, that would generate a cubic time complexity solution, we can sort the input data before checking them.
3. Check all the possible triplets. The i-th element represents the third element in the current triplet.
4. We should expect negative elements in input too, even though they make no sense for our problem. If the i-th element is negative, we can safely discard the current triplet and the next two ones (which would include this negative value).
5. We are considering three sorted non-negative elements. To answer the original question it is enough to check that the sum of the first two is bigger than the third one. I need to cast the values to unsigned to avoid any possible overflow problem.

Go to the full post

Codility MaxProductOfThree

Given a vector of integers, return the maximal value we could get by multiplying any possible triplet.

You could find this problem, marked as painless, in the Sorting practicing area on the codility web site.

And here is a test case, translated to C++11 on GoogleTest, that is provided to clarify the request:
TEST(MaxProductOfThree, Given)
{
    std::vector<int> input { -3, 1, 2, -2, 5, 6 };
    ASSERT_EQ(60, solution(input));
}
It is easy to spot that the maximal solution here is given by multiplying 2, 5, and 6, as they are the biggest positive elements in the collection. Still, this test case is not enough to capture all the interesting input data. It could also happen that we have two big negative elements, and we should remember that multiplying two negative numbers gives a positive one.

I have already solved this problem some time ago; in a previous post you could find my C++98 solution and some more testing and chatting about the requisites. Please have a look at it for more details.

Now Codility supports a C++11 compiler, and this is my refactoring for the new programming environment:
int solution(std::vector<int>& input)
{
    if(input.size() < 3) // 1
        return 0;

    std::sort(input.begin(), input.end()); // 2

    int pos = input.back() * *(input.rbegin() + 1) * *(input.rbegin() + 2); // 3
    int neg = input.front() * *(input.begin() + 1) * input.back(); // 4

    return std::max(pos, neg); // 5
}
1. Input checking is usually not considered for codility problems, still I didn't feel right not stating explicitly that we are expecting at least three elements in the input vector. Admittedly, throwing an exception would have been a cleaner approach.
2. Checking all the triples would lead to a cubic time complexity. Much better to sort the collection, which costs a mere N*lgN.
3. Vanilla case, just pick the three biggest values to the right and multiply them.
4. Alternative solution, a couple of big negative values change the rules of the game. Get them, and multiply them by the biggest value, at the extreme right.
5. Compare the two tentative solutions, and return the bigger one.

If you think a bit about it, the pos and neg tentatives cover all the possible cases (all negative numbers, and any other variation I could think of).
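To convince ourselves that the neg candidate is needed, here is a standalone restatement of the same algorithm (the name maxProductOfThree and the pass-by-copy are my own choices, so the caller's vector stays untouched), checked against a two-big-negatives input not in the original post:

```cpp
#include <algorithm>
#include <vector>

// Same algorithm as the solution above, restated for a quick check.
int maxProductOfThree(std::vector<int> input) // by value: we sort a copy
{
    std::sort(input.begin(), input.end());
    const auto n = input.size();
    int pos = input[n - 1] * input[n - 2] * input[n - 3]; // three biggest
    int neg = input[0] * input[1] * input[n - 1]; // two smallest, one biggest
    return std::max(pos, neg);
}
```

With { -10, -10, 1, 2, 3 } the pos candidate is just 6, while neg gives -10 * -10 * 3 = 300.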

Go to the full post

Codility Distinct

We want to know how many distinct values there are in a passed integer vector.

This simple problem is also available in the Codility Sorting section. This fact is a huge hint, because sorting the vector before doing any check makes our job a lot easier.

Reading the codility description, you could also find the description of a test case, that here I have ported to GoogleTest in C++11:
TEST(Distinct, Given)
{
    std::vector<int> input { 2, 1, 1, 2, 3, 1 };
    ASSERT_EQ(3, solution(input));
}
There are three distinct values in the input vector, so our function should return 3.

Here is a possible solution:
int solution(std::vector<int>& input)
{
    if(input.size() < 2) // 1
        return input.size();

    std::sort(input.begin(), input.end()); // 2

    int result = 1;
    int prev = input.front(); // 3
    std::for_each(input.begin() + 1, input.end(), [&](int cur){ // 4
        if(cur != prev) // 5
        {
            ++result;
            prev = cur;
        }
    });

    return result;
}
1. Trivial cases.
2. We are modifying the input data. This is not commonly seen as a good idea, since the caller usually doesn't want to have such kind of surprises. On codility problems we usually assume that performance is of paramount importance.
3. There is at least a distinct value in the vector, the first one.
4. Loop on all the other values. Notice that I capture by reference the external values, so that I could change them in the lambda function.
5. Each time I see a new value, increase the result and keep track of the new value that I have to check.

Unique

The above solution works fine, still, meetingcpp tweeted me, what about std::unique?

Right, std::unique(). Here you can see it in action:
int solution(std::vector<int>& input)
{
    if(input.size() < 2)
        return input.size();

    std::sort(input.begin(), input.end());
    return std::unique(input.begin(), input.end()) - input.begin();
}
Isn't that a beauty?

The STL unique() function rearranges the elements in a sorted collection, shifting the unique ones to the front, and returns an iterator to the new logical end - the actual collection size is not changed.
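A minimal demonstration of that "logical end" behavior (uniqueDemo is a name of my own making):

```cpp
#include <algorithm>
#include <vector>

// std::unique() leaves the distinct values at the front and returns
// an iterator past them; the vector size itself does not change.
int uniqueDemo()
{
    std::vector<int> data { 1, 1, 2, 2, 3, 3 };
    auto logicalEnd = std::unique(data.begin(), data.end());
    // data.size() is still 6; only the elements before logicalEnd matter
    return logicalEnd - data.begin();
}
```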

Go to the full post

Codility GenomicRangeQuery

Given a string representing a DNA sequence as a collection of 'A', 'C', 'G', 'T' letters, and a number of subsequences in it, we should tell the minimal impact factor for each subsequence, where nucleotides A, C, G and T have impact factors of 1, 2, 3 and 4, respectively.

You could solve this problem on your own in the Codility Prefix Sum section. I have already given a couple of C++ solutions in a previous post; here I just refactor the linear one to C++11.

Here is the Codility given test case, ported to GTest:
TEST(GenomicRangeQuery, Given)
{
    std::string dna { "CAGCCTA" };
    std::vector<int> first { 2, 5, 0 };
    std::vector<int> last { 4, 5, 6 };
    ASSERT_EQ(first.size(), last.size());

    std::vector<int> result = solution(dna, first, last);

    ASSERT_EQ(first.size(), result.size());
    EXPECT_EQ(2, result[0]); // 1
    EXPECT_EQ(4, result[1]); // 2
    EXPECT_EQ(1, result[2]); // 3
}
1. Subsequence 2, 4 is "GCC"; C has the lowest impact factor there, so the result should be 2.
2. "T" leads to 4.
3. The full sequence has A's in it, so 1 is its minimal impact factor.

This is my C++11 solution:
int getType(char nucleotide) // 1
{
    switch (nucleotide)
    {
    case 'A':
        return 0;
    case 'C':
        return 1;
    case 'G':
        return 2;
    default:
        return 3;
    }
}

std::vector<int> solution(std::string& sequence, std::vector<int>& first, std::vector<int>& last)
{
    std::vector<std::vector<int>> parsums(4); // 2
    std::for_each(parsums.begin(), parsums.end(), [&sequence](std::vector<int>& cur)
    {
        cur.resize(sequence.size());
    });

    for (unsigned i = 0; i < sequence.size(); ++i) // 3
    {
        parsums[getType(sequence[i])][i] = 1;
    }

    for (int i = 0; i < 4; ++i) // 4
    {
        std::partial_sum(parsums[i].begin(), parsums[i].end(), parsums[i].begin());
    }

    std::vector<int> result(first.size()); // 5
    for (unsigned i = 0; i < first.size(); ++i)
    {
        int type = 3;
        for (unsigned j = 0; j < 3; ++j)
        {
            int left = first[i] > 0 ? parsums[j][first[i] - 1] : 0; // 6
            int right = parsums[j][last[i]];
            if (right != left) // 7
            {
                type = j;
                break;
            }
        }

        result[i] = type + 1; // 8
    }

    return result;
}
1. Trivial utility function to convert any nucleotide in a zero-based index.
2. I am going to put each nucleotide in its own vector. It costs quite a lot in terms of memory, but it is worth it in terms of execution time.
3. If I spot a T on position i, the element i on the T-vector is marked as found.
4. I convert each nucleotide-specific vector to its own partial sum. In this way, just by looking at the values around the first and last elements of a given subsequence, I can see if that nucleotide is there. No need for a nested loop, no quadratic complexity.
5. This is the vector I am going to give back to the caller.
6. We need to pay attention to the special case that occurs when the subsequence starts at the beginning of the sequence. Usually we need to get the value of the previous element; in this case there is no such thing.
7. If the right partial sum has changed (actually, it could only be greater or equal) we have a nucleotide of the specific type. Since I am checking from the lowest value up, as soon as I find it, I break the loop.
8. The user wants a 1-based index to identify the solution.
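For the "CAGCCTA" test case, the 'C' occurrence vector and its partial sum, worked out by hand, show how a range count falls out of two lookups (countCsInRange is a hypothetical helper of my own making):

```cpp
#include <numeric>
#include <vector>

// Count the C's in sequence[first..last] of "CAGCCTA" via partial sums.
int countCsInRange(int first, int last)
{
    std::vector<int> c { 1, 0, 0, 1, 1, 0, 0 };      // 'C' positions in "CAGCCTA"
    std::partial_sum(c.begin(), c.end(), c.begin()); // { 1, 1, 1, 2, 3, 3, 3 }
    return c[last] - (first > 0 ? c[first - 1] : 0);
}
```

For the query [2, 4] this computes c[4] - c[1] = 3 - 1 = 2: two C's in "GCC", so C is present and the answer to that query is 2.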

Go to the full post

Codility MinAvgTwoSlice

Given a vector sized 2 or more, find the index where the first subsequence of two or more elements with a minimal average value starts.

Like CountDiv, I don't get why this problem is in the Codility Prefix Sum section. To solve it we could create our sort-of partial sums vector, but this would just add space complexity without giving anything back.

I guess the most interesting point in this problem is the size of the candidate subsequences. Let's say that we have spotted a good subsequence of four elements in our vector. We could always split it and get the same or a better result from one of its subsequences; think for instance of { 1, 1, 1, 1 }. We can't always do that when the elements are just three, as this example shows: { -12, 0, -12 }.

So, that's it. We need to check for 2- and 3- sized subsequences.
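A quick numeric check of the { -12, 0, -12 } example above:

```cpp
// The triplet average beats both pair averages, so triplets
// cannot be ignored when looking for the minimal average.
bool tripletBeatsPairs()
{
    double pair1 = (-12 + 0) / 2.0;         // -6
    double pair2 = (0 + -12) / 2.0;         // -6
    double triplet = (-12 + 0 + -12) / 3.0; // -8
    return triplet < pair1 && triplet < pair2;
}
```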

Here are a few GoogleTest C++11 test cases:
TEST(MinAvgTwoSlice, Given)
{
    std::vector<int> input { 4, 2, 2, 5, 1, 5, 8 };
    ASSERT_EQ(1, solution(input));
}

TEST(MinAvgTwoSlice, Neg)
{
    std::vector<int> input { 4, 2, 2, 5, -100, 5, -100 };
    ASSERT_EQ(4, solution(input));
}

TEST(MinAvgTwoSlice, Last)
{
    std::vector<int> input { 100, 100, 1 };
    ASSERT_EQ(1, solution(input));
}
And this is my solution:
int solution(std::vector<int>& input)
{
    assert(input.size() > 1); // 1

    int index = 0;
    double minimal = std::numeric_limits<double>::max();
    for(unsigned i = 0; i < input.size() - 2; ++i) // 2
    {
        double candidate = (input[i] + input[i+1]) / 2.0; // 3
        if(candidate < minimal)
        {
            index = i;
            minimal = candidate;
        }

        candidate = (input[i] + input[i+1] + input[i+2]) / 3.0; // 4
        if(candidate < minimal)
        {
            index = i;
            minimal = candidate;
        }
    }

    double candidate = (input.back() + *(input.rbegin() + 1)) / 2.0; // 5
    if(candidate < minimal)
        return input.size() - 2;

    return index;
}
1. Problem prerequisite.
2. Looping on all the elements but the last couple.
3. Check the current couple.
4. Check the current triplet.
5. Last couple.

Go to the full post

Codility CountDiv

We are asked to write a function that tells how many integers in a given interval are divisible by a given number.

The solution is very easy, if you don't get confused by the fact that this problem is in the Codility Prefix Sum section.

You could actually devise an implementation that makes use of partial sums, but it is not the right direction. A hint is that the expected complexity is constant in both time and space.

Let's write a few test cases (C++ for GoogleTest) while thinking of a better solution:
TEST(CountDiv, Given) // 1
{
    ASSERT_EQ(3, solution(6, 11, 2));
}

TEST(CountDiv, Minimal) // 2
{
    ASSERT_EQ(1, solution(0, 1, 3));
}

TEST(CountDiv, BothSides) // 3
{
    ASSERT_EQ(7, solution(0, 12, 2));
}
1. Codility provided test case. In the interval [6, 11] there are 3 numbers divisible by 2: 6, 8, 10.
2. Remember that you can divide zero by any number, so in [0, 1] there is one number (zero) divisible by three.
3. A simple case, we want to check all the integers in [0 .. n].

And here is a possible solution:
int solution(int first, int last, int div)
{
    return (last / div) - (first / div) + !(first % div);
}
Firstly I get how many integers up to the end are divisible by div, then subtract the ones to the left. But remember to add the left limit, if it is needed.

***

You could think of saving some processor time, seeing that "div" is the divisor for both "last" and "first", and rewriting it as a single division. After all, we know that:
(A / C) - (B / C) = (A - B) / C
Right?

Yes, if we worked on floating point numbers. However, here we have integers, so we are applying the integer division operator, discarding each time the remainder, and getting (sometimes) a different result from the floating point division operator applied to the same values.

For instance consider this test case:
TEST(CountDiv, Single)
{
    ASSERT_EQ(1, solution(6, 11, 7));
}
Here we are looking for the multiples of 7 in the interval [6, 11]. It is easy to spot that the solution is one; only the value 7 should be accepted. Applying the integer division to "last" and "first", we are actually answering the question "how many values in the interval [1, X] are divisible by seven?"

In this context we can't apply the distributive property of division, because we are not working in the realm of real numbers.

We can't apply subtraction on "last" and "first" and then divide the result by 7, because this would mean that we are checking the size of the interval from "first" to "last" against 7.

If I try to be smarter than required, calculating (11 - 6) / 7, I get a zero. I'm not checking if there is a number in the interval that could be divided by 7, I'm checking how big the interval is compared to 7.
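To see the difference concretely, here are the two formulas side by side as tiny functions (the names are mine), checked on the [6, 11] / 7 example:

```cpp
// The formula used in the solution above: multiples of div up to last,
// minus the ones strictly before first.
int countDivCorrect(int first, int last, int div)
{
    return (last / div) - (first / div) + !(first % div);
}

// The tempting single-division rewrite: it measures the size of the
// interval compared to div, which answers a different question.
int countDivNaive(int first, int last, int div)
{
    return (last - first) / div;
}
```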

Go to the full post

Codility PassingCars

We are given a vector that contains just zeros and ones, each of them symbolizing a car going eastward or westward. We want to return the number of passing cars. I struggled a bit to understand the meaning of this; in the end I found out that we are required to tell how many ones follow each zero.

It helped a lot to look at the provided test case, that I ported here for C++11 and GoogleTest:
TEST(PassingCars, Given)
{
    std::vector<int> input { 0, 1, 0, 1, 1 };
    ASSERT_EQ(5, solution(input));
}
We have two zeros, one at position 0, the other at 2.
The first zero is followed by three ones, the second zero by two ones, hence we are expecting a 5 as a result.

This problem is placed in the Codility Prefix Sum section, so you would probably solve it using the STL partial_sum() function, as I also did in a previous post, when Codility did not support C++11 yet, where you could also find a more in-depth discussion.

Here I refactor to a sleeker solution (credits go to Wen-Kai) that still uses the concept of partial sum, slightly changing it (so that I can't use the STL function anymore) to adhere to the problem requirement:
int solution(std::vector<int>& input)
{
    int sum = 0; // 1
    std::for_each(input.begin(), input.end(), [&sum](int& cur){ // 2
        if(cur == 0) // 3
            ++sum;
        else // 4
            cur = sum;
    });

    unsigned couples = std::accumulate(input.begin(), input.end(), 0U); // 5
    return couples > 1000000000 ? -1 : couples; // 6
}
1. Let's count the number of zeros we see.
2. Loop on all the input items. Notice that I want my lambda function to capture the sum value and to get the current element in the vector as a parameter, both by reference, so that I can change them. Modifying an input parameter passed by reference is not considered good style, since it could be surprising for the caller. However, it is a common approach in the Codility world, for efficiency reasons.
3. If the current value is a zero, increase the counter.
4. Otherwise cover the current value (that was a 1) with the number of previously seen zeros.
5. Now we just have to sum all the previously generated partial sums to get the expected result. I should put the value in an unsigned int because a plain int could lead to overflow, given the possible input - in the original post I gave more details on this point and I also provided a test case to ensure my code works right also in extreme situations.
6. The codility problem asks to return -1 when the number of couples is over the one billion threshold.
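The loop above can be restated as a standalone sketch (the one-billion cap of note 6 is left out for brevity) and traced on the given test: { 0, 1, 0, 1, 1 } is rewritten in place as { 0, 1, 0, 2, 2 }, whose sum, 5, is the number of passing couples.

```cpp
#include <numeric>
#include <vector>

// Standalone restatement of the solution's loop, without the cap check.
int passingCars(std::vector<int> input) // copy: the caller keeps its data
{
    int zeros = 0;
    for (int& cur : input)
    {
        if (cur == 0)
            ++zeros;     // one more eastward car
        else
            cur = zeros; // this westward car passes every zero seen so far
    }
    return std::accumulate(input.begin(), input.end(), 0);
}
```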

Go to the full post

Codility Mushroom Picker

We have a street with mushrooms at every place, represented by a non-empty and potentially big (one hundred thousand elements) vector of relatively small unsigned integers (say, less than one thousand). We are also given the initial position of a picker, and its movement range. We want to maximize the number of collected mushrooms.

This problem is proposed as an example in the Codility Prefix Sum section. I would argue it is not the best choice, since the prefix sum part of the problem is hidden by the complexity of determining how the picker moves in the array. Still, I had some good fun solving it.

Test cases

There is no codility check for this problem, so I wrote an awful lot of test cases, trying to cover all the possible cases that came to my mind. I spare you the pain of seeing them all, still I suggest you spend some time thinking about them. Here is the Codility given one (C++11, GoogleTest):
int solution(const std::vector<unsigned>& input, unsigned pos, unsigned moves);

TEST(MushroomPicker, Given)
{
    EXPECT_EQ(25, solution({ 2, 3, 7, 5, 1, 3, 9 }, 4, 6));
}
I have seven mushroom places, the picker starts its trip at position 4, and it could move up to 6 consecutive places. Its best strategy would be to start moving to the left down to the third cell, then move to the right till the end of the array. In this way it would pick up 25 mushrooms.

An inefficient solution

Let's not care for the moment about performance. Try to focus instead on which intervals we have to check.
Theoretically, we could change direction any time we like; however, it is easy to see that we should not change it more than once.

My idea is to start moving the picker to the left for a while, then change its direction and move it to the right.
I simply won't care about doing the same starting to the right, counting on the symmetry of the problem:
int solution(const std::vector<unsigned>& input, unsigned pos, unsigned moves)
{
    assert(!input.empty()); // 1
    assert(pos < input.size());

    unsigned result = 0; // 2

    unsigned maxLeftShift = std::min(pos, moves); // 3
    for(unsigned curLeftShift = 0; curLeftShift <= maxLeftShift; ++curLeftShift) // 4
    {
        unsigned first = pos - curLeftShift; // 5
        unsigned rightShift = std::min(pos + (moves - curLeftShift * 2), unsigned(input.size() - 1)); // 6
        unsigned last = std::max(pos, rightShift); // 7

        unsigned tentative = 0;
        for(unsigned i = first; i <= last; ++i) // 8
        {
            tentative += input[i];
        }
        result = std::max(result, tentative); // 9
    }

    return result;
}
1. Ensuring requisites as stated in the problem.
2. Currently fetched mushrooms.
3. I check the initial picker position and the number of moves available to get the maximum number of moves to the left. I want to be sure I won't break the vector boundary.
4. The left shift would range from zero (no shift at all, go directly to the right) up to the maximum value calculated in (3).
5. The current sub-interval starts at the initial picker position opportunely shifted to the left.
6. This is a bit tricky. Once we have moved to the left, we should determine which movement we are still allowed to do on the right. Getting back to the original position would cost the left shift doubled. And we have also to ensure that we won't break the right vector boundary.
7. And (6) is not enough. We have to consider also that our sub-interval should have at least its right limit in the initial picker position.
8. Given all the above, now it is just a matter of picking up all the mushrooms in the [first, last] interval.
9. If the current sub-interval is richer than the previous ones, store it as a temporary solution.

I have spent a large part of the time writing lines (6) and (7); the rest of the code shouldn't be a problem.
Performance-wise, this implementation has an obvious problem, represented by the (8) loop.

Linear solution

Instead of recalculating each time the mushroom sum for a specific sub-interval, we could calculate the partial sum for the complete interval once and for all, outside the main for-loop.

I have already talked about partial sum, and its C++ STL implementation, in a previous post. Please, have a look at that post for more info.

It is not difficult to refactor the previous solution using partial sum (some code cleanup as bonus):
int solution(const std::vector<unsigned>& input, unsigned pos, unsigned moves)
{
    assert(!input.empty());
    assert(pos < input.size());

    std::vector<unsigned> parSum(input.size()); // 1
    std::partial_sum(input.begin(), input.end(), parSum.begin());

    unsigned result = 0;
    for(unsigned i = 0; i <= std::min(pos, moves); ++i)
    {
        unsigned first = pos - i;
        unsigned last = std::max(pos, std::min(pos + (moves - i * 2), unsigned(input.size() - 1)));
        result = std::max(result, first ? parSum[last] - parSum[first - 1] : parSum[last]); // 2
    }

    return result;
}
1. Create a partial sum from the input. We should expect this sort of transformation:
{2, 3, 7, 5, 1, 3, 9} -> {2, 5, 12, 17, 18, 21, 30}
2. If our sub-interval starts on the first vector element, we want to check simply the partial sum for its last element. Otherwise, we want subtract from it the partial sum for the elements on the left of the sub-interval.
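The O(1) interval sum at the heart of the refactoring can be isolated in a small helper (a sketch; intervalSum is a hypothetical name of mine):

```cpp
#include <numeric>
#include <vector>

// Sum over input[first..last] in one step, via the partial sum trick
// described above: parSum[last] minus everything before first.
unsigned intervalSum(const std::vector<unsigned>& input,
                     unsigned first, unsigned last)
{
    std::vector<unsigned> parSum(input.size());
    std::partial_sum(input.begin(), input.end(), parSum.begin());
    return first ? parSum[last] - parSum[first - 1] : parSum[last];
}
```

On the given test data, { 2, 3, 7, 5, 1, 3, 9 } becomes { 2, 5, 12, 17, 18, 21, 30 }, and the best interval [2, 6] costs just parSum[6] - parSum[1] = 30 - 5 = 25.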

Go to the full post

Codility MaxCounters

This problem, that you can find in the Codility test Counting Elements section, tagged as "respectable", requires us to implement a simple finite state automaton. We are going to pass to our function the number of elements we want to work with, and a series of operation we want to apply to them. We should get back the elements final state.

There are just two operations, increase-a-specified-value and level-to-the-top. To invoke the first one we specify the 1-based index of the element we want to modify; the second one is invoked when N+1 is found. We can safely assume that no other integer would be passed, or we can consider them as no-ops.

Here is a C++11 GTest test case, based on a case provided by codility:
TEST(MaxCounters, Given)
{
    const int size = 5; // 1
    std::vector<int> input { 3, 4, 4, 6, 1, 4, 4 }; // 2
    std::vector<int> output = solution(size, input);

    ASSERT_EQ(size, output.size());
    EXPECT_EQ(3, output[0]);
    EXPECT_EQ(2, output[1]);
    EXPECT_EQ(2, output[2]);
    EXPECT_EQ(4, output[3]);
    EXPECT_EQ(2, output[4]);
}
1. We want to operate on a five integer vector.
2. This list of operations translates as:
- increase the third element (from 0 to 1)
- increase the fourth element (to 1)
- increase the fourth element (to 2)
- equalize all the elements to the top value (2)
- increase the first element (to 3)
- increase the fourth element (to 3)
- increase the fourth element (to 4)
The resulting vector should be { 3, 2, 2, 4, 2 }.

I have already solved this problem some time ago; please have a look at that post if you want more information and also a first inefficient solution. This time I aimed directly at the 100% solution, and I have written C++11 code - at that time codility did not support it.

As I found out comparing the results afterward, differences are minimal. Basically, just the use of a couple of lambda functions in for_each() loops, instead of old plain for loops:
std::vector<int> solution(int n, std::vector<int>& input)
{
    std::vector<int> result(n); // 1
    int highest = 0; // 2
    int lowest = 0;

    std::for_each(input.begin(), input.end(), [&](int cur){ // 3
        if(static_cast<unsigned>(cur) > result.size()) // 4
        {
            lowest = highest;
        }
        else // 5
        {
            int index = cur - 1; // 6
            result[index] = std::max(result[index], lowest) + 1; // 7
            if(result[index] > highest) // 8
                highest = result[index];
        }
    });

    std::for_each(result.begin(), result.end(), [lowest](int& cur){ // 9
        if(cur < lowest)
            cur = lowest;
    });

    return result;
}
1. This is what the caller wants me to generate.
2. Cache the current highest and lowest values in the output vector. This is going to save a lot of computation time. And it comes quite cheap too.
3. Loop on all the operations specified by the caller.
4. I rely on the input data correctness, guaranteed by codility. In my previous implementation I have been more paranoid, and implemented a no-op strategy in case of unexpected values. Anyway. If the current value is not a valid 1-based index, I assume a top leveling is required. Instead of applying the variation to the actual data, I simply state that the current lowest value is just like the highest one.
5. I have been requested to increase a specific value.
6. For sake of readability, I introduced this local variable, representing the actual 0-based index in my vector.
7. Remember about (4). Possibly the current value in that index is not the real one, I have to compare it against the lowest variable to ensure that I get it right. Then increase it.
8. I should ensure the highest cached value is still valid, so that next time I could do the (4) trick.
9. Before returning to the caller, I should ensure any value in the vector is not less than the lowest cached value.
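The lazy leveling of notes (4) and (8) can be checked against the given test case with a standalone restatement of the same algorithm:

```cpp
#include <algorithm>
#include <vector>

// Same algorithm as the solution above, restated for a quick check.
std::vector<int> maxCounters(int n, const std::vector<int>& ops)
{
    std::vector<int> result(n);
    int highest = 0, lowest = 0;
    for (int cur : ops)
    {
        if (cur > n)
            lowest = highest; // lazy leveling: no element is touched yet
        else
        {
            int i = cur - 1;
            result[i] = std::max(result[i], lowest) + 1;
            highest = std::max(highest, result[i]);
        }
    }
    for (int& cur : result)
        cur = std::max(cur, lowest); // apply any pending leveling
    return result;
}
```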

Go to the full post

Codility MissingInteger

Given a non-empty integer vector, return the minimal positive number not seen in the input. You could find this problem in the Codility test Counting Elements section. It is marked as "respectable", even though it doesn't seem more complicated than an average "painless" one.

To clarify what we are looking for, here is a bunch of C++11 GTest test cases:
TEST(MissingInteger, One) // 1
{
    std::vector<int> input { 1 };
    ASSERT_EQ(2, solution(input));
}

TEST(MissingInteger, Neg) // 2
{
    std::vector<int> input { -3 };
    ASSERT_EQ(1, solution(input));
}

TEST(MissingInteger, Given) // 3
{
    std::vector<int> input { 1, 3, 6, 4, 1, 2 };
    ASSERT_EQ(5, solution(input));
}

TEST(MissingInteger, Sequence) // 4
{
    std::vector<int> input { 1, 2, 3, 4, 5, 6, 7 };
    ASSERT_EQ(8, solution(input));
}
1. Only one integer in input.
2. We could have negative integers in input.
3. Example provided by Codility.
4. Natural sequence.

An intuitive solution would be checking all the positive numbers in [1..n], where n is the vector size, and returning as soon as we find a missing value. But think of test 4. It is the worst case scenario. I'd check all the elements, see that none of them is missing, and so return n+1, with an n-squared complexity.

To achieve a time-linear solution, we should use some extra memory. We'll scan the input vector just once, setting a flag for each positive number found. Then I'll search for the first unset flag, and that's it:
int solution(std::vector<int>& input)
{
    std::vector<bool> flags(input.size()); // 1

    std::for_each(input.begin(), input.end(), [&flags](int cur){ // 2
        if(cur > 0 && cur <= static_cast<int>(flags.size())) // 3
            flags[cur - 1] = true;
    });

    return std::find(flags.begin(), flags.end(), false) - flags.begin() + 1; // 4
}
1. A flag for each expected (in the worst case) positive integer in input.
2. Loop on all the input data
3. If the current value is positive and in the expected range, convert it to a valid index for the flags buffer, and mark that element as seen.
4. Find the index of the first missing element (remember that the STL find() function returns the end() iterator in case it doesn't find anything, and this behavior is just what we want here), convert it back to the related value and return it.
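The find() step of note (4), checked by hand on the flags produced by the given test { 1, 3, 6, 4, 1, 2 }: values 1..4 and 6 get flagged, and 5 is the first hole (firstMissing is a name of my own making):

```cpp
#include <algorithm>
#include <vector>

// std::find() locates the first unset flag; index + 1 is the answer.
int firstMissing()
{
    // flags for { 1, 3, 6, 4, 1, 2 }: only the slot for value 5 is unset
    std::vector<bool> flags { true, true, true, true, false, true };
    return std::find(flags.begin(), flags.end(), false) - flags.begin() + 1;
}
```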

Go to the full post

Codility PermCheck

We have in input an integer vector, and we want to check if it is a permutation of the natural sequence [1..n], where n is the vector size.
You could find this problem in the Codility test Counting Elements section.

I have already written a post about it, where I provide some test cases and a few different C++ solutions. Some time has passed, and I am solving the problems again from scratch, this time using C++11, which in the meantime has become supported by Codility. Please have a look at the old post for more details.

Here is my current solution:
int solution(std::vector<int>& input)
{
    std::vector<bool> flags(input.size()); // 1

    for(auto it = input.begin(); it != input.end(); ++it) // 2
    {
        unsigned index = *it - 1; // 3
        if(index >= input.size() || flags[index]) // 4
            return 0;
        flags[index] = true;
    }

    return 1; // 5
//    return std::find(flags.begin(), flags.end(), false) == flags.end();
}
1. The idea is to use a buffer where I keep track of all the expected elements. Initially no one has been seen, once I find a number I flag it.
2. Loop on all the elements.
3. Convert the current value in its associated zero-based vector index.
4. If the index is outside the expected range, or if the flag has already been set for this value, the sequence is not a permutation. Return the required failure value.
5. If I get here, the sequence has been checked and found OK, return the success value. Actually, this time too I didn't notice that, and I originally wrote the next line, now commented out, to explicitly check that every number was found. This is useless, since I have already checked in (4) all possible cases.

Go to the full post

Codility FrogRiverOne

This problem is described in the Codility web site, Counting Elements section.
We are asked to check if the integer vector passed in input contains all the values in the natural sequence from 1 up to a passed number.

I have already given a couple of C++ solutions (a Big-Oh-Squared and a linear one) in a previous post. Here I give the C++ solution that jumped to my mind now, and it looks somewhat cleaner to me.

First of all, a couple of test cases (based on the GTest framework) that should clarify the problem:
TEST(FrogRiverOne, Given)
{
    std::vector<int> input { 1, 3, 1, 4, 2, 3, 5, 4 }; // 1
    ASSERT_EQ(6, solution(5, input));
}

TEST(FrogRiverOne, Cannot)
{
    std::vector<int> input { 1, 3, 1, 4, 2, 3, 5, 4 }; // 2
    ASSERT_EQ(-1, solution(6, input));
}
1. The Codility given test case. Given this vector, we want to get the index of the first element so that any value in [1 .. 5] is available. That means 6, given that 1 is at position 0 (and 2), 2 at 4, 3 at 1 (and 5), 4 at 3, and finally 5 at 6.
2. A negative case, 6 is not in the passed vector, so we should return -1.

Here is my latest solution:
int solution(int x, std::vector<int>& input)
{
    assert(x > 0); // 1
    assert(!input.empty());

    std::vector<bool> flags(x); // 2
    for(unsigned i = 0; i < input.size(); ++i) // 3
    {
        unsigned index = input[i] - 1; // 4
        if(!flags[index]) // 5
        {
            flags[index] = true;
            if(--x == 0)
                return i; // 6
        }
    }

    return -1; // 7
}
1. Enforce the most sensitive problem prerequisites. This is not requested by Codility, which ensures that only "clean" input data is provided.
2. When I see a number in [1..x] I mark it as found. Here I store the flags for this purpose.
3. Loop on all the vector input elements.
4. Convert the current input value so that I can use it as an index for the flags vector. Production code should ensure that the index is not outside the range of valid values.
5. If I see this value for the first time, I mark it as found and decrease the number of missing values I am still looking for.
6. When I have found all the expected elements, I return the current index in the input vector.
7. Otherwise not-found is returned.

Go to the full post

Codility FrogJmp

You could find the full description of this problem, and check your solution in (possibly) your favorite programming language, on the Codility web site, Time Complexity section. To put it simply, we have to calculate a distance, and determine how many segments of a given size we need to cover it.

It is marked as "painless", even if "embarrassingly simple" would probably be a better description. Still, even in this case a mistake could be lurking somewhere, ready to catch us off guard. Here the issue is that we need to consider two cases, and one of them could be overlooked. Here are a couple of test cases (C++, GoogleTest) that clarify the matter:
TEST(FrogJmp, Given) // 1
{
    ASSERT_EQ(3, solution(10, 85, 30));
}

TEST(FrogJmp, Precise) // 2
{
    ASSERT_EQ(4, solution(10, 90, 20));
}
1. A frog starts its trip on 10, wants to reach 85, and its jump is sized 30. Obviously it needs three steps to get there.
2. A weaker frog on a longer path. In four steps it gets exactly to its end point.

Here is how I implemented my solution:
int solution(int x, int y, int d)
{
    assert(x < y); // 1
    assert(d > 0);

    int distance = y - x; // 2
    int steps = distance / d; // 3
    if(distance % d != 0) // 4
        ++steps;
    return steps;
}
1. Assuming no tricky input. The frog starts on the left and goes to the right.
2. This is the distance it has to travel.
3. The solution is just a matter of dividing the distance for the step size ...
4. ... plus one, in case our frog is not lucky enough to drop right on its path end.

Go to the full post

Codility Tape Equilibrium

The Codility Tape Equilibrium problem is marked as "painless" in their Time Complexity section. Basically, we have a vector in input, containing at least two items, and we want to split it in a way that minimizes the difference between the two sections. Our function should return such a minimal value.

A few test cases would make clearer what we are looking for:

TEST(TapeEquilibrium, Given) // 1
{
    std::vector<int> input { 3, 1, 2, 4, 3 };
    ASSERT_EQ(1, solution(input));
}

TEST(TapeEquilibrium, Couple)
{
    std::vector<int> input { 3, 3 };
    ASSERT_EQ(0, solution(input));
}

TEST(TapeEquilibrium, Couple2)
{
    std::vector<int> input { 1000, -1000 };
    ASSERT_EQ(2000, solution(input));
}

TEST(TapeEquilibrium, HundredK) // 2
{
    std::vector<int> input(100000, 1000);

    EXPECT_EQ(0, solution(input));
}

TEST(TapeEquilibrium, HundredKMinusOne)
{
    std::vector<int> input(99999, 1000);

    EXPECT_EQ(1000, solution(input));
}
1. This is the test case provided by Codility. It should be easy to see how we minimize the difference by splitting the input in {3, 1, 2} and {4, 3}. The expected output of our function is one.
2. A hundred thousand items, each of them set to one thousand. A lot of work for our function, but it is easy to see that the solution should be zero.

If you wonder, these are C++11 tests written for the GoogleTest xUnit framework. If you are using an older C++ compiler, you could have a look at a previous version of this post, which also shows a bad, O(N**2), alternative solution.

My linear time solution starts by creating two partial sums, left and right, where initially the left one contains only the leftmost element and the right one all the others. Then I remove one element at a time from the right sum and add it to the left one, repeating the check each time to see if I have found a better solution.

Here is how I implemented it:
int solution(std::vector<int>& input) // 1
{
    assert(input.size() > 1); // 2

    int left = input.front(); // 3
    int right = std::accumulate(input.begin() + 1, input.end(), 0); // 4
    int result = std::abs(left - right); // 5

    std::for_each(input.begin() + 1, input.end() - 1, [&](int cur){ // 6
        left += cur;
        right -= cur;
        int tentative = std::abs(left - right);
        if(tentative < result) // 7
            result = tentative;
    });

    return result;
}
1. Actually, the input vector could and should be passed as a const reference. Codility requirements are sometimes a bit sloppy.
2. Not required by the problem, however I felt bad assuming such a prerequisite without enforcing it in the code in some way.
3. Initializing the left sum is easy.
4. The right sum requires a bit more work. To keep the code more readable I used the handy STL accumulate function.
5. Initialize the function result. Notice that we are interested in the difference between left and right without caring about its sign.
6. A nice lambda function in a for_each loop keeps the code compact and, hopefully, readable. I start the loop from the second element and stop one before the last, since neither section should be empty.
7. This is a better candidate, save it.

Go to the full post

Shortest path by breadth-first search

Given an undirected unweighted graph that has no loop, we can use a basic breadth-first search algorithm to determine the shortest path from a specified vertex to any (other) one in the graph.

In the previous post I showed a way to store a graph in a C++11 STL container (a vector of forward list of unsigned int). This is the Graph class you are going to see in the code below. Besides, Vertex is a typedef for unsigned int, meaning that we identify a vertex simply by an unsigned number, starting from zero.

This is the class I want to develop:
class BreadthFirstSearch
{
private:
    std::vector<Vertex> parents_; // 1
    Vertex start_; // 2
public:
    BreadthFirstSearch(const Graph& graph, Vertex start); // 3
    std::vector<Vertex> path(Vertex end); // 4
    std::vector<Vertex> backpath(Vertex end); // 5
    void printPath(Vertex end); // 6
};
1. As a result of the breadth-first search algorithm, I want this vector to hold, for each vertex in the associated graph, its parent on the shortest path from the start vertex.
2. This is the vertex that I want to use as starting point for BFS algorithm.
3. The ctor gets in input a Graph and one of its vertices, and fills the parents_ vector as expected.
4. This method returns the path from the start vertex, as specified by the ctor, to the passed end one.
5. Not strictly a necessity, it could be a private method. It provides the shortest path in reversed order, from end to start.
6. Utility method, dump to standard output a shortest path.

The BreadthFirstSearch constructor implements the BFS algorithm in this way:
BreadthFirstSearch::BreadthFirstSearch(const Graph& graph, Vertex start) :
        parents_(graph.vertices_.size(), std::numeric_limits<Vertex>::max()), start_(start) // 1
{
    if (start >= graph.vertices_.size()) // 2
        return;

    std::vector<bool> seen(graph.vertices_.size()); // 3
    std::queue<Vertex> queue; // 4

    queue.push(start); // 5
    seen[start] = true;
    while (!queue.empty()) // 6
    {
        Vertex vertex = queue.front();
        queue.pop();

        for (auto it = graph.vertices_[vertex].begin(); it != graph.vertices_[vertex].end(); ++it) // 7
        {
            Vertex next = *it;
            if (!seen[next]) // 8
            {
                queue.push(next);
                seen[next] = true;
                parents_[next] = vertex;
            }
        }
    }
}
1. Initially, the parents_ vector contains just "no parent" elements. I used the largest value available for a Vertex to represent such state.
2. Even though this code is not production ready, I couldn't help putting at least minimal error handling in it. The vertex passed as starting point should be an actual Graph element.
3. This vector keeps track of all the vertices that have already been checked in a previous step of the algorithm. Initially none has been, so we can rely on the default behavior of the vector ctor, which sets all its elements to false.
4. On this queue I'll put all the vertices that are connected to the one currently being checked.
5. Let's put the control variables in the initial condition. The start vertex is enqueued, and it is marked as seen.
6. Loop until all the elements in the queue are processed.
7. Loop on all the vertices connected to the current one.
8. If this vertex has not already been processed, push it in the queue, mark it as seen, and record the current vertex as its parent.

The BreadthFirstSearch ctor has filled the parents_ vector, now we can use it to create the shortest path from the start vertex to a specific one:
std::vector<Vertex> BreadthFirstSearch::path(Vertex end)
{
    std::vector<Vertex> backtrace = backpath(end); // 1
    return std::vector<Vertex>(backtrace.rbegin(), backtrace.rend()); // 2
}
1. Actually, the real job is delegated to backpath().
2. Since backpath() returns a reversed path, the only real task of this method is reversing the vector to return a "straight" one.

If you want the path to be stored in a vector, you will find that it is in the nature of the problem to generate a reversed solution. Or maybe you could use a deque, filling it from the front. Anyway, this is my solution:
std::vector<Vertex> BreadthFirstSearch::backpath(Vertex end)
{
    std::vector<Vertex> backtrace; // 1
    if (end >= parents_.size()) // 2
        return backtrace;

    for (Vertex cur = end; cur != std::numeric_limits<Vertex>::max(); cur = parents_[cur])
        backtrace.push_back(cur); // 3
    return backtrace;
}
1. I am going to backtrace from the passed graph vertex up to the starting one, pushing each parent into this vector.
2. Better to ensure the caller won't pass a nonexistent vertex.
3. Each ancestor is pushed back in the vector, until the "no parent" element is found.

The utility method that dumps the path to standard output shows how to use backpath():
void BreadthFirstSearch::printPath(Vertex end)
{
    std::vector<Vertex> vxs = backpath(end);
    std::copy(vxs.rbegin(), vxs.rend(), std::ostream_iterator<Vertex>(std::cout, " "));
    std::cout << std::endl;
}
Here is a piece of code that uses my BreadthFirstSearch class:
BreadthFirstSearch bfs(graph, 0); // 1
bfs.printPath(3); // 2
1. I pass to bfs the graph object as created in the previous post example, and I specify zero as the starting vertex.
2. Ask to print the shortest path from zero to three.

The expected result is:
0 4 3 

Go to the full post

Graph by adjacency list

If you need to work with graphs in your C++11 code, you'd usually rely on someone else's work, like the Boost Graph Library, friendly known as BGL. Sometimes it happens that you simply can't, and you have to work it out by yourself. Here I am writing a trivial Graph class that lets me store an undirected unweighted graph in a compact form.

I have a simple graph like the one shown in the picture. Each vertex is represented by an unsigned integer starting from zero, which helps me to keep the code even simpler. Edges have neither weight nor direction, so we can move from vertex 0 to vertex 5 and vice versa, and we are not interested in the cost of moving from one vertex to another. We only want to know if we can actually go from here to there.

The two common ways to represent a graph differ by using a matrix or a list to store the adjacency of each vertex. As often happens, you should know the actual problem you are tackling to decide which data structure suits you better. Still, the list is usually the primary suspect.

In this first implementation, my class Graph provides only a constructor to set it up and a print method to show what it has in its belly. The main focus here is about showing how the data is stored in it.
using Vertex = unsigned int; // 1
using Edge = std::pair<Vertex, Vertex>; // 2
using Edges = std::vector<Edge>; // 3
using Vertices = std::forward_list<Vertex>; // 4

class Graph
{
public:
    std::vector<Vertices> vertices_; // 5

    Graph(int nv, Edges edges) : vertices_(nv) // 6
    {
        std::for_each(edges.begin(), edges.end(), [this](const Edge& edge) // 7
        {
            if(edge.first < vertices_.size() && edge.second < vertices_.size()) // 8
            {
                vertices_[edge.first].push_front(edge.second); // 9
                vertices_[edge.second].push_front(edge.first);
            }
        });
    }

    void print() // 10
    {
        for(Vertex i = 0; i < vertices_.size(); ++i)
        {
            std::cout << i << ": ";
            std::copy(vertices_[i].begin(), vertices_[i].end(), std::ostream_iterator<Vertex>(std::cout, " "));
            std::cout << std::endl;
        }
    }
};
1. Each vertex is represented by an unsigned integer, starting from zero.
2. An edge is defined by the two vertices delimiting it.
3. I want to pass all the edges in my graph to the class constructor. This is the collection I am going to use for this task.
4. Any vertex in the graph has an associated collection of vertices, all the ones to which it is connected. The cheap C++11 forward_list suffices for this job.
5. A graph is a collection of Vertices. Each element in the vector is an actual vertex of the graph and the associated Vertices keeps track of all the connected vertices.
6. The Graph constructor requires as input the number of vertices in the graph, and all the edges on it. The data member vertices_ is initialized as a collection of empty Vertices.
7. Loop on all the passed edges to associate each vertex in the graph with its connections.
8. A real piece of code should have more effective error handling than this. Here I just discard any wrong edge; it would make sense to let the user know that something went wrong.
9. Being the graph undirected, any edge creates two relations.
10. Utility method, just to show that anything worked as expected (hopefully).

Here is how my Graph class is used:
Edges edges { { 0, 1 }, { 0, 4 }, { 0, 5 }, { 1, 2 }, { 1, 4 }, { 2, 3 }, { 3, 4 } };
Graph graph(6, edges);
graph.print();
The expected output:
0: 5 4 1 
1: 4 2 0 
2: 3 1 
3: 4 2 
4: 3 1 0 
5: 0 

Go to the full post

Greedy algorithm for activity selection

A typical example of a problem that has an optimal solution via a greedy algorithm is the activity selection one; here is its description on wikipedia. In a few words, we have a bunch of activities, identified by a start and end time, and we want to find a maximum selection of non-conflicting elements.

A couple of test cases (written in C++11 for GoogleTest) should clarify the problem:
typedef std::pair<int, int> Activity;
typedef std::vector<Activity> Activities;

TEST(ActSel, Simple)
{
  Activities input { {1, 2}, {5, 9}, {0, 6}, {8, 9}, {3, 4}, {5, 7} };

  Activities output = selectMax(input);
  ASSERT_EQ(4, output.size());
  for(unsigned i = 1; i < output.size(); ++i)
    ASSERT_LE(output[i-1].second, output[i].first);
}

TEST(ActSel, Simple2)
{
  Activities input { {1, 4}, {3, 5}, {0, 6}, {3, 9}, {5, 9}, {5, 7}, {6, 10}, {8, 11}, {8, 12}, {2, 14}, {12, 16} };

  Activities output = selectMax(input);
  ASSERT_EQ(4, output.size());
  for(unsigned i = 1; i < output.size(); ++i)
    ASSERT_LE(output[i-1].second, output[i].first);
}
In both cases I expect a selection of four Activity objects in output. In the first case these elements: (1, 2) (3, 4) (5, 7) (8, 9), in the second one (1, 4) (5, 7) (8, 12) (12, 16), or maybe (8, 11) instead of (8, 12). As you can see, there could be more than one solution, and the problem doesn't require you to be particularly choosy. Once you maximize the number of selected items, the actual value of each of them is not an issue.

Still, I want to ensure in my test cases that I pick a valid solution, so I check, through ASSERT_LE, that all the elements in the extracted sequence are ordered as expected.

As said above, this problem has a greedy optimal solution. What we have to do is just sorting the input elements by their second component (the end time), and then greedily accepting all the elements we can. As in this implementation:
Activities selectMax(Activities& input) // 1
{
  std::sort(input.begin(), input.end(), [](Activity a, Activity b) { return a.second < b.second; }); // 2

  Activities output;
  output.push_back(input[0]); // 3

  for(unsigned i = 0, j = 1; j < input.size(); ++j) // 4
  {
    if(input[j].first >= input[i].second) // 5
    {
      output.push_back(input[j]);
      i = j;
    }
  }

  return output;
}
1. We don't mind if this function modifies the input parameter, so it is passed as a non-constant reference. Beware that this should be known and accepted by the callers.
2. The "normal" STL sort() function would order the passed sequence by its first component, so we need to use the overloaded version that lets us pass a predicate to be used as comparator. Using a C++11 lambda function, as shown here, makes it simple and elegant.
3. The first element is always selected.
4. Now we are ready to loop on all the other elements in the sequence. The real looping variable is j, while i is used to keep track of the last accepted element.
5. The first element after the last accepted one that starts no earlier than its end is pushed into the output sequence.

Go to the full post

Rod cutting by dynamic programming

A typical problem well suited to show how dynamic programming works. We have a rod sized up to, let's say, 10. We can freely cut it in integer-sized pieces to sell them at the best price. Given a price table, find out the way to get the most from it.

Here is a C++11 test case for GoogleTest that should clarify the requirements:
typedef std::vector<int> Vector;

unsigned cutRod(const Vector& price, unsigned size);

TEST(CutRod, Simple)
{
  Vector price { 1, 5, 8, 9, 10, 17, 17, 20, 24, 30 };

  ASSERT_EQ(30, cutRod(price, 10));
  ASSERT_EQ(25, cutRod(price, 9));
  ASSERT_EQ(18, cutRod(price, 7));
  ASSERT_EQ(10, cutRod(price, 4));
}
Given that price list, we see immediately that if we have in input a rod sized up to 3, the best strategy is selling it in a single piece.
But if we have a rod sized four, selling it untouched we'll get 9. Better to split it in two rods both sized two, which gives us 5 + 5 = 10.
Similarly, a rod sized 7 is priced 17. If we split it in two parts sized 6 and 1, we'll get 17 + 1 = 18.

Brute force

We may think to apply a recursive approach to this problem, checking all the possible cut combinations we can think of. It is very easy to write the relative code, but we can't expect it to scale well:
unsigned cutRod(const Vector& price, unsigned size)
{
  unsigned result = 0;
  for(unsigned i = 0; i < size; ++i)
    result = std::max(result, price[i] + cutRod(price, size - (i+1)));

  return result;
}
It is just a matter of recursively calling our function, reducing each time the size of the rod we are considering. Each time we compare the partial result with the one we have previously stored, keeping just the best one.

Top-down dynamic programming

One obvious problem with the previous solution is that we solve the same sub-problems again and again. We could save a lot of running time by storing their results in a buffer. This simple but effective idea is the basis of the dynamic programming technique.

In this context, the bargain of using space to avoid spending time repeating the same task to get a partial result is called memoization (as in keeping a memo).

Here is a possible top-down implementation, very close to the naive version seen above:
unsigned cutRod(const Vector& price, unsigned size)
{
  Vector memo(size + 1, -1);
  memo[0] = 0;

  return memoCutRod(price, size, memo);
}
Here cutRod() just creates a memo vector that stores the value for each sub-problem as soon as we get its result. Then it starts the recursion by calling a support function.

Notice that the memo buffer has one element more than the price list. This is for storing also the value of the dummy cut sized zero. It is not a strict necessity, since we know that it costs nothing, but it helps to keep our code cleaner.
unsigned memoCutRod(const Vector& price, unsigned size, Vector& memo)
{
  if(memo[size] >= 0) // 1
    return memo[size];

  unsigned result = 0; // 2
  for(unsigned i = 0; i < size; ++i)
    result = std::max(result, price[i] + memoCutRod(price, size - (i+1), memo));

  return memo[size] = result; // 3
}
1. If the relative memo element is not negative, we have already calculated it. Job already done.
2. Otherwise we calculate the best price as seen before.
3. And we set a memo before returning it.

Bottom-up approach

Again dynamic programming, still using memoization as we have just seen, but starting from the bottom of the problem and crawling up to its top. In this case the implementation is even simpler, and it spares us the pain and cost of recursion:
unsigned cutRod(const Vector& price, unsigned size)
{
  Vector memo(size + 1); // 1
  for(unsigned i = 1; i <= size; ++i) // 2
  {
    int value = -1; // 3
    for(unsigned j = 0; j < i; ++j) // 4
      value = std::max(value, price[j] + memo[i-j-1]);
    memo[i] = value;
  }

  return memo.back(); // 5
}
1. As in the top-down approach, we get an extra element in the memo vector, just to keep the code simpler. But this time we don't need to initialize it to a "bad" value, because we are setting it up iteratively starting from the beginning.
2. The first element in memo is already set to its expected value (that is, zero) as a courtesy of the vector constructor. We need to calculate all the other elements, up to the rightmost one.
3. Initialize the current memo value to less than the minimum acceptable value (meaning, less than zero).
4. Basically it is the same loop we have seen in the previous implementations, but here we explicitly go for the smaller element first.
5. End of the story, the answer is stored in the rightmost memo element.

Check on github for full C++11 code.

Go to the full post

Quicksort

Quicksort is known to be a fast O(N lg N) divide and conquer sorting algorithm in its average behavior. Still, we have to pay attention to the worst case scenario, which brings it to an O(N ** 2) time cost.

The idea is to repetitively partition the data collection, splitting it in two parts so that a chosen pivot is equal to or greater than the values in its left partition, and then call the quicksorting procedure again, until there is nothing left to sort. As one could easily spot, it is a possible bad choice of the pivot that could lead to poor performance.

The resulting code should be something like this:
void quicksort(std::vector<int>& data, int left, int right) // 1
{
  if(left < right) // 2
  {
    int pivot = partition(data, left, right); // 3
    quicksort(data, left, pivot - 1); // 4
    quicksort(data, pivot + 1, right);
  }
}
1. The function requires in input the collection on which it should operate and the indexes of its leftmost and rightmost elements.
2. Check if the interval is not empty.
3. Split the original interval in two parts. On the left side we have all the values less than or equal to the pivot value.
4. Call quicksort again on the left and right partitions. Notice that the pivot element is already in its final place, and doesn't need to be considered anymore.

We just need to partition a collection as expected:
int partition(std::vector<int>& data, int left, int right)
{
  int pivot = data[right]; // 1
  int index = left - 1; // 2

  for(int i = left; i < right; ++i) // 3
  {
    if(data[i] <= pivot) // 4
      std::swap(data[++index], data[i]);
  }

  std::swap(data[++index], data[right]); // 5
  return index;
}
1. OK, this doesn't look smart. As pivot we always select the rightmost element in the interval.
2. Initialize index to the first-before-beginning position in the interval.
3. Loop on all the items in the interval, but the last one (that is, the pivot).
4. If the current element value is less than or equal to the pivot, swap it with the first not-yet-used element on the left.
5. Finally, we swap the pivot (the rightmost value in the interval) with the element next to index.

Full C++ code on github.

Go to the full post