HackerRank Java Dequeue

Given n integers in input, consider each window of size m in it, where m is less than or equal to n, and count the number of unique elements in the window. Return the maximum count. For instance, for n = 6, m = 3, and input 5 3 5 2 3 2, the windows are {5, 3, 5}, {3, 5, 2}, {5, 2, 3}, {2, 3, 2}, containing 2, 3, 3, and 2 unique elements respectively, so the expected result is 3.

I found this problem in the HackerRank Java Challenges section under the name Java Dequeue, a hint that a Deque would be helpful to solve it elegantly.

The HackerRank setting implies that data is passed to us from System.in. To develop a method that could be easily tested, I made it accept an InputStream as a parameter; then I create a Scanner on it, opened in a try-with-resources block.
public static int solution(InputStream is) {
    try (Scanner scanner = new Scanner(is)) {
        // ... see below
    }
}
The first two integers we expect in the input stream are the above described n and m. A problem constraint states that m is not bigger than n; I make this explicit in the code with an assertion.
int n = scanner.nextInt();
int m = scanner.nextInt();
assert m <= n;
Now I should be ready to get the next n integers and work on them. The big hint in the problem name suggests using a Deque. But which one? ArrayDeque looks promising but, since we'll be pushing just about all the items in from one side and popping them from the other, I didn't want to risk any data shifting in its underlying array, so I went with a LinkedList.

Since we work with the count of items in the window, a second container is useful. We'll push any item entering the window into a hash table, storing its current count as the value. When an item exits the window, we decrease its count in the map; if the count drops to zero, we remove the entry.

Since we are going to call our method many times in a row, it would be wasteful to create a deque and a map each time. Better to make them class data members, and simply reset them when the method is called.
public class Solution {
    private static Deque<Integer> window = new LinkedList<>();
    private static Map<Integer, Integer> counter = new HashMap<>();

    public static int solution(InputStream is) {
        // ... see above

        window.clear();            
        counter.clear();
        
        // ... see below
    }
}
Let's fill up the window for the first time. The hash map 'counter' is set up as described above.
for (int i = 0; i < m; i++) {
    int in = scanner.nextInt();
    window.add(in);
    counter.merge(in, 1, Integer::sum);
}

int result = counter.size();

// ... see below
To count how many different integers are in the current window, I simply check the size of the hash map.

Now it is just a matter of iterating on all the other integers in the input stream.
for (int i = m; i < n && result < m; i++) {
    Integer out = window.remove();
    Integer in = scanner.nextInt();
    window.add(in);

    // ... see below
}

return result;
I start looping on the m-th item, going up to the n-th. But, wait: if I find a window where all the numbers are different, I have already found the solution, hence the extra condition in the loop. That could save some CPU time.

First I adjust the window, removing its first element and adding a new last one. The exiting and entering elements are then used to update the counter map.
if (out.intValue() != in.intValue()) { // 1
    counter.merge(in, 1, Integer::sum);  // 2
    counter.merge(out, -1, (a, b) -> a == 1 ? null : a + b); // 3
    result = Math.max(result, counter.size()); // 4
}
1. If what enters the window is the same as what exits from it, I don't have to do anything.
2. I merge the item 'in' into the map. That means: if the map contains 'in', Integer::sum is used to combine its current value with the '1' I passed in; otherwise a new entry is created in the map, key 'in', value 1.
3. Similar to the previous line, but now I'm performing a sort of 'merge down'. I'm not satisfied by this line I wrote, even though it is kind of fun. Its point is that we know for sure that 'out' is in the map, so we know the lambda passed to merge() is going to be executed. It returns null if the value associated to 'out' is 1, removing the entry; otherwise it returns a + b but, with b set to -1, this decreases the count. The weak spot is: what if 'out' is not in the map? Well, merge() would push it in, with value -1. Damn it. There is no way to avoid this inconsistency, since merge() throws an exception if null is passed as value. End of the story: I would not use this line in production code.
4. Maybe this check-and-set looks a bit too pythonic; what do you think about it?
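Stitching the fragments together, the whole class reads more or less like this (nothing new here, just the snippets above assembled, with the imports they need):
import java.io.InputStream;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
import java.util.Scanner;

public class Solution {
    private static Deque<Integer> window = new LinkedList<>();
    private static Map<Integer, Integer> counter = new HashMap<>();

    public static int solution(InputStream is) {
        try (Scanner scanner = new Scanner(is)) {
            int n = scanner.nextInt();
            int m = scanner.nextInt();
            assert m <= n;

            window.clear();
            counter.clear();

            // fill the first window, setting up the counters
            for (int i = 0; i < m; i++) {
                int in = scanner.nextInt();
                window.add(in);
                counter.merge(in, 1, Integer::sum);
            }

            int result = counter.size();
            for (int i = m; i < n && result < m; i++) {
                Integer out = window.remove();
                Integer in = scanner.nextInt();
                window.add(in);
                if (out.intValue() != in.intValue()) {
                    counter.merge(in, 1, Integer::sum);
                    counter.merge(out, -1, (a, b) -> a == 1 ? null : a + b);
                    result = Math.max(result, counter.size());
                }
            }
            return result;
        }
    }
}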

Full Java code and test case available on GitHub.

HackerRank Java Sort

We have a Student class, containing an int, a String, and a double data member. We want to read a bunch of Students from a data stream, push them all into a list, sort them by the double (decreasing), then by the string, and then by the int. In the end we'll output just the student names.
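The Student class is provided by the problem; I assume it is a plain bean along these lines (field and getter names taken from the stream pipeline below):
// Hypothetical sketch of the Student class given by HackerRank;
// the getters match the ones used in the sorting pipeline.
class Student {
    private final int id;
    private final String fname;
    private final double cgpa;

    Student(int id, String fname, double cgpa) {
        this.id = id;
        this.fname = fname;
        this.cgpa = cgpa;
    }

    int getId() { return id; }
    String getFname() { return fname; }
    double getCgpa() { return cgpa; }
}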

This apparently boring little HackerRank problem shows how much more fun Java has become since functional programming was introduced in it.

I won't touch the Student class, limiting my job to the main class, where I create a Scanner on the input stream. Once I have read the number of students to work with, Stream is my good friend.
Stream.generate(() -> new Student(sc.nextInt(), sc.next(), sc.nextDouble())) // 1
    .limit(count) // 2
    .sorted(Comparator // 3
        .comparingDouble(Student::getCgpa).reversed() // 4
        .thenComparing(Student::getFname) // 5
        .thenComparingInt(Student::getId)) // 6
    .map(Student::getFname) // 7
    .forEach(System.out::println); // 8
1. The stream comes up with data provided on the fly, read from the scanner, named sc. I read an int, a string, and a double from it, and shovel them into the Student ctor, used in a lambda as the supplier for the stream generate() method.
2. I don't want generate() to be called forever; the stream size is limited by the number of students, which I have previously read into 'count'.
3. I have to sort my stream; to do that I provide a Comparator.
4. Firstly I compare (primitive) doubles as provided by the Student.getCgpa() method. But, wait, I don't want the natural order, so I reverse it.
5. Second level of comparison, the Student first name.
6. Third level, the (primitive) int representing the Student id should be used.
7. Almost ready to print the result. Since I'm interested only in the student names, I map each student to her/his name only.
8. Loop on the stream, containing the opportunely sorted student names, and print each one on a new line.

You could get the full Java code from GitHub.

HackerRank Java 1D Array (Part 2)

Don't be misled by its puzzling name: this fun little HackerRank problem has its main interest not in its implementation language (I should admit that I designed my solution thinking more pythonesquely than javaly), nor in the use of the array data structure, but in the algorithm you have to come up with to solve it.

There is a one dimensional board. Each cell in it could be set to zero or one. At startup the pawn is placed on the first cell to the left, and its job is to get out to the right. It can be moved only onto zero cells, forward by 1 or by 'leap' positions, and backward just by one.

Having specified the zeros and ones on the board, and the value of leap (non negative and not bigger than the board size), can we say if the game could be won?

It looks to me that Dynamic Programming is a good approach here. The problem splits naturally into a series of simple steps that contribute to the requested solution. There's one issue that muddies the waters, the backward move; the rest of the algorithm is pretty straightforward. Start from the last cell in the board. Determine if a pawn could be placed there and if this would lead to a win. Move to the cell on its left, up to the leftmost one. Then return the win condition for the first cell.

I implemented my idea in this way using, as required, Java as implementation language.

Firstly, I create a cache to store all the intermediate values:
boolean[] memo = new boolean[game.length];
if (game[game.length - 1] == 0)
    memo[memo.length - 1] = true;
All elements are initialized to false except the last one, set to true only if a pawn could go there.

Having set this initial condition, I could loop on all the other elements.
for (int i = memo.length - 2; i >= 0; i--) {
    // ... see below
}

return memo[0];
In the end, the first cell of my cache would contain the solution.

Each cell in the cache is set to true if the following conditions hold:
if (game[i] == 0 && (i + leap >= game.length || memo[i + 1] || memo[i + leap])) {
    memo[i] = true;
    // ... see below
}
First thing, check the current value in the board. If it is not zero, I can't put the pawn there, so there is nothing to do; the cache value surely stays false.
Otherwise, any of the following three conditions leads to true in the cache:
- adding the current position to the leap value pushes the pawn out of the board
- the next cell in the board is a good one
- the next leap cell is a good one.

Now the fun part. When I set a cell to true, I should ensure that a few cells to its right are marked as good too. This is due to the backward move. It is a sort of domino effect, which stops when we find a cell where our pawn can't go.
for (int j = i + 1; j < memo.length && game[j] == 0; j++)
    memo[j] = true;
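Assembled in a single method (the canWin() name and signature are my assumption of what HackerRank provides), the whole solution reads:
public static boolean canWin(int leap, int[] game) {
    boolean[] memo = new boolean[game.length];
    if (game[game.length - 1] == 0)
        memo[memo.length - 1] = true;

    for (int i = memo.length - 2; i >= 0; i--) {
        if (game[i] == 0 && (i + leap >= game.length || memo[i + 1] || memo[i + leap])) {
            memo[i] = true;
            // domino effect: the backward move makes contiguous zero cells good too
            for (int j = i + 1; j < memo.length && game[j] == 0; j++)
                memo[j] = true;
        }
    }

    return memo[0];
}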
Full Java code and test case pushed on GitHub.

Partitioning Souvenirs (patched)

Given a list of integers, we want to know if there is a way to split it evenly in three parts, so that the sum of each part is the same as the other ones.

Problem given in week six of the edX MOOC Algs200x Algorithmic Design and Techniques by UC San Diego.

My first solution, as you can see in a previous post (please check it out for more background information), was accepted in the MOOC but didn't work in a few cases, as Shuai Zhao pointed out.

So I added the Shuai test case to my testing suite, along with a couple of other ones that I felt would help me in writing a more robust solution:
def test_shuai(self):
    self.assertFalse(solution([7, 2, 2, 2, 2, 2, 2, 2, 3]))

def test_confounding_choice(self):
    self.assertFalse(solution([3, 2, 2, 2, 3]))

def test_duplicates(self):
    self.assertTrue(solution([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]))
Firstly, I rewrote the initial check, which now looks this way:
def solution(values):
    third, module = divmod(sum(values), 3)  # 1
    if len(values) < 3 or module or max(values) > third:  # 2
        return False
1. Add up the passed values; our target is a third of the total, and surely we can't split it if there is a remainder after the division. Using divmod() we get both values in a single shot.
2. If we have less than three values in input, a remainder from dividing their sum by three, or a value bigger than a third of the sum, our job is already done.

Then, since I saw that the problem was due to the lack of a check on the ownership of values, I added a shadow table to the one I'm using for the Dynamic Programming procedure. I named it 'taken', and it keeps the id of the owner of the corresponding value for the current row. Zero means the value is currently not taken.
table = [[0] * (len(values) + 1) for _ in range(third + 1)]
taken = [[0] * (len(values) + 1) for _ in range(third + 1)]
As before, I apply the standard DP procedure, looping on all the "valid" elements.
for i in range(1, third + 1):
    for j in range(1, len(values) + 1):
        # ...
Another minor change: it occurred to me that I keep track of a maximum of two positive matches in the table, so, if I have already reached that value, there's no need to do any other check, just set the current value to two.
if table[i][j-1] == 2:
    table[i][j] = 2
    continue
The next check was already in; I just refactored it out, because the combined if clause it belonged to was getting too complicated, and I added the first use of the taken table.
if values[j-1] == i:
    table[i][j] = 1 if table[i][j-1] == 0 else 2  # 1
    taken[i][j] = j  # 2
    continue

ii = i - values[j-1]  # 3
1. If the current value is the same as the current row value, I have found a match, so I update its value in the DP table
2. And I mark it as taken for the j value in the current row.
3. If j is not an exact match for i, I split it, looking for the difference in the previous column.

Now it's getting complicated. If there is a good value in the previous column, before using it we have to ensure it is available. And, since it could have been the result of a splitting, we need to check on the full row to do our job correctly.
available = True
taken_id = taken[ii][j-1]  # 1
if taken_id:
    for jj in range(1, j):  # 2
        if taken[ii][jj] == taken_id:  # 3
            if taken[i][jj]:
                available = False
                break
1. The current value id from the taken table is zero if it was not used in the 'ii' row.
2. If the current value has a valid id, we need to ensure it has not already been used in the current 'i' row.
3. The current jj is marked as used in the ii row for the id we care about; if it is taken also in the i row, we have a problem: we are trying to use in 'i' a value that has already been used. This was the check I missed.

Good, now I can safely go on with my DP algorithm.
if ii > 0 and table[ii][j - 1] > 0 and not taken[i][j - 1] and available:
    table[i][j] = 1 if table[i][j - 1] == 0 else 2

    # ...
The annoying thing is that I have to set the taken table too:
taken[i][j] = j  # 1
taken[i][j-1] = j

matcher = values[j-1] + values[j-2]  # 2
if taken_id:
    for jj in range(1, len(values)):  # 3
        if taken[ii][jj] == taken_id:
            taken[i][jj] = j
            matcher += values[jj-1]
if matcher < i:
    for jj in range(j-2, 0, -1):  # 4
        if taken[i][jj] or matcher + values[jj-1] > i:
            continue
        matcher += values[jj-1]
        taken[i][jj] = j
        if matcher == i:
            break
1. This is nice and simple. The current j and the previous one should be marked as j.
2. We have to mark all the other values concurring to generate the current sum. These are: the two just marked 'j' on this row, all the ones marked with taken_id on 'ii' and, if they are not enough to get the expected result, also the ones free on 'i'. To perform this last check I need another variable, which I called 'matcher'.
3. Mark on 'i' as 'j' all the items marked taken_id on 'ii'. And adjust matcher.
4. If needed, loop on the elements to the left looking for the missing values.

And finally:
        # ...
        else:  # 1
            table[i][j] = table[i][j - 1]

return True if table[-1][-1] == 2 else False  # 2
1. This 'else' follows the 'if ii > 0 and ...' above, ensuring the current cell is updated correctly if nothing else happened before.
2. The bottom right cell in our DP table should be set to 2 only if we find a correct 3-way split.

See full python code and test cases on GitHub.

HackerRank Deque-STL

Given an array of integers, find the max value for each contiguous subarray of size k in it. This HackerRank problem is meant to be solved in C++ and, as its name suggests, using a deque.

For instance, if we are given the array {3, 4, 6, 3, 4} and k is 2, we have to consider these four subarrays:
{3,4} {4,6} {6,3} {3,4} 
And the expected solution is
{4, 6, 6, 4}
An adapter

The original HackerRank problem asks to write a function that outputs its result to standard output. I didn't much like this requirement. As a TDD developer, I'm used to letting tests drive the code development, and having to check standard output to verify a function's behavior is not fun. So I slightly changed the function signature, making it return a vector containing the results, and used the original function as a simple adapter to the original problem. Something like that:
std::vector<int> maxInSubs(int data[], int n, int k)
{
    // ...
}

// ...
void printKMax(int arr[], int n, int k)
{
    auto data = maxInSubs(arr, n, k);
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}
First (naive) attempt

Just do what we are asked to do. For each subarray find its maximum value and push it to the result vector.
std::vector<int> maxInSubs(int data[], int n, int k)
{
    std::vector<int> results;
    for (int i = 0; i < n - k + 1; ++i)
    {
        results.push_back(*std::max_element(data + i, data + i + k));
    }
    return results;
}
Clean and simple and, when k and n are small, not even too slow. However, for k comparable to a large n we can say bye bye to performance.

Patched naive attempt

We could be tempted to save the algorithm explained above, observing that the slowness is due to the repeated calls to max_element(). We can avoid calling it a substantial number of times by checking the values of the elements exiting and entering the current window, for instance in this way:
std::vector<int> results{ *std::max_element(data, data + k) };  // 1

for (size_t beg = 1, end = k + 1; end <= n; ++beg, ++end)  // 2
{
    if (data[end - 1] > results[results.size() - 1])  // 3
    {
        results.push_back(data[end - 1]);
    }
    else if (data[beg - 1] < results[results.size() - 1])  // 4
    {
        results.push_back(results[results.size() - 1]);
    }
    else  // 5
    {
        results.push_back(*std::max_element(data + beg, data + end));
    }
}
1. Initialize the result vector with the max element for the first interval.
2. Keep beg and end as loop variables, describing the current window to check.
3. The new right element of the window is bigger than the max for the previous window. Surely it is the max for this one.
4. The element that has just left the window is smaller than the previous max. Surely the max is still in the window.
5. Otherwise, we'd better check which is the current max.

A smartly designed array in input could beat this simple algorithm. However on HackerRank they didn't spend too much time on this matter, and this solution is accepted with full marks.

Solution with a deque

For a more elegant solution, we should minimize the repeated checks we perform on the data elements. Right, but how? Until this moment, I hadn't paid attention to the huge hint HackerRank gave us. "Use a deque!", they shout from the name of the problem itself.

The point is that I want to perform a cheap cleanup on each window, so that I could just pick a given element in it, without scanning the entire interval.

Let's use the deque as a buffer to store only the reasonable candidates for the max. Since we want to remove from this buffer the candidates that are no longer valid when the window moves, we keep in it their indices in the original data array instead of their values.

Here is how I initialize it:
std::deque<int> candidates{ 0 };  // 1
for (int i = 1; i < k; ++i)
{
    pushBack(candidates, data, i);  // 2
}
1. We could safely say that the first element in data is a good candidate as max for its first subarray.
2. Push back to candidates the "i" index from data, but first ensure the previous candidates are useful.

Since the code in pushBack() is going to be used again afterward, I made a function of it:
void pushBack(std::deque<int>& candidates, int data[], int i)
{
    while (!candidates.empty() && data[i] >= data[candidates.back()])  // 1
        candidates.pop_back();
    candidates.push_back(i);
}
1. There is no use for a candidate if the newcomer is at least as big, so remove it.

Now candidates contains a few indices into the first window on data: in front the index of its max value, followed by later, smaller elements that could become the max as the window slides. Possibly just one element, but for sure the deque is not empty.

We are ready for the main loop:
for (int i = k; i < n; ++i)
{
    results.push_back(data[candidates.front()]);  // 1

    if (candidates.front() <= i - k)  // 2
        candidates.pop_front();

    pushBack(candidates, data, i);  // 3
}
results.push_back(data[candidates.front()]);  // 4
1. As said above, we know that candidates is not empty and its front is the index of a max value in the current window. Good. Push it to results.
2. Now we prepare for the next window. If the front candidate is out, we remove it.
3. Push the new element's index among the candidates, following the algorithm described above. It kills the candidates that are not bigger than it, ending up with a deque whose front is surely the index of the biggest element.
4. Remember to push the last candidate in the results, and then the job is done.
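Fed through the printKMax() adapter shown at the beginning, the example at the top of the post would run like this:
int arr[] = { 3, 4, 6, 3, 4 };
printKMax(arr, 5, 2);  // prints: 4 6 6 4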

Does this solution look more convincing to you? Full C++ code and test case on GitHub.

HackerRank Equal

We have a list of integers, and we want to know in how many rounds we could make them all equal. In each round we add the same number, chosen among 1, 2, and 5, to all the items in the list but one. Pay special attention when you solve this problem on HackerRank: currently (April 2018) its description says each increase should be of one, three, or five. However, this is not what the solution is tested against.
It looks like someone decided to change 2 to 3 a few months ago, edited the description, and then forgot to modify the testing part. Who knows what is going to happen next. Be ready to be surprised.
Besides, it is inserted in the Algorithms - Dynamic Programming section, but I don't think that is the right approach to follow.

Simplifying the problem

Instead of applying the weird addition process stated in the problem, we could decrease a single item each round. For instance, given [2, 2, 3, 7] in input, a solution to the original problem is:
2, 2, 3, 7
+5 +5 +5  = (1)
 7, 7, 8, 7
+1 +1  = +1 (2)
 8, 8, 8, 8
We could solve it in this way instead:
2, 2, 3, 7
 =  =  = -5 (1)
 2, 2, 3, 2
 =  = -1  = (2)
 2, 2, 2, 2
Since we are asked to calculate the number of steps to get to the equal state, in both ways we get the same result.

Base cases

A rough algorithm is already emerging. We get the lowest value in the list, and decrease all the other elements using the available alternatives until we reach it. We start from the biggest step (5) and fall back to the shorter steps only when we are forced to.

To better understand how we should manage the last steps, I put the base cases on paper.
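In short, this is the number of moves needed to close a gap x using steps sized 1, 2, and 5:
x:     0  1  2  3  4  5
moves: 0  1  1  2  2  1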
If the gap x is 1, 2, or 5, we can get to zero at cost 1. But if x is 3 or 4, it costs us 2. Notice that if, instead of getting to level zero, we move both the lowest element and the x element to level -2, we get to the result in the same number of moves. For instance, if our input list is [10, 13]
10, 13
    -2 (1)
    -1 (2)
10, 10
 
10, 13
-2     (1)
    -5 (2)
 8,  8
If the input list is [10, 13, 13], getting below the bottom from a gap of 3 in just one move gives us an advantage:
10, 13, 13
    -2      (1)
    -1      (2)
        -2  (3)
        -1  (4)
10, 10, 10
 
10, 13, 13
-2          (1)
    -5      (2)
        -5  (3)
 8,  8,  8
The algorithm

I think I've got it. Firstly I find the minimum value, M, in the input list. That is a possible target for all the items to reach. However, I have to check two other alternatives, M-1 and M-2.
I loop on all the items in the list. For all the three possible targets, I calculate the difference between it and the current value, count the number of steps to get there, and add it to the total number of steps required for getting to that target.
And then I choose as a result the cheapest target to reach.

The code

Using Python as implementation language, I started with a couple of test cases, then added a few more along the way when I bumped into trouble, and I ended up with this code.
SHIFT = [0, 1, 2]  # 1


def solution(data):
    lowest = min(data)  # 2

    results = [0] * len(SHIFT)  # 3
    for item in data:
        for i in SHIFT:
            gap = item - lowest + i  # 4
            results[i] += gap // 5 + SHIFT[(gap%5 + 1) // 2]  # 5
    return min(results)  # 6
1. Represents the three possible targets, from the minimal value in the list down to -2.
2. Get the minimal value in the list.
3. Buffer for the results when using the different targets.
4. Distance that has to be split.
5. Add the current number of steps to the current buffer. Firstly I get the number of long moves by dividing gap by 5. Now I am in the base case, as shown in the table above. Notice that the cost of moving from x to the target is [0, 1, 1, 2, 2] for gap modulo 5 in [0..4]; if we take gap, apply modulo five, increase it and then divide by two, we get the index in SHIFT of the number of steps actually needed. Admittedly an obscure way to get there; if this were production code, I would probably have resisted the temptation to use it.
6. Get the minimal result and return it.

All tests passed on HackerRank; Python script and test case pushed to GitHub.

HackerRank Roads and Libraries

We are given n nodes and a (possibly huge) number of edges. We are also given the cost of building a library in a city (i.e. a node) and a road (i.e. an edge). Based on these data we want to minimize the cost of creating a forest of graphs from the given nodes and edges, with the requirement that each graph should have a library on one of its nodes. This is a HackerRank problem on Graph Theory algorithms, and I am about to describe my python solution to it.

If a library is cheaper than a road, the solution is immediate. Build a library on every node.
def solution(n, library, road, edges):
    if road >= library:
        return n * library

    # ...
Otherwise, we want to create a minimum spanning forest, so as to minimize the number of roads, keeping track of the number of edges used and how many graphs are actually generated. I found it natural to use an adapted form of the Kruskal MST (Minimum Spanning Tree) algorithm, which looks very close to our needs.

Kruskal needs a union-find to work, and this ADT is not commonly available out of the box. So, I first implemented a python UnionFind class; see the previous post for details.
Then, while working on this problem, I made a slight change to it. My algorithm was simpler and faster if its union() method returned False when nothing was actually done in it, and True only if it led to joining two existing components.
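A sketch of how union() looks after this refactoring (the class internals are the ones described in the Union Find post below):
def union(self, p, q):
    i = self.find(p)
    j = self.find(q)
    if i == j:
        return False  # the nodes were already connected, nothing done
    self.count -= 1
    if self.sz[i] < self.sz[j]:
        self.id[i] = j
        self.sz[j] += self.sz[i]
    else:
        self.id[j] = i
        self.sz[i] += self.sz[j]
    return True  # two components have actually been joined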

Using such refactored UnionFind.union(), I wrote this piece of code based on Kruskal algorithm:
uf = UnionFind(n)
road_count = 0  # 1

for edge in edges:
    if uf.union(edge[0] - 1, edge[1] - 1):  # 2
        road_count += 1  # 3
        if uf.count == 1:  # 4
            break
1. The union-find object keeps track of the number of disjoint graphs in the forest, but not of the edges used. This extra variable does.
2. I need to convert the edges from the 1-based to the 0-based convention before using them. If the two nodes get connected by this call to union(), I have some extra work to do.
3. An edge has been used by union(), keep track of it.
4. If union() connected all the nodes in a single graph, there is no use in going on looping.

Now it is just a matter of adding the cost for roads and libraries to get the result.
return road_count * road + uf.count * library
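Assembled, the full solution is little more than the fragments above glued together:
def solution(n, library, road, edges):
    if road >= library:
        return n * library  # a library on every node is the cheapest option

    uf = UnionFind(n)
    road_count = 0
    for edge in edges:
        if uf.union(edge[0] - 1, edge[1] - 1):
            road_count += 1
            if uf.count == 1:
                break

    return road_count * road + uf.count * library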

Complete python code for problem, union-find, and test case on GitHub.

Union Find

I have a (possibly huge) bunch of edges describing a forest of graphs, and I'd like to know how many components it actually has. This problem has a simple solution if we shovel the edges into a union-find data structure and then just ask it for that piece of information.

Besides the number of components, our structure keeps track of the id associated to each node, and the size of each component. Here is how the constructor for my python implementation looks:
class UnionFind:
    def __init__(self, n):  # 1
        self.count = n
        self.id = [i for i in range(n)]  # 2
        self.sz = [1 for i in range(n)]  # 3
1. If the union-find is created for n nodes, initially the number of components, named count, is n itself.
2. All the nodes in a component have the same id, initially the id is simply the index of each node.
3. In the beginning, each node is a component on its own, so the size is initialized to one for each of them.

This simple ADT has two operations, union and find, hence its name. The former gets an edge in input and, if its two nodes are in different components, joins them. The latter returns the id of the passed node.

Besides, the client code can check the count data member to see how many components are in there. Pythonically, this exposure of internal state is not perceived as horrifying. A more conservative implementation would mediate this access with a getter.

Moreover, a utility method is provided to check if two nodes are connected. This is not a strict necessity, but it makes the user code more readable:
def connected(self, p, q):
    return self.find(p) == self.find(q)
The meaning of this method is crystal clear. Two nodes are connected only if they have the same id.

In this implementation, we connect two nodes by making them share the same id. So, if we call union() on p and q, we'll change the id of one of them to match the other's. Given this approach, we implement find() in this way:
def find(self, p):
    while p != self.id[p]:
        p = self.id[p]
    return p
We follow the chain of ids from the passed p until we find a node whose id is its own index, its original value: that is the id of the component.

We could implement union() by randomly picking which of the two ids to keep, but we want to keep the operational costs low, so we work it out to keep the height of the tree representing the nodes in a component low, leading to O(log n) find() complexity.
def union(self, p, q):
    i = self.find(p)
    j = self.find(q)
    if i != j:  # 1
        self.count -= 1  # 2
        if self.sz[i] < self.sz[j]:  # 3
            self.id[i] = j
            self.sz[j] += self.sz[i]
        else:
            self.id[j] = i
            self.sz[i] += self.sz[j]
1. If the two nodes are already in the same component, there is no need of doing anything more.
2. We are joining two components, their total number in the union-find decrease.
3. This is the smart trick that keeps the cost of find() low. We decide which id to keep as representative for the component according to the sizes of the two merging ones.

As an example, consider this:
uf = UnionFind(10)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
I created a union-find for nodes in [0..9], specifying eight edges among them, from (4, 3) to (6, 1).
As a result, I expect two components and, for instance, to see that node 2 and node 6 are connected, while 4 and 5 are not.
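Said in code, with the union-find built above, the expectations boil down to a few assertions:
assert uf.count == 2          # two components are left
assert uf.connected(2, 6)     # 2 and 6 ended up in the same component
assert not uf.connected(4, 5) # 4 and 5 did not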

I based my python code on the Java implementation provided by Robert Sedgewick and Kevin Wayne in their excellent Algorithms, 4th Edition, following the weighted quick-union variant. Check it out also for a better algorithm description.

I pushed to GitHub the full code for the python class, and a test case for the example described above.

HackerRank Climbing the Leaderboard

We are given two lists of integers. The first one is monotonically decreasing and represents the scores of the topmost players on a leaderboard. The second one is monotonically increasing and contains the score history of Alice, a player who rapidly climbed the board.
Following the dense ranking convention, we want to get back a list containing the history of rank positions for Alice.
This is a HackerRank Algorithm Implementation problem, and I am going to show you how I solved it, using Python as implementation language.

I noticed that the first list, scores, is already sorted; we just have to get rid of duplicates to get a mapping between each ranking position and the score Alice has to reach to achieve it. For instance, given scores [100, 100, 50, 40, 40, 20, 10] and alice [5, 25, 50, 120], the expected result is [6, 4, 2, 1].

Bisect solution

I just have to do the matching. The first idea that jumped to my mind was performing a binary search on scores. It helps that Python provides a well known library for the job, bisect. There's just a tiny issue: bisect expects the underlying list to be sorted in natural order, so we need to reverse our scores.

It looks promising, let's implement it.

A pythonic way to get our ranking would be this:
ranking = sorted(list(set(scores)))
I get the original list, convert it to a set to get rid of duplicates, then back to a list, so that I can sort it in natural order. Nice, but in this problem we are kind of interested in performance, since we could have up to 20 thousand items in both lists. So we want to take advantage of the fact that the list is already sorted.

So, I ended up using this rather uncool code:
ranking = [scores[-1]]
for i in range(len(scores)-2, -1, -1):
    if scores[i] > ranking[-1]:
        ranking.append(scores[i])
I initialize the ranking list with the last item in scores, then I range on all the other indices in the list from right to left. If the current item is bigger than the latest one pushed in ranking, I push it too.

Now I can just rely on the bisect() function from the bisect python module, which finds the position where the current item should be inserted in the list. With a caveat: since I have reversed the ordering, I have to adjust the bisect() result to get the ranking I'm looking for:
from bisect import bisect

results = []
last = len(ranking) + 1
for score in alice:
    results.append(last - bisect(ranking, score))
This code passes all the tests; job done.

However. Do I really need to pay for the bisect() search for each element of alice?

Double scan solution

Well, actually, we don't. Since we know that both lists are sorted, we can also use the ordering in alice to move linearly in ranking.

Since we are not using bisect anymore, we don't need to reverse the sorting order in ranking, and the duplicate cleanup gets a bit simpler:
ranking = [scores[0]]
for score in scores[1:]:
    if score < ranking[-1]:
        ranking.append(score)

Now we compare each item in alice against the items in ranking moving linearly from bottom to head:
results = []
for score in alice:
    while ranking and score >= ranking[-1]:
        ranking.pop()
    results.append(len(ranking) + 1)
We don't have to be alarmed by the nested loops: they don't have a multiplicative effect on the time complexity, since we always move forward on both lists; the result is O(M + N) time complexity.

Is this a better solution than the first one? Well, it depends. We should know more on the expected input. However, for large and close values of N and M, it looks so.

I pushed the python script for both solutions and a test case to GitHub.

HackerRank Divisible Sum Pairs

Given a list of integers, we want to know how many couples of them, when summed, are divisible by a given integer k.

So, for instance, given [1, 2, 3, 4, 5, 6], we have five couples of items with sum divisible by 3:
(1, 2), (1, 5), (2, 4), (3, 6), (4, 5)
This is a HackerRank algorithm problem, implementation section.

Naive solution

Test all the couples, avoiding duplicates. If we check (a1, a2), we don't have to check (a2, a1).

The code I have written for this solution should be of immediate comprehension, even if you are not that much into Python:
result = 0
for i in range(len(values) - 1):  # 1
    for j in range(i+1, len(values)):  # 2
        if (values[i] + values[j]) % k == 0:  # 3
            result += 1
1. Loops on all the indices in the list but the last one.
2. Loops from the next index to the current "i" to the last one.
3. Check the current couple, and increase the result if compliant.

Even if this is what HackerRank was expecting from us (the problem is marked as easy), we can do better than this, considering its disappointing O(N^2) time complexity.

Linear solution

The problem can be restated as counting the couples that, added up, equal zero modulo k. Following this insight, let's partition the items according to their own modulo.
remainders = [0] * k
for i in range(len(values)):
    remainders[values[i] % k] += 1
Each element in the "remainders" list represents the number of items in the original list having as modulo the index of the element.

For the example shown above we'll get these remainders:
[2, 2, 2]
Now, if we add an element having remainder x to an element with remainder k - x, we'll get a number equal to zero modulo k. We want all the possible combinations of the x elements with the k - x ones, so we apply the Cartesian product to the two sets, whose size is the product of their sizes.

There are a couple of exceptions to this rule. The elements having modulo zero have to be paired among themselves, and the same happens to the elements having half k as modulo, when k is even. The number of pairs in a set of N elements is N * (N-1) / 2.
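Back to the example: from the remainders [2, 2, 2] with k = 3 we get 2 * 1 / 2 = 1 pair of modulo-zero elements, namely (3, 6), plus 2 * 2 = 4 pairs obtained combining the modulo-one elements {1, 4} with the modulo-two ones {2, 5}, for a total of five couples.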

Putting all together we have this code:
result = remainders[0] * (remainders[0] - 1) // 2  # 1

pivot = k // 2  # 2
if k%2:
    pivot += 1  # 3
else:
    result += remainders[k//2] * (remainders[k//2] - 1) // 2  # 4

for i in range(1, pivot):  # 5
    result += remainders[i] * remainders[k-i]
1. Initialize "result" using the above described formula for the modulo zero items.
2. Let's calculate the central element in the list, where we have to stop looping to sum up.
3. If k is odd, we won't have a lonely central element, and the pivot should be moved a step to the right.
4. When k is even, the elements having half-k modulo are managed as the zero modulo ones.
5. Normal cases.

After the for-loop, result contains the answer to the original question.

I pushed a python script with both solutions, naive and remainder based, to GitHub, along with a few tests.

Boost ASIO echo UDP asynchronous server

A change request for the echo UDP client-server app discussed before: we want to keep the client as is, but we need the server to be asynchronous.

Instead of using the synchronous calls receive_from() and send_to() on a UDP socket, we are going to use their asynchronous versions async_receive_from() and async_send_to().

The asynchronicity naturally leads to implementing a class, having a socket as a private data member, so that we can make our asynchronous calls on it.
const int MAX_LEN = 1'024;
const uint16_t ECHO_PORT = 50'015;

class Server
{
private:
    udp::socket socket_;  // 1
    udp::endpoint endpoint_;  // 2
    char data_[MAX_LEN];  // 3

// ...
1. Our ASIO UDP socket.
2. The endpoint we use to keep track of the client currently connected to the server.
3. Data buffer, used to keep track of the message received from the client.

The constructor gets the app ASIO I/O context by reference from the caller and uses it to instantiate its member socket. Then it calls its private method receive() to start its endless job.
Server(ba::io_context& io) : socket_(io, udp::endpoint(udp::v4(), ECHO_PORT))  // 1
{
    receive();
}

void receive()
{
    socket_.async_receive_from(ba::buffer(data_), endpoint_, [this](bs::error_code ec, std::size_t len) {  // 2
        if (!ec && len > 0)  // 3
        {
            send(len);
        }
        else
        {
            receive();  // 4
        }
    });
}
1. The socket requires also an endpoint describing the associated protocol and port. We create it on the fly.
2. Asynchronously call receive_from on the socket. ASIO puts what the client sends in the data buffer and stores the client's network information in the passed endpoint. When the receive is completed, ASIO calls the handler passed as third parameter, here a lambda that captures "this" and honors the expected parameters.
3. If the receiving worked fine - no error_code reported - and the message is not empty, we'll call our Server send() method, to echo the message.
4. Otherwise - error or empty message - we don't have to send anything back, so we call the enclosing receive() method, to serve a new client.

When a "good" message is received from a client, our server sends it back to it as is:
void send(std::size_t len)
{
    socket_.async_send_to(ba::buffer(data_, len), endpoint_, std::bind(&Server::receive, this));
}
The socket gets the job of asynchronously sending the data, as stored in the Server member variable, with the length passed in as a parameter, to the endpoint saved as a Server member variable. When the data transfer is completed, ASIO calls the handler passed as third argument. Here we don't want to do anything in case of error, not even log something, so we can simply bind to the handler a call to "this" receive(), ignoring the error code and the length of the transferred data.
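The main() is not shown here; given that the Server constructor just wants the application I/O context, I'd expect it to mirror the one of the TCP asynchronous server discussed below:
int main()
{
    ba::io_context io;  // the application ASIO I/O context
    Server server(io);  // the ctor fires the first asynchronous receive
    io.run();           // process asynchronous events until stopped
}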

I pushed the complete C++ source file to GitHub. The code is based on the UDP asynchronous echo example in the official Boost ASIO documentation.

Boost ASIO echo TCP asynchronous server

Let's refactor the echo TCP server to achieve asynchrony. It's going to be a lot of fun. If you feel that it is too much fun, you could maybe have first a look at the similar but a bit simpler asynchronous server discussed in a previous post.

Main

This server works with the same clients seen for the synchronous server; here we deal just with the server. All the job is delegated to the Server class, whose constructor gets a reference to the application ASIO I/O context.
namespace ba = boost::asio;
// ...

Server server(io);
io.run();
Server

The server ctor initializes its own acceptor on the ASIO I/O context, on the endpoint specifying the TCP IP protocol and the chosen port; then it calls its private member method accept():
using ba::ip::tcp;
// ...

const uint16_t ECHO_PORT = 50'014;
// ...

class Server
{
private:
    tcp::acceptor acceptor_;

    void accept()
    {
        acceptor_.async_accept([this](bs::error_code ec, tcp::socket socket)  // 1
        {
            if (!ec)
            {
                std::make_shared<Session>(std::move(socket))->read();  // 2
            }
            accept();  // 3
        });
    }
public:
    Server(ba::io_context& io) : acceptor_(io, tcp::endpoint(tcp::v4(), ECHO_PORT))
    {
        accept();
    }
};
1. The handler passed to async_accept() is a lambda getting as parameters an error code, which lets us know if the connection from the client has been accepted correctly, and the socket eventually created to support the connection itself.
2. A beautiful and perplexing line. We create a shared_ptr smart pointer to a Session built from an rvalue reference to the socket received as parameter, and call its read() method. However, this anonymous variable exits its definition block on the next line, so its life is in danger; better see what is going on in read(). Actually, we are in danger too: if something weird happens in this session object, we have no way to do anything about it.
3. As soon as a Session object is created, a new call to accept() is issued, and so the server puts itself in wait for a new client connection.

Session

As we have seen just above, we should expect some clever trick from Session, especially in its read() method. Thinking better about it, it is not a big surprise to see that its superclass is enable_shared_from_this:
class Session : public std::enable_shared_from_this<Session>
{
private:
    tcp::socket socket_;
    char data_[MAX_LEN];

// ...
public:
    Session(tcp::socket socket) : socket_(std::move(socket)) {}  // 1

    void read()  // 2
    {
        std::shared_ptr<Session> self{ shared_from_this() };  // 3
        socket_.async_read_some(ba::buffer(data_), [self](bs::error_code ec, std::size_t len) {  // 4
            if (!ec)
            {
                self->write(len);
            }
        });
    }
};
1. The ctor gets in the socket that, as we have seen, was created by the acceptor and moved in; in its turn, the constructor moves it into its data member.
2. The apparently short lived Session object created by the handler of async_accept() calls this method.
3. A new shared_ptr is created from this! Actually, being such, it is the same shared_ptr that we have seen in the calling handler, just with its use counter increased. However, our object is still not safe; we need to keep it alive until the complete read-write cycle between client and server is terminated.
4. We asynchronously read some bytes from the client. To better see the effect, I have set the size of the data buffer to a silly low value. But the more interesting part here is the handler passed to async_read_some(). Notice that in the capture clause of the lambda we pass self, the shared pointer created from this. So our object is safe till the end of the read.

So far so good. Just remember to ensure the object doesn't get invalidated during the writing process:
void write(std::size_t len)
{
    std::shared_ptr<Session> self{ shared_from_this() };
    ba::async_write(socket_, ba::buffer(data_, len), [self](bs::error_code ec, std::size_t) {
        if (!ec)
        {
            self->read();
        }
    });
}
Same as in read(), we ensure "this" stays alive by creating a shared pointer from it and passing it to the async_write() handler.

As required, when the read-write cycle terminates, "this" has no more live references. Bye bye, session.

I have pushed my C++ source file to GitHub. And here is the link to the original example from Boost ASIO.
