Another interesting MOOC for Java developer

I've just completed the EdX MOOC Data Structures and Software Design by University of Pennsylvania. I would suggest it to someone with a high-beginner or lower-intermediate experience as Java software developer. I think that the lessons would sound reasonably interesting even if your level is higher, however you could be get a bit bored by the exercises - kind of too simple. This is why I didn't push anything on my GitHub account. To have some fun, I didn't use any IDE, just solved the problems in the old way, using a dumb editor, compiling and running them from command line :D Homework 11 is, in my opinion, the most interesting one in the course. We have to refactor a poorly written, albeit working, piece of code to make it fast enough to match the requirements.

Go to the full post

A nice Java testing MOOC

I've just finished the first week of the MOOC "Automated Software Testing: Practical Skills for Java Developers" provided by TU Delft on EdX. It looks to me they did a good job. Maybe, if you are a native English speaker, the speakers' accent could sometimes sound a bit weird. However, as continental European, I kind of enjoy it (and I know that my Italian accent sounds even funnier).

The course is focused on Java, JUnit 5, using IntelliJ as IDE.

It looks like a brand new production, so be ready to step into typos and minor mistakes here and there. The most annoying of them, is the one in Question 11 on the Quizzes.

We have to write tests to detect where is the bug in a function named mirrorEnds() that sort of detect how much a string is palindromic. Unfortunately, a wrong solution has to be selected to get the point on it!

If you are taking the course, I suggest you to have a look at the discussion forum to get help.

And, if you want to compare your code with mine, please find my forked repository on GitHub.

Go to the full post

Autowired, Resource, Inject

In chapter three, Introducing IoC and DI in Spring, of Pro Spring 5, I have read the example about Using Setter Injection in the annotation flavor, where for the "renderer" service the Autowired annotation was used in its setter to inject the "provider" bean in its parameter.
A note hints that other annotations, Resource and Inject, could be used, but no example is provided. I thought it was better to practice a bit on them too.

Autowired

The provided example is actually what we usually see in Spring code:
@Override
@Autowired
public void setMessageProvider(MessageProvider provider) {
    this.messageProvider = provider;
}
We should only be aware that Autowired is pure Spring, as its full name documents.

Inject

We could get the same result using the JSR standard annotation Inject:
@Override
@Inject
public void setMessageProvider(MessageProvider provider) {
    this.messageProvider = provider;
}
I had to change the provided build.gradle file for chapter three, setter-injection project, since it didn't have the compile dependency for injection, named misc.inject and defined in the main build.gradle file as
inject         : "javax.inject:javax.inject:1"
Autowired and Qualifier

Say that we have more than one Component implementing the MessageProvider interface. The default one is name "provider", and so it is chosen by Spring in the above examples. But what if I want to inject a different one, for instance "pro2":
@Component("pro2")
class MessageProvider2 implements MessageProvider {

    @Override
    public String getMessage() {
        return "Hello from Message Provider 2!";
    }
}
The Spring commonly used approach is using the Qualifier annotation.
@Override
@Autowired
@Qualifier("pro2")
public void setMessageProvider(MessageProvider provider) {
    this.messageProvider = provider;
}
Both Autowired and Qualifier are Spring specific annotations.

Resource

A JSR standard alternative to the couple Autowired + Qualifier is given by the Resource annotation.
@Override
@Resource(name = "pro3")
public void setMessageProvider(MessageProvider provider) {
    this.messageProvider = provider;
}
Here above, the Component named "pro3" is going to be injected in provider.

Inject and Named

Another JSR standard alternative to Autowired + Qualifier is provided by the couple Inject + Named.
@Override
@Inject
@Named("pro4")
public void setMessageProvider(MessageProvider provider) {
    this.messageProvider = provider;
}
I have pushed to GitHub my changes. You could see there the build.gradle file, my new Components, and the alternative setters.

Go to the full post

HackerRank Array Manipulation

We have an array of integers of a given size, that could be in the order of tenth of millions. It is initialized with all zero elements.
We change it a few times, up to tenth of thousands, adding to given subintervals each time a positive number.
At the end, we want to know which is the highest values stored in the array.

This is a Data Structures HackerRank problem. Here below I show you a naive solution, and a smarter one, both of them using Python as implementation language.

An example

First thing, I wrote a test following the provided example. As a welcomed side effect, it helps me to clarify the problem to myself.
Given an array sized 5, let modify it three times, adding 100 to the first and second element, then 100 to the elements from the second up to the fifth, then again 100 to the third and fourth.

The array should go through these states:
  0   0   0   0   0
100 100   0   0   0
100 200 100 100 100
100 200 200 200 100
At the end, 200 should be the highest value in it.

My test code is:
def test_provided_naive_0(self):
    manipulator = NaiveManipulator(5)
    manipulator.set(1, 2, 100)
    manipulator.set(2, 5, 100)
    manipulator.set(3, 4, 100)
    self.assertEqual(200, manipulator.solution())
As I'm sure you have guessed, I have implemented my naive solution as a class, NaiveManipulator, that is initialized passing the size of the underlying array, and that has a couple of methods, set() to perform a transformation, and solution() to get the requested value at the end.

Let's see its code.
class NaiveManipulator:
    def __init__(self, sz):
        self.data = [0] * sz  # 1
    
    def set(self, first, last, value):
        for i in range(first-1, last):  # 2
            self.data[i] += value  # 3

    def solution(self):
        return max(self.data)
1. The array, initialized with the passed size and with all zero elements, is kept in the member named "data".
2. The indices are given as 1-based, so I convert them in 0-based before using them.
3. Each element in the specified interval is increased by the passed value.
4. Just a call to the built-in function max()

This implementation is really naive. It works fine, but only for limited input data.

A more challenging example

What happens if I have thousands of transformations on a moderately large array, where the subinterval sizes are in the order of the array size?

Let's write a test to see it.
def test_naive(self):
    manipulator = NaiveManipulator(1_000)
    for _ in range(2_000):
        manipulator.set(10, 800, 1)
    self.assertEqual(2_000, manipulator.solution())
It works fine. However, we start seeing how it is getting time consuming. The fact is that in test_naive() we have a for-loop, inside it we call the manipulator set() where there is another for-loop. This algorithm has a O(N*M) time complexity, where N is the number of transformations and M the (average) size of the subintervals. It is enough to have both N and M in the order of thousands to get puny performances by this algorithm.

Be lazy to be faster

Do we really need to increase all the values in the subinterval each time? Why don't we just set where all the increases would start and end, and then perform the increase just once? That could save as a lot of time.

I have refactored my NaiveManipulator to a smarter ArrayManipulator, keeping the same interface, so to nullify the impact on the user code.
class ArrayManipulator:
    def __init__(self, sz):
        self.data = [0] * (sz + 2)  # 1
    
    def set(self, first, last, value):  # 2
        self.data[first] += value
        self.data[last+1] -= value

    def solution(self):  # 3
        return max(itertools.accumulate(self.data))
1. The data array is going to change its meaning. It is not storing the actual value for each element, but the difference between the previous element and the current one. This explains why I need to increase its size by two, since I add a couple of sentinel elements, one at the begin, the other at the end.
2. Instead of changing all the elements in the subinterval, now I change just the first, to signal an increment sized "value", and the one after the last one, to signal that we are getting back to the original value.
3. Large part of the job is now done here. I call accumulate() from the standard itertools library to convert the values stored in the data array to the actual value, then I pass its result to max(), to select its biggest value.

On my machine, this algorithm is about 200 times faster that the previous one on test_naive. Enough to pass the HackerRank scrutiny.

Full python code and test case pushed to GitHub.

Go to the full post

How to close a Spring ApplicationContext

I've just finished reading chapter 2 Getting Started of Pro Spring 5. Nice and smooth introduction to Spring, and to Dependency Injection motivation.

Going through the provided examples, I had just a minor on the code provided. In a couple of cases, the created Spring contexts is stored in plain ApplicationContexts reference. This is normally considered a good way of writing code. We don't want to bring around a fat interface when a slimmer one would be enough. Anyway, ApplicationContext has no close() method declared, leading to an annoying warning "Resource leak: 'ctx' is never closed".

Here is one place where you could see the issue, in class HelloWorldSpringDI:
ApplicationContext ctx =
    new ClassPathXmlApplicationContext("spring/app-context.xml");

MessageRenderer mr = ctx.getBean("renderer", MessageRenderer.class);
mr.render();
An obvious solution would be explicitly cast ctx to ClassPathXmlApplicationContext, and call close() on it. However, we could do better than that. The Spring application contexts close()'s come from the interface Closeable. This means that we could refactor our code wrapping the ApplicationContext usage in a try-with-resource block. We let Java taking care of calling close() when the context goes out of scope.

The resulting code is something like that:
try (ClassPathXmlApplicationContext ctx =
        new ClassPathXmlApplicationContext("spring/app-context.xml")) {
    MessageRenderer mr = ctx.getBean("renderer", MessageRenderer.class);
    mr.render();
}
I pushed on GitHub my patched code for both HelloWorldSpringDI and HelloWorldSpringAnnotated, that is, the other class suffering for the same lack of application context closing.

Go to the full post

The Gradle nature of Pro Spring 5

I've started reading Pro Spring 5: An In-Depth Guide to the Spring Framework and Its Tools. It looks fun and interesting. The first issue I had was on having its code from GitHub working in my STS IDE. Here are a few notes on what I did to solve it, if someone else stumble in the same place.

First step was easy. After navigating through the File menu in this way ...
File
    Import...
        Git
            Project from Git
                GitHub
                    Clone URI
I entered the repository address: https://github.com/Apress/pro-spring-5.git

Second step should have been even easier. Add Gradle Nature to the project. It's just a matter of right-clicking on the project, and from the menu select
Configure
    Add Gradle Nature
Unfortunately, there are a few version numbers in the main "build.gradle" file for the project that are not currently working, leading to a number of annoying exceptions like:
A problem occurred evaluating project ':chapter02:hello-world'.
Could not resolve all files for configuration ':chapter02:hello-world:compile'.
Could not resolve org.springframework:spring-context:5.0.4.BUILD-SNAPSHOT.
Required by:
    project :chapter02:hello-world
Could not resolve org.springframework:spring-context:5.0.4.BUILD-SNAPSHOT.
Could not get resource ...
Could not HEAD ...
Read timed out
org.gradle.tooling.BuildException: Could not run build action ...
I tried to minimize the impact of my changes on the original configuration file.

Basically, when I had an issue, I moved to a newer version, preferring a stable RELEASE to BUILD-SNAPSHOT's or release candidates.

I also had a problem for
io.projectreactor.ipc:reactor-netty:0.7.0.M1
That I moved to
io.projectreactor.ipc:reactor-netty:0.7.9.RELEASE
More complicated was the problem for chapter five. It looks that I have something wrong for aspectJ in my configuration. I didn't to spend too much time on that now, assuming that it would lead to to some detail in that specific area. So I opened the aspectj-aspects gradle build file and commented the dependency to gradle-aspectj and the plugin application for aspectj itself.

Now my build.gradle for chapter five - aspectj-aspects has these lines in:
}

    dependencies {
//      classpath "nl.eveoh:gradle-aspectj:2.0"
    }
}
// apply plugin: 'aspectj'
Sure enough, it won't compile. Still, I finally added the gradle nature to the project and I can go on reading the book. I'll take care of that issue when I'll reach chapter five.

I've forked the original APress repository, and there I pushed my variants for the main build.gradle file, and the one for chapter 5, aspectj-aspects. See there for details.

Go to the full post

A thread-safe DateFormatter via ThreadLocal

The exercise 2-b at the end of chapter two from the book Java 8 Lambdas by Richard Warburton asks to wrap a DateFormatter in a ThreadLocal so to make it thread safe.

We have spotted in legacy code, designed to work in a single thread context, something like:
DateFormatter formatter = // ...

// ...

Calendar cal = Calendar.getInstance();
cal.set(Calendar.YEAR, 1970);
cal.set(Calendar.MONTH, Calendar.JANUARY);
cal.set(Calendar.DAY_OF_MONTH, 1);
String formatted = formatter.getFormat().format(cal.getTime());
We know that DateFormatter, a Swing class used to format java util Date, is non thread safe. And now we plan to use this code in a multithreaded environment. We need to refactor it.

If we need to stick to the DateFormatter, here is a possible solution:
ThreadLocal<DateFormatter> formatter =  // 1
    ThreadLocal.withInitial(  // 2
        () -> new DateFormatter(  // 3
            new SimpleDateFormat("dd-MMM-yyyy", Locale.ENGLISH)));  // 4
1. Wrapping an instance of it in a ThreadLocal saves our day, since each thread has its own copy of the variable.
2. Since Java 8, the withInitial() static method let us create a ThreadLocal passing a Supplier that is going to be used to initialize the object.
3. That's the Supplier. Nothing is passed in, and it returns a new DateFormatter.
4. The DateFormatter will be constructed from this SimpleDateFormat built on the fly specifying the format as a string. Notice the second parameter, I want the dates to be localized in English whichever is the default locale.

Job done. Still it would be nice to ...

Switch to java.time

The classes in java.time have been designed for working correctly in multithreading. So, getting rid of Calendar and DateFormatter for LocalDate and DateTimeFormatter, would lead to code simpler and more robust, like this:
DateTimeFormatter formatter8 = DateTimeFormatter.ofPattern("dd-MMM-yyyy", Locale.ENGLISH);
// ...

LocalDate aDate = LocalDate.of(1970, 1, 1);
String formatted = formatter8.format(aDate);
I have pushed the above Java code in my GitHub repository forked from the one gently provided by the author.
Follow the links to see the Question2 solution and its test case.

Go to the full post

HackerRank Java Dequeue

Given n integers in input, consider each window in it sized m, where m is less or equal to n, to get the number of unique elements in it. Return its maximum value.

I found this problem in the HackerRank Java Challenges section with name Java Dequeue, as a hint that a Deque would helpful to solve it elegantly.

The HackerRank settings implies that data are passed us from System.in. To develop a method that could be easily tested, I made it accept an InputStream as parameter, then I create a Scanner on it, opened in a try-with-resource block.
public static int solution(InputStream is) {
    try (Scanner scanner = new Scanner(is)) {
        // ... see below
    }
}
The first to integers we expect in the input stream are the above described n and m. A problem constrain states that m is not bigger than n, I make it clear in the code with an assertion.
int n = scanner.nextInt();
int m = scanner.nextInt();
assert m <= n;
Now I should be ready to get the next n integers an work on them. The big hint in the problem name suggests to use a Deque. But which one? ArrayDeque looks promising, but it is a flawed choice, since we'll be push about all items in from one side and pop them from the other, this would lead to a useless continuous data shift in it. Much better using a LinkedList.

Since we work with the count of items in the window, a second container would be useful. We'll push any item entering the window in a hash table, storing as value its current count. When an item exits the window we decrease its count in the hash. If the count is zero, we remove it.

Since we are going to call our method many times in a row, it would be a waste create each time a deque and a hash. Better to make them class data member, and simply reset them when the method is called.
public class Solution {
    private static Deque<Integer> window = new LinkedList<>();
    private static Map<Integer, Integer> counter = new HashMap<>();

    public static int solution(InputStream is) {
        // ... see above

        window.clear();            
        counter.clear();
        
        // ... see below
    }
}
Let's fill up the window for the first time. The hash map 'counter' is set up as described above.
for (int i = 0; i < m; i++) {
    int in = scanner.nextInt();
    window.add(in);
    counter.merge(in, 1, Integer::sum);
}

int result = counter.size();

// ... see below
To count how many different integers are in the current window, I simply check the size of the hash map.

Now it is just a matter of iterating on all the other integer in the input stream.
for (int i = m; i < n && result < m; i++) {
    Integer out = window.remove();
    Integer in = scanner.nextInt();
    window.add(in);

    // ... see below
}

return result;
I have started looping on the m-th item, willing to go up to the n-th. But, wait, if I find a window with all different numbers I have already found the problem solution. That could save some CPU time.

Firstly I have adjusted the window, removing its first element and adding a new last element. The exiting and entering elements are used to modify the counter map.
if (out.intValue() != in.intValue()) { // 1
    counter.merge(in, 1, Integer::sum);  // 2
    counter.merge(out, -1, (a, b) -> a == 1 ? null : a + b); // 3
    result = Math.max(result, counter.size()); // 4
}
1. If what enters in the window is the same of what exists from it, I don't have to do anything.
2. I merge the item 'in' in the map. That means, if the map contains 'in', Integer::sum is used to adjust its value (on its current value and the '1' I passed in). Otherwise a new item is created in the map, key 'in', value '1'.
3. Similarly to the previous line, but now I'm performing a sort of 'merge down'. I'm not satisfied by this line I wrote, even though it is kind of fun. Its point is that we know for sure that 'out' is in the map, so we know that we are going to execute the lambda passed to merge(). It would return null if the value associated to 'out' is 1, removing it. Otherwise it would return a + b, but b is set to -1, so it would decrease it. The weak spot is, what if for 'out' is not in the map? Well, merge() would push it in, with value -1. Damn it. There is no way to avoid this inconsistency, since merge() would throw an exception if null is passed as value. End of the story is, I would not use this line in production code.
4. Maybe this check-and-set looks a bit too pythonic, what do you think about it?

Full Java code and test case available on GitHub.

Go to the full post

HackerRank Java Sort

We have a Student class, containing an int, a String and a double data member. We want read a bunch of Students from a data stream, push them all in list, sort them by the double - decreasing, then string, and then int. In the end we'll output just the student names.

This apparently boring little HackerRank problem shows how much more fun has become Java since functional programming has been introduced in it.

I won't touch the Student class, limiting my job to the main class where I create a Scanner on the input stream. Once I read the number of students to work with, Stream is my good friend.
Stream.generate(() -> new Student(sc.nextInt(), sc.next(), sc.nextDouble())) // 1
    .limit(count) // 2
    .sorted(Comparator // 3
        .comparingDouble(Student::getCgpa).reversed() // 4
        .thenComparing(Student::getFname) // 5
        .thenComparingInt(Student::getId)) // 6
    .map(Student::getFname) // 7
    .forEach(System.out::println); // 8
1. The stream comes up by data provided on the fly, reading them from the scanner - named sc. I read an int, a string, a double from it, I shovel them in the Student ctor, that is used in a lambda as supplier for the stream generated() method.
2. I don't want generate() to be called forever, the stream size is limited by the number of students, that I have previously read in 'count'.
3. I have to sort my stream, to do that I provide a Comparator.
4. Firstly I compare (primitive) doubles as provided by the Student.getCgpa() method. But, wait, I don't want the natural order, so I reverse it.
5. Second level of comparison, the Student first name.
6. Third level, the (primitive) int representing the Student id should be used.
7. Almost ready to print the result. Since I'm interested only in the student names, I map each student to her/his name only.
8. Loop on the stream, containing the opportunely sorted student names, and print each one on a new line.

You could get the full Java code from GitHub.

Go to the full post

HackerRank Java 1D Array (Part 2)

Don't be misled by its puzzling name, this fun little HackerRank problem has its main interest not in its implementation language (I should admit that I designed it thinking more pythonesquely than javaly) nor in the use of the array data structure, but in the algorithm you should think to solve it.

There is a one dimensional board. Each cell in it could be set to zero or one. At startup the pawn is placed on the first cell to the left, and its job is going out to the right. It could be moved only on zero cells moving forward by 1 or 'leap' positions, and backward just by one.

Having specified the zeros and ones on the board, and the value of leap (non negative and not bigger than the board size), can we say if the game could be won?

It looks to me that Dynamic Programming is the a good approach to be used here. The problem splits naturally in a series of simple steps, that contribute to the requested solution. There's one issue that muddies the water, the backward move. The rest of the algorithm is pretty straightforward. Start from the last cell in the board. Determine if a pawn could be placed there and if this would lead to a win. Move to the next one of its left, up to the leftmost one. Then return the win condition for the first cell.

I implemented my idea in this way using, as required, Java as implementation language.

Firstly, I create a cache to store all the intermediate values:
boolean[] memo = new boolean[game.length];
if (game[game.length - 1] == 0)
    memo[memo.length - 1] = true;
They are all initialized to false, but the last one, and only if a pawn could go there.

Having set this initial condition, I could loop on all the other elements.
for (int i = memo.length - 2; i >= 0; i--) {
    // ... see below
}

return memo[0];
In the end, the first cell of my cache would contain the solution.

Each cell in the board is set to true if one of these conditions holds:
if (game[i] == 0 && (i + leap >= game.length || memo[i + 1] || memo[i + leap])) {
    memo[i] = true;
    // ... see below
}
First thing, check the current value in the board. If it is not zero, I can't put the pawn there, so there is nothing to do, the cache value surely stays set to false.
Otherwise, one of the following three conditions lead to true in cache:
- adding the current position to the leap value pushes the pawn out of the board
- the next cell in the board is a good one
- the next leap cell is a good one.

Now the funny part. When I set to true a cell, I should ensure that a few cells to the right are marked as good, too. This is due to the backward move. It is a sort of domino effect, that should stop when we find a cell where our pawn can't go.
for (int j = i + 1; j < memo.length && game[j] == 0; j++)
    memo[j] = true;
Full Java code and test case pushed on GitHub.

Go to the full post

Partitioning Souvenirs (patched)

Given a list of integers, we want to know if there is a way to split it evenly in three parts, so that the sum of each part is the same than the other ones.

Problem given in week six of the edX MOOC Algs200x Algorithmic Design and Techniques by the UC San Diego.

My first solution, as you can see in a previous post (please check it out for more background information), was accepted in the MOOC but didn't work in a few cases, as Shuai Zhao pointed out.

So, I added the Shuai test case to my testing suite, and then a couple of other ones, that I felt would help me in writing a more robust solution:
def test_shuai(self):
    self.assertFalse(solution([7, 2, 2, 2, 2, 2, 2, 2, 3]))

def test_confounding_choice(self):
    self.assertFalse(solution([3, 2, 2, 2, 3]))

def test_duplicates(self):
    self.assertTrue(solution([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]))
Firstly, I rewrote the initial check, that now looks in this way:
def solution(values):
    third, module = divmod(sum(values), 3)  # 1
    if len(values) < 3 or module or max(values) > third:  # 2
        return False
1. Add up the passed values, our target is its third, and surely we can't split it there is a remainder after the division. Using divmod() we get both values in a single shot.
2. If we have less than three values in input, a remainder by dividing their sum by three, or if there is a value bigger than a third of the sum, our job is already done.

Then, since I saw that the problem was due to the lack of check in the ownership for values, I added a shadow table to the one I'm using for the Dynamic Programming procedure. I named it 'taken', and it keeps the id of the owner for the relative value for the current row. Zero means the value is currently not taken.
table = [[0] * (len(values) + 1) for _ in range(third + 1)]
taken = [[0] * (len(values) + 1) for _ in range(third + 1)]
As before, I apply the standard DP procedure, looping on all the "valid" elements.
for i in range(1, third + 1):
    for j in range(1, len(values) + 1):
        # ...
Another not a substantial change, it occurred to me that I keep track of a maximum of two positive matches in the table, so, if I have already reached that value, there's no need in doing other checks, just set the current value to two.
if table[i][j-1] == 2:
    table[i][j] = 2
    continue
The next check was already in, I just refactored it out, because the combined if clause in which it was included was getting too complicated, and I added the first use of the taken table.
if values[j-1] == i:
    table[i][j] = 1 if table[i][j-1] == 0 else 2  # 1
    taken[i][j] = j  # 2
    continue

ii = i - values[j-1]  # 3
1. If the current value is the same of the current row value, I have found a match. So I increase its value in the DP table
2. And I mark it as taken for the j value in the current row.
3. If j is not an exact match for i, I split it, looking for the difference in the previous column.

Now it's getting complicated. If there is a good value in the previous column, before using it we have to ensure it is available. And, since it could have been the result of a splitting, we need to check on the full row to do our job correctly.
available = True
taken_id = taken[ii][j-1]  # 1
if taken_id:
    for jj in range(1, j):  # 2
        if taken[ii][jj] == taken_id:  # 3
            if taken[i][jj]:
                available = False
                break
1. The current value id from the taken table is zero if was not used in the 'ii' row.
2. If the current value has a valid id, we need to ensure it has not be already used in current 'i' row.
3. The current jj is marked as used in the ii row for the id we care about, if it is taken also in the i row, we have a problem. We are trying to using in 'i' a value that has been already used. This was the check I missed.

Good, now I can safely going on with my DP algorithm.
if ii > 0 and table[ii][j - 1] > 0 and not taken[i][j - 1] and available:
    table[i][j] = 1 if table[i][j - 1] == 0 else 2
    
    // ...
The annoying thing is that I have to set the taken table too:
taken[i][j] = j  # 1
taken[i][j-1] = j

matcher = values[j-1] + values[j-2]  # 2
if taken_id:
    for jj in range(1, len(values)):  # 3
        if taken[ii][jj] == taken_id:
            taken[i][jj] = j
            matcher += values[jj-1]
if matcher < i:
    for jj in range(j-2, 0, -1):  # 4
        if taken[i][jj] or matcher + values[jj-1] > i:
            continue
        matcher += values[jj-1]
        taken[i][jj] = j
        if matcher == i:
            break
1. This is nice and simple. The current j and the previous one should be marked as j.
2. We have to mark all the other values concurring to generate the current sum. These are two just marked 'j' on this row, all the ones marked as stored in taken_id on 'ii' and, if they are not enough, also the ones free on 'i', to the get the expected result. To perform this last check I need another variable, that I called 'matcher'.
3. Mark on 'i' as 'j' all the items marked taken_id on 'ii'. And adjust matcher.
4. If needed, loop on the left elements looking for the missing values.

And finally:
        # ...
        else:  # 1
            table[i][j] = table[i][j - 1]

return True if table[-1][-1] == 2 else False  # 2
1. This 'else' follows the 'if ii > 0 and ...' above, ensuring the current cell is adjourned correctly if nothing else happened before.
2. The bottom right cell in our DP table should be set to 2 only if we find a correct 3-way split.

See full python code and test cases on GitHub.

Go to the full post

HackerRank Deque-STL

Given an array of integers, find the max value for each contiguous subarray in it sized k. This HackerRank problem is meant to be solved in C++ and, as its name suggests, using a deque.

For instance, if we are given the array {3, 4, 6, 3, 4} and k is 2, we have to consider four subarrays sized:
{3,4} {4,6} {6,3} {3,4} 
And the expected solution is
{4, 6, 6, 4}
An adapter

The original HackerRank problem asks to write a function than outputs its result to standard output. I didn't like much this requisite. As a TDD developer, I'm used to let tests drive the code development. And having to check standard output to verify a function behavior is not fun. So I slightly changed the function signature, asking to return a vector containing the results, and I used the original function as a simple adapter to the original problem. Something like that:
std::vector<int> maxInSubs(int data[], int n, int k)
{
    // ...
}

// ...
void printKMax(int arr[], int n, int k)
{
    auto data = maxInSubs(arr, n, k);
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}
First (naive) attempt

Just do what we are asked to to. For each subarray find its maximum value ad push it to the result vector.
std::vector<int> maxInSubs(int data[], int n, int k)
{
    std::vector<int> results;
    for (int i = 0; i < n - k + 1; ++i)
    {
        results.push_back(*std::max_element(data + i, data + i + k));
    }
    return results;
}
Clean and simple and, when k and n are small, not even too slow. However, for k comparable to a large n we can say bye bye to performance.

Patched naive attempt

We could be tempted to save the algorithm explained above, observing that the slowness is due to the k calls to max_element(). We could avoid to call it a substantial number of times checking the value of the elements exiting and entering the current window, for instance in this way:
std::vector<int> results{ *std::max_element(data, data + k) };  // 1

for (size_t beg = 1, end = k + 1; end <= n; ++beg, ++end)  // 2
{
    if (data[end - 1] > results[results.size() - 1])  // 3
    {
        results.push_back(data[end - 1]);
    }
    else if (data[beg - 1] < results[results.size() - 1])  // 4
    {
        results.push_back(results[results.size() - 1]);
    }
    else  // 5
    {
        results.push_back(*std::max_element(data + beg, data + end));
    }
}
1. Initialize the result vector with the max element for the first interval.
2. Keep beg and end as loop variable, describing the current window to check.
3. The new right element of the window is bigger than the max for the previous window. Surely it is the max for this one.
4. The element that has just left the window is smaller than the previous max. Surely the max is still in the window.
5. Otherwise, we'd better check which is the current max.

A smartly designed array in input could beat this simple algorithm. However on HackerRank they didn't spend too much time on this matter, and this solution is accepted with full marks.

Solution with a deque

In a more elegant solution, we should to minimize the multiple check we perform on the data elements. Right, but how? Until this moment, I haven't paid attention to the huge hint HackerRank gave us, "Use a deque!", they shout from the name of the problem itself.

The point is that I want to perform a cheap cleanup on each window, so that I could just pick a given element in it, without scanning the entire interval.

Let's use the deque as a buffer to store only the reasonable candidates as max. Since we want to remove from this buffer the candidates that are not anymore valid when the window is moved, instead of their value we keep in it their indices from the original data array.

Here is how I initialize it:
std::deque<int> candidates{ 0 };  // 1
for (int i = 1; i < k; ++i)
{
    pushBack(candidates, data, i);  // 2
}
1. We could safely say that the first element in data is a good candidate as max for its first subarray.
2. Push back to candidates the "i" index from data, but first ensure the previous candidates are useful.

Since the code in pushBack() is going to be used also afterward, I made function for it:
void pushBack(std::deque<int>& candidates, int data[], int i)
{
    while (!candidates.empty() && data[i] >= data[candidates.back()])  // 1
        candidates.pop_back();
    candidates.push_back(i);
}
1. There is no use in a candidate, if the newcomer is bigger, so remove it.

Now candidates contains the indices of all the elements in the first window on data having the max value. Possibly just one element, but for sure the deque is not empty.

We are ready for the main loop:
for (int i = k; i < n; ++i)
{
    results.push_back(data[candidates.front()]);  // 1

    if (candidates.front() <= i - k)  // 2
        candidates.pop_front();

    pushBack(candidates, data, i);  // 3
}
results.push_back(data[candidates.front()]);  // 4
1. As said above, we know that candidates is not empty and its front is the index of a max value in the current window. Good. Push it to results.
2. Now we prepare for the next window. If the front candidate is out, we remove it.
3. Push back the new element index among the candidates, following the algorithm described above. It would kill the candidates that are not bigger than it, ending up with a deque where the biggest element is surely on front.
4. Remember to push the last candidate in the results, and then the job is done.

Does this solution look more convincing to you? Full C++ code and test case on GitHub.

Go to the full post

HackerRank Equal

We have a list of integers, and we want to know in how many rounds we could make them all equal. In each round we add to all the items in the list but one the same number chosen among 1, 2, and 5. Pay special attention when you solve this problem on HackerRank. Currently (April 2018) its enunciation says each increase should be of one, three, or five. However, this is not what the solution is tested for.
It looks like someone decided to change 2 for 3 a few months ago, edited the description and then forgot to modify the testing part. Who knows what is going to happen next. Be ready to be surprised.
Besides, it is inserted in the Algorithms - Dynamic Programming section, but I don't think that is the right approach to follow.

Simplifying the problem

Instead of applying the weird addition process stated in the problem, we could decrease each time a single item. For instance, given in input [2, 2, 3, 7], a solution to the original problem is:
2, 2, 3, 7
+5 +5 +5  = (1)
 7, 7, 8, 7
+1 +1  = +1 (2)
 8, 8, 8, 8
We could solve it in this way instead:
2, 2, 3, 7
 =  =  = -5 (1)
 2, 2, 3, 2
 =  = -1  = (2)
 2, 2, 2, 2
Since we are asked to calculate the number of steps to get to the equal state, in both ways we get the same result.

Base cases

A sort of rough algorithm is already emerging. We get the lowest value in the list, and decrease all the other elements using the available alternatives until we reach it. We start from the biggest value (5) and only when we are forced to, we fall back to the shorter steps.

To better understand how we should manage the last steps, I put the base cases on paper.
If x is at level 1, 2 or 5, we could get zero at cost 1. But if x is at level 3 or 4, it costs 2 to us. Notice that if, instead of getting to level zero, we move both the lowest element and the x element at level -2, we get to the result in the same number of moves. For instance, if our input list is [10, 13]
10, 13
    -2 (1)
    -1 (2)
10, 10
 
10, 13
-2     (1)
    -5 (2)
 8,  8
If the input list is [0, 3, 3] getting down to the bottom from 3 in just one move gives us an advantage:
10, 13, 13
    -2      (1)
    -1      (2)
        -2  (3)
        -1  (4)
10, 10, 10
 
10, 13, 13
-2          (1)
    -5      (2)
        -5  (3)
 8,  8,  8
The algorithm

I think I've got it. Firstly I find the minimum value, M, in the input list. That is a possible target for all the items to be reach. However I have to check other two alternatives, M-1 and M-2.
I loop on all the items in the list. For all the three possible targets, I calculate the difference between it and the current value, count the number of steps to get there, and add it to the total number of steps required for getting to that target.
And then I choose as a result the cheapest target to reach.

The code

Using Python as implementation language, I started with a couple of test cases, and then added a few ones along the way, when I bumped into troubles, and I ended up with this code.
SHIFT = [0, 1, 2]  # 1


def solution(data):
    lowest = min(data)  # 2

    results = [0] * len(SHIFT)  # 3
    for item in data:
        for i in SHIFT:
            gap = item - lowest + i  # 4
            results[i] += gap // 5 + SHIFT[(gap%5 + 1) // 2]  # 5
    return min(results)  # 6
1. Represents the three possible targets, from the minimal value in the list down to -2.
2. Get the minimal value in the list.
3. Buffer for the results when using the different targets.
4. Distance that has to be split.
5. Add the current number of steps to the current buffer. Firstly I get the number of long moves dividing gap by 5. Now I am in the base case, as showed in the picture above. Notice that the cost of moving from X to target is [0, 1, 1, 2, 2] for gap in [0..5], if we take gap, apply modulo five, increase it and then divide by two, we get the index in SHIFT to the number of steps actually needed. Admittedly an obscure way to get there, if this was production code, I would have probably resisted temptation to use it.
6. Get the minimal result and return it.

All tests passed in hackerrank, python script and test case pushed to GitHub.

Go to the full post

HackerRank Roads and Libraries

We are given n nodes and a (possibly huge) number of edges. We are also given the cost of building a library in a city (i.e. a node) and a road (i.e. an edge). Based on these data we want to minimize the cost of creating a forest of graphs from the given nodes and edges, with the requirement that each graph should have a library on one of its nodes. This is a HackerRank problem on Graph Theory algorithms, and I am about to describe my python solution to it.

If a library is cheaper than a road, the solution is immediate. Build a library on every node.
def solution(n, library, road, edges):
    if road >= library:
        return n * library

    # ...
Otherwise, we want to create a minimum spanning forest, so to minimize the number of roads, keeping track of the number of edges used and how many graphs are actually generated. I found natural using an adapted form of the Kruskal MST (Minimum Spanning Tree) algorithm, that looks very close to our needs.

Kruskal needs a union-find to work, and this ADT is not commonly available out of the box. So, I first implemented a python UnionFind class, see previous post for details.
Then, while working on this problem, I made a slight change to it. My algorithm was simpler and faster if its union() method returned False when nothing was actually done in it, and True only if it led to a join in two existing graph.

Using such refactored UnionFind.union(), I wrote this piece of code based on Kruskal algorithm:
uf = UnionFind(n)
road_count = 0  # 1

for edge in edges:
 if uf.union(edge[0] - 1, edge[1] - 1):  # 2
  road_count += 1  # 4
  if uf.count == 1:  # 5
   break
1. The union-find object keeps track of the numbers of disjointed graphs in the forest, but not of edges. This extra variable does.
2. I need to convert the edges from 1-based to 0-based convention before use them. If the two nodes get connected by this call to union(), I have some extra work to do.
4. An edge has been used by union(), keep track of it.
5. If union() connected all the nodes in a single graph, there is no use in going on looping.

Now it is just a matter of adding the cost for roads and libraries to get the result.
return road_count * road + uf.count * library

Complete python code for problem, union-find, and test case on GitHub.

Go to the full post

Union Find

I have a (possibly huge) bunch of edges describing a forest of graphs, and I'd like to know how many components it actually has. This problem has a simple solution if we shovel the edges in a union-find data structure, and then just ask it for that piece of information.

Besides the number of components, our structure keeps track of the id associated to each node, and the size of each component. Here is how the constructor for my python implementation looks:
class UnionFind:
    def __init__(self, n):  # 1
        self.count = n
        self.id = [i for i in range(n)]  # 2
        self.sz = [1 for i in range(n)]  # 3
1. If the union-find is created for n nodes, initially the number of components, named count, is n itself.
2. All the nodes in a component have the same id, initially the id is simply the index of each node.
3. In the beginning, each node is a component on its own, so the size is initialized to one for each of them.

This simple ADT has two operations, union and find, hence its name. The first gets in input an edge and, if the two nodes are in different components, joins them. The latter returns the id of the passed node.

Besides, the client code would check the count data member to see how many components are in. Pythonically, this is exposure of internal status is not perceived as horrifying. A more conservative implementation would mediate this access with a getter.

Moreover, a utility method is provided to check if two node are connected. This is not a strict necessity, still makes the user code more readable:
def connected(self, p, q):
    return self.find(p) == self.find(q)
The meaning of this method is crystal clear. Two nodes are connected only if they have the same id.

In this implementation, we connect two nodes making them share the same id. So, if we call union() on p and q, we'll change the id of one of them to assume the other one. Given this approach, we implement find() in this way:
def find(self, p):
    while p != self.id[p]:
        p = self.id[p]
    return p
If the passed p has id different from its default value, we check the other node until we find one that has its original value, that is the id of the component.

We could implement union() picking up randomly which id keep among the two passed, but we want keep low the operational costs, so we work it out so to keep low the height of the tree representing nodes in a component, leading to O(log n) find() complexity.
def union(self, p, q):
    i = self.find(p)
    j = self.find(q)
    if i != j:  # 1
        self.count -= 1  # 2
        if self.sz[i] < self.sz[j]:  # 3
            self.id[i] = j
            self.sz[j] += self.sz[i]
        else:
            self.id[j] = i
            self.sz[i] += self.sz[j]
1. If the two nodes are already in the same component, there is no need of doing anything more.
2. We are joining two components, their total number in the union-find decrease.
3. This is the smart trick to keep low the cost of find(). We decide which id to keep as representative for the component accordingly with the sizes of the two merging ones.

As example, consider this:
uf = UnionFind(10)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
I created a union-find for nodes in [0..9], specifying eight edges among them, from (4, 3) to (6, 1).
As a result, I expect two components and, for instance, to see that node 2 and node 6 are connected, whilst 4 and 5 not.

I based my python code on the Java implementation provided by Robert Sedgewick and Kevin Wayne in their excellent Algorithms, 4th Edition, following the weighted quick-union variant. Check it out also for a better algorithm description.

I pushed to GitHub full code for the python class, and a test case for the example described above.

Go to the full post

HackerRank Climbing the Leaderboard

We are given two lists of integers. The first one is monotonically decreasing and represent the scores of the topmost players in a leaderboard. The second one is monotonically increasing and contains the score history of Alice, a player who rapidly climbed the board.
Following the dense ranking convention, we want to get back a list containing the history of rank positions for Alice.
This is a HackerRank Algorithm Implementation problem, and I am going to show you how I solved it, using Python as implementation language.

I noticed that the first list, scores, is already sorted, we just have to get rid of duplicates to have a matching between the position and the score Alice has to get to achieve that ranking.

Bisect solution

I just have to do the matching. First idea jumped to my mind, was performing a binary search on scores to do it. It helps that Python provides for the job a well known library, bisect. There's just a tiny issue, bisect expects the underlying list to be sorted in natural order, so we need to reverse our scores.

It looks promising, let's implement it.

A pythonic way to get our ranking would be this:
ranking = sorted(list(set(scores)))
I get the original list, convert to a set to get rid of duplicates, than back to list, so that I can sort it in natural order. Nice, but in this problem we are kind of interested in performance, since we could have up to 20 thousand items in both lists. So we want to take advantage of the fact that the list is already sorted.

So, I ended up using this rather uncool code:
ranking = [scores[-1]]
for i in range(len(scores)-2, -1, -1):
 if scores[i] > ranking[-1]:
  ranking.append(scores[i])
I initialize the ranking list with the last item in scores, then I range on all the other indices in the list from right to left. If the current item is bigger than the latest one pushed in ranking, I push it too.

Now I can just rely on the bisect() function in the bisect python module, that would find which position the current item should be inserted in the list. With a caveat, I have reverted the order, so I have to adjust the bisect() result to get the result I'm looking for:
results = []
last = len(ranking) + 1
for score in alice:
 results.append(last - bisect(ranking, score))
This code pass all the tests, job done.

However. Do I really need to pay for the bisect() search for each element of alice?

Double scan solution

Well, actually, we don't. Since we know that both list are sorted, we can use also the ordering in alice to move linearly in ranking.

Since we are not using anymore bisect, we don't need to revert the sorting order in ranking, and the duplicate cleanup is getting a bit simpler:
ranking = [scores[0]]
for score in scores[1:]:
 if score < ranking[-1]:
  ranking.append(score)

Now we compare each item in alice against the items in ranking moving linearly from bottom to head:
results = []
for score in alice:
 while ranking and score >= ranking[-1]:
  ranking.pop()
 results.append(len(ranking) + 1)
We don't have to be alarmed by the nested loops, they don't have a multiplicative effect on the time complexity, since we always move forward on both lists, the result is a O(M + N) time complexity.

Is this a better solution than the first one? Well, it depends. We should know more on the expected input. However, for large and close values of N and M, it looks so.

I pushed the python script for both solutions and a test case to GitHub.

Go to the full post

HackerRank Divisible Sum Pairs

Given a list of integers, we want to know how many couples of them, when summed, are divisible by a given integer k.

So, for instance, given [1, 2, 3, 4, 5, 6], we have five couples of items with sum divisible by 3:
(1, 2), (1, 5), (2, 4), (3, 6), (4, 5)
This is a HackerRank algorithm problem, implementation section.

Naive solution

Test all the couples, avoiding duplicates. If we check (a1, a2), we don't have to check (a2, a1).

The code I have written for this solution should be of immediate comprehension, even if you are not that much into Python:
result = 0
for i in range(len(values) - 1):  # 1
    for j in range(i+1, len(values)):  # 2
        if (values[i] + values[j]) % k == 0:  # 3
            result += 1
1. Loops on all the indices in the list but the last one.
2. Loops from the next index to the current "i" to the last one.
3. Check the current couple, and increase the result if compliant.

Even if this is what HackerRank was expecting from us (the problem is marked as easy), we can do better than this, considering its disappointing O(N^2) time complexity.

Linear solution

The problem could be restated as counting the couples that, added up, equal to zero modulo k. Following this insight, let's partition the items accordingly to their own modulo.
remainders = [0] * k
for i in range(len(values)):
    remainders[values[i] % k] += 1
Each element in the "remainders" list represents the number of items in the original list having as modulo the index of the element.

For the example shown above we'll get this remainders:
[2, 2, 2]
Now, if we add an element having remainder x to element with remainder k - x, we'll get a number equal zero modulo k. We want all the possible combinations of the x elements with the k - x ones, so we apply the Cartesian product to the two sets, that has a size that is their product.

There are a couple of exceptions to this rule. The elements having modulo zero have to be added among themselves, and the same happens to the element having as modulo half k, if k is even. The number of combinations of a set of N elements could be expressed as N * (N-1) / 2.

Putting all together we have this code:
result = remainders[0] * (remainders[0] - 1) // 2  # 1

pivot = k // 2  # 2
if k%2:
    pivot += 1  # 3
else:
    result += remainders[k//2] * (remainders[k//2] - 1) // 2  # 4

for i in range(1, pivot):  # 5
    result += remainders[i] * remainders[k-i]
1. Initialize "result" using the above described formula for the modulo zero items.
2. Let's calculate the central element in the list, where we have stop looping to sum up.
3. If k is odd, we won't have a lonely central element, and the pivot should be moved a step to the right.
4. When k is even, the elements having half-k modulo are managed as the zero modulo ones.
5. Normal cases.

After the for-loop, result contains the answer to the original question.

I pushed a python script with both solutions, naive and remainder based, to GitHub, along with a few tests.

Go to the full post

Boost ASIO echo UDP asynchronous server

A change request for the echo UDP client-server app discussed before. We want keep the client as is, but we need the server be asynchronous.

Instead of using the synchronous calls receive_from() and send_to() on a UDP socket, we are going to use their asynchronous versions async_receive_from() and async_send_to().

The asynchronicity leads naturally to implement a class, having a socket has its private data member, so that we can make our asynchronous call on it.
const int MAX_LEN = 1'024;
const uint16_t ECHO_PORT = 50'015;

class Server
{
private:
    udp::socket socket_;  // 1
    udp::endpoint endpoint_;  // 2
    char data_[MAX_LEN];  // 3

// ...
1. Our ASIO UDP socket.
2. The endpoint we use to keep track of the client currently connected to the server.
3. Data buffer, used to keep track of the message received from the client.

The constructor gets the app ASIO I/O context by reference from the caller and uses it to instantiate its member socket. Then it calls its private method receive() to start its endless job.
Server(ba::io_context& io) : socket_(io, udp::endpoint(udp::v4(), ECHO_PORT))  // 1
{
    receive();
}

void receive()
{
    socket_.async_receive_from(ba::buffer(data_), endpoint_, [this](bs::error_code ec, std::size_t len) {  // 2
        if (!ec && len > 0)  // 3
        {
            send(len);
        }
        else
        {
            receive();  // 4
        }
    });
}
1. The socket requires also an endpoint describing the associated protocol and port. We create it on the fly.
2. Call asynchronously receive_from on the socket. ASIO would put in the data buffer what the client sends and store its network information in the passed endpoint. When the socket receive is completed, ASIO would call the handler passed as third parameter, here a lambda that captures "this" and honors the expected parameters.
3. If the receiving worked fine - no error_code reported - and the message is not empty, we'll call our Server send() method, to echo the message.
4. Otherwise - error or empty message - we don't have to send anything back, so we call the enclosing receive() method, to serve a new client.

When a "good" message is received from a client, our server sends it back to it as is:
void send(std::size_t len)
{
    socket_.async_send_to(ba::buffer(data_, len), endpoint_, std::bind(&Server::receive, this));
}
The socket gets the job of asynchronously send the data, as stored in the Server member variable, with the length, as passed in as parameter, to the endpoint saved as Server member variable. When the data transfer is completed, ASIO would call the handler passed as third argument. Here we don't want to do anything in case or error, not even logging something, so we can simply bind to the handler a call to "this" receive(), ignoring error code and length of the transferred data.

I pushed the complete C++ source file to GitHub. The code is based on the UDP asynchronous echo example in the official Boost ASIO documentation.

Go to the full post

Boost ASIO echo TCP asynchronous server

Let's refactor the echo TCP server to achieve asynchrony. It's going to be a lot of fun. If you feel that it is too much fun, you could maybe have first a look at the similar but a bit simpler asynchronous server discussed in a previous post.

Main

This server works with the same clients as seen for the synchronous server, here we deal just with the server. All the job is delegated to the Server class, whose constructor gets a reference to the application ASIO I/O context.
namespace ba = boost::asio;
// ...

Server server(io);
io.run();
Server

The server ctor initialize its own acceptor on the ASIO I/O context on the endpoint specifying the TCP IP protocol and port chosen, then it calls its private member method accept():
using ba::ip::tcp;
// ...

const uint16_t ECHO_PORT = 50'014;
// ...

class Server
{
private:
 tcp::acceptor acceptor_;

 void accept()
 {
  acceptor_.async_accept([this](bs::error_code ec, tcp::socket socket)  // 1
  {
   if (!ec)
   {
    std::make_shared<Session>(std::move(socket))->read();  // 2
   }
   accept();  // 3
  });
 }
public:
 Server(ba::io_context& io) : acceptor_(io, tcp::endpoint(tcp::v4(), ECHO_PORT))
 {
  accept();
 }
};
1. As handler to async_accept() is a lambda that gets as parameters an error code that let us know if the connection from the client has been accepted correctly, and the socket eventually created to support the connection itself.
2. A beautiful and perplexing line. We create a shared_prt smart pointer to a Session created from a rvalue reference to the socket received as parameter, and call on it its read() method. However this anonymous variable exits its definition block on the next line, so its life is in danger - better see what is going on in read(). Actually, we are in danger, too. If something weird happens in this session object, we don't have any way to do anything about.
3. As soon as a Session object is created, a new call to accept() is issued, an so the server puts itself in wait for a new client connection.

Session

As we have seen just above, we should expect some clever trick from Session, especially in its read() method. Thinking better about it, it is not a big surprise seeing that its superclass is enable_shared_from_this:
class Session : public std::enable_shared_from_this<Session>
{
private:
 tcp::socket socket_;
 char data_[MAX_LEN];

// ...
public:
 Session(tcp::socket socket) : socket_(std::move(socket)) {}  // 1

 void read()  // 2
 {
  std::shared_ptr<Session> self{ shared_from_this() };  // 3
  socket_.async_read_some(ba::buffer(data_), [self](bs::error_code ec, std::size_t len) {  // 4
   if (!ec)
   {
    self->write(len);
   }
  });
 }
};
1. The ctor gets in the socket that we seen was created by the acceptor and moved in, in its turn, the constructor moves it to its data member.
2. The apparently short lived Session object created by the handler of async_accept() calls this method.
3. A new shared_ptr is created from this! Actually, being such, it is the same shared_prt that we have seen in the calling handler, just its use counter increased. However, our object is still not safe, we need to keep it alive until the complete read-write cycle between client and server is completed.
4. We read asynchronously some bytes from the client. To better see the effect, I have set the size of the data buffer to a silly low value. But the more interesting part here is the handler passed to async_read_some(). Notice that in the capture clause of the lambda we pass self, the shared pointer from this. So our object is safe till the end of the read.

So far so good. Just remember to ensure the object doesn't get invalidated during the writing process:
void write(std::size_t len)
{
 std::shared_ptr<Session> self{ shared_from_this() };
 ba::async_write(socket_, ba::buffer(data_, len), [self](bs::error_code ec, std::size_t) {
  if (!ec)
  {
   self->read();
  }
 });
}
Same as in read(), we ensure "this" stays alive creating a shared pointer from it, and passing it to the async_write() handler.

As required, as the read-write terminates, "this" has no more live references. Bye, bye, session.

I have pushed my C++ source file to GitHub. And here is the link to the original example from Boost ASIO.

Go to the full post

Boost ASIO echo UDP synchronous client-server

Close to the previous post. The main difference that there we have seen a TCP-based data exchange while here we see a UDP echo.

Server

This server is simpler than the previous one. Just one connection is served at a time.
udp::socket socket(io, udp::endpoint(udp::v4(), ECHO_PORT));  // 1

for (;;)  // 2
{
 char data[MAX_LEN];
 udp::endpoint client;
 size_t len = socket.receive_from(ba::buffer(data), client);  // 3

 // ...
 socket.send_to(ba::buffer(data, len), client);  // 4
}
1. Create an ASIO UDP socket on the app io_context, on a UDP created on the fly where the UDP IP protocol and the port to be used are specified.
2. Forever loop to serve, in strict sequential order, all the requests coming from clients.
3. ASIO blocks here, expecting the socket to receive a connection from a client. Make sure that the buffer data is big enough.
4. Since this is an echo server, nothing exciting happens between receiving and sending. Here we send the data, as received, to the endpoint as set by receive_from().

Client
char request[MAX_LEN];
// ...

udp::socket socket{ io, udp::endpoint(udp::v4(), 0) };  // 1
udp::resolver resolver{ io };
udp::endpoint destination = *resolver.resolve(udp::v4(), host, ECHO_PORT_STR).begin();  // 2
socket.send_to(ba::buffer(request, std::strlen(request)), destination);  // 3

char reply[MAX_LEN];
udp::endpoint sender;
size_t len = socket.receive_from(ba::buffer(reply), sender);  // 4
// ...
1. Create a UDP socket on the ASIO I/O context. Notice that the UDP endpoint passed specify the IP protocol but not a valid port.
2. The destination endpoint, that refers to the server, is generated by the resolver created on the line above, that resolves the specified host and port for the given UDP IP protocol. Then the first result is taken (by the begin iterator and then dereferencing). In case of any trouble we have guarantee an exception is thrown by resolve().
3. Send the data through buffer to the socket, that mediates the connection to the server.
4. Once send_to() has ended its job (notice that it is a blocking function), we get the reply from the server calling receive_from(). The socket knows where to go and get the data, and will fill the passed endpoint (sender) with these information.

I pushed the full C++ code - both client and server in the same source file - to GitHub. I based them on blocking_udp_echo_server.cpp and blocking_udp_echo_client.cpp from the official Boost ASIO Tutorial.

Go to the full post

Boost ASIO echo TCP synchronous client-server

I think this echo client-server application is a good introduction to ASIO. The server creates a new TCP socket each time it receives a request from a client, and run it in a new thread, where the read-write activity is accomplished in a synchronous way. The client sends some data to the server, gets it back, and then terminates.
The structure is simple, still a few interesting points are touched.

Client

Given io, the app ASIO io_context, and the server hostname as a string, the client tries this block, and eventually just output to console an exception.
namespace ba = boost::asio;
using ba::ip::tcp;
// ...

tcp::socket socket{ io };  // 1
tcp::resolver resolver{ io };
ba::connect(socket, resolver.resolve(host, ECHO_PORT_STR));  // 2

// ...
ba::write(socket, ba::buffer(request, reqLen));  // 3

char reply[CLIENT_MAX_LEN];  // 4
size_t repLen = ba::read(socket, ba::buffer(reply, reqLen));  // 4
// ...
1. Create an ASIO TCP socket and a resolver on the current io_context.
2. Then resolve() the resolver on the host and port of the echo server (in my case, localhost:50014), and use the resulting endpoints to estabilish a connection on the socket.
3. If the connection holds, write to the socket the data we previously put in the char buffer named request, for a size of reqLen.
4. We reserve a confidently large buffer where to store the server reply. Since we are writing a echo application, we know that the size of the data we are about to get from the client should be the same of the size we have just sent. This simplify our code to the point that we can do a single read for the complete data block.
5. Use the socket for reading from the server. We use the buffer, and the size of the data we sent, for what said on (4).

At this point we could do whatever we want with the data we read in reply with size repLen.

Server loop

Once we create an acceptor on the ASIO io_context, specifying as endpoint the IP protocol we want (here I used version 4) and the port number, we loop forever, creating a new socket through a call to accept() on the acceptor each time a request comes from a client, passing it to the session() function that is going to run in a new thread.
tcp::acceptor acceptor{ io, tcp::endpoint(tcp::v4(), ECHO_PORT) };

for (;;)
{
 std::thread(session, acceptor.accept()).detach();
}
Notice that each thread created in the loop survives the exiting of the block only because it is detached. This is both handy and frightening. In production code, I would probably push them in a collection instead, so that I could explicitly kill anyone that would stop behave properly.

Server session

Since we don't know the size of the data sent by the client, we should be ready to split it and read it in chunks.
for (;;)
{
 char data[SERVER_MAX_LEN];  // 1

 bs::error_code error;
 size_t len = socket.read_some(ba::buffer(data), error);  // 2
 if (error == ba::error::eof)
 {
  return;
 }
 else if (error)
 {
  throw bs::system_error(error);
 }

 ba::write(socket, ba::buffer(data, len)); // 3
}
1. To better see the effect, I have chosen a ridiculously small size for the server data buffer.
2. The data coming from the client is split in chunks from read_some() on the socket created by the acceptor. When the read is completed, read_some() sets the passed boost system error to eof error. When we detect it, we know that we could terminate the session. Any other error says that the something went wrong.
3. If read_some() set no error, we use the current chunk of data to do what the server should do. In this case, we just echo it back to the client.

Full C++ code on GitHub. The original source is the official Boost ASIO tutorial, divided in two parts, client and server.

Go to the full post

Boost ASIO UDP asynchronous server

Having already seen how to establish an ASIO UDP synchronous connection and how create ASIO TCP asynchronous server, we sort of put them together to write an ASIO UDP asynchronous server.

Main

As client we could happily recycle the one written for the UPD synchronous connection - only, be sure to use the same IP protocol for both. So in the main function we just instantiate an ASIO io_context (also known as io_service), pass it by reference to the ctor of a Server object, and then call run() on it.

In a second time, while running a number of clients to play with the app, you would want to run io also in other threads - be sure to do that between the server creation and the io run on the current thread.

Server class

The server would sit on port 50013 and send to the clients always the same message, concatenated with to a counter. To work it needs an ASIO UPD socket and a UDP endpoint that would identify the current client.
// ...
const int HELLO_PORT = 50'013;
const std::string MSG("Async UDP hello from ASIO ");

class Server
{
private:
 udp::socket socket_;
 udp::endpoint endpoint_;
 uint16_t counter_ = 0;
// ...
public:
 Server(ba::io_context& io) : socket_(io, udp::endpoint(udp::v6(), HELLO_PORT))
 {
  start();
 }
};
The server ctor sets the socket data member up using the reference to the ASIO io context received from the instantiator and a UDP endpoint created on the fly, specifying the required IP protocol (here version 6) and the server port.

Then the server start() private method is called:
void start()
{
 std::array<char, 0> buffer;  // 1
 socket_.async_receive_from(ba::buffer(buffer), endpoint_,
  std::bind(&Server::recvHandler, this, std::placeholders::_1));  // 2
 std::cout << "Server ready" << std::endl;
}
1. The client is expected to send an empty message, so the receive buffer could be zero sized.
2. We call async_receive_from() to receive asynchronously from the client a message in buffer. We'll get the client endpoint information in the data member and, on receive completion, it will call another Server's private method, recvHandler(), passing to it the first parameter that ASIO was expected to send, namely a reference to the boost system error_code describing how the async_receive_from() was completed.

If no error was detected in async_receive_from(), the recvHandler() creates a message and sends it to the client:
void recvHandler(const bs::error_code& error)
{
 if (!error)
 {
  std::shared_ptr<std::string> data(new std::string(MSG + std::to_string(++counter_)));  // 1

  socket_.async_send_to(ba::buffer(*data), endpoint_,
   std::bind(&Server::sendHandler, this, data, std::placeholders::_1, std::placeholders::_2));  // 2
  start();
 }
}
1. This piece of code is a bit involuted. We create on the heap a string containing the data to be send to the client, and we wrap it in a shared pointer. In this way we can keep it alive in a multithreading environment until we need it, that is, the end of the sendHandler() method invoked by async_send_to() at the end of its operation.
2. async_send_to() uses the endpoint set by async_receive_from() to know where sending the data. At the end, sendHandler() is called.

From the ASIO point of view, sendHandler() could be an empty method. The only important thing is that the data created in recvHandler() gets here in the shared smart pointer, so that it can ensure it not to be destroyed when still required.
void sendHandler(std::shared_ptr<std::string> data, const bs::error_code& error, std::size_t size)
{
 if (!error)
 {
  std::cout << size << " byte sent from [" << *data << "]" << std::endl;
 }
}
I pushed the full C++ source code on GitHub. It is based on the Daytime.6 example from the official Boost ASIO tutorial.

Go to the full post

Boost ASIO synchronous UDP client/server

If you know how to write an app that uses an ASIO TCP connection, you are close to know also how to do it on UDP.

Large part of the differences are taken care for us in ASIO, and we just have to use the socket as defined in boost::asio::ip::udp instead of its tcp counterpart.

Server

First thing, we create a udp socket, that requires the ASIO I/O context and a udp endpoint, that needs as parameters the IP protocol to be used - version 4 or 6 - and the port - here I picked up 50013.
namespace ba = boost::asio;
namespace bs = boost::system;
using ba::ip::udp;
// ...

const unsigned short HELLO_PORT = 50'013;
// ...

void server(ba::io_context& io)
{
    udp::socket socket{ io, udp::endpoint{udp::v6(), HELLO_PORT} };
 // ...
Then we repeat how many times we like this block - in my tester I did it just once:
std::array<char, 0> recvData;  // 1
udp::endpoint endpoint;  // 2
bs::error_code error;  // 3
socket.receive_from(ba::buffer(recvData), endpoint, 0, error);  // 4
if (error)
 throw bs::system_error(error);  // 5

std::string message{ "UDP hello from ASIO" };

bs::error_code ec;
socket.send_to(boost::asio::buffer(message), endpoint, 0, ec);  // 6
1. In this buffer we store the message sent from the client. It has no use here, so it could be it even zero sized.
2. The endpoint, that will be used to sent the message to the client, is set by the socket receive_from() method, two lines below.
3. This error code is set by receive_from(), in case of problems.
4. The server wait synchronously here for the client. The three parameters are output ones. When the connection starts, the data coming from the client is put in the first parameter (here, an empty message is expected), the second parameter is filled with the client endpoint, the last one stores the possible error in the operation.
5. If receive_from() fails, throw the boost system error code relative exception, that is a standard runtime_error subclass.
6. Send the message to the client, using the endpoint as set by receive_from() and not specifying any flag. Any possible error code returned is disregarded.

Client

The client function tries this code:
udp::resolver resolver{ io };
udp::endpoint destination = *resolver.resolve(udp::v6(), host, HELLO_PORT_STR).begin();  // 1

udp::socket socket{ io };
socket.open(udp::v6());  // 2

std::array<char, 0> sendData;
socket.send_to(ba::buffer(sendData), destination);  // 3

std::array<char, 128> recvData;  // 4
udp::endpoint sender;
size_t len = socket.receive_from(ba::buffer(recvData), sender);

std::cout.write(recvData.data(), len);  // 5
1. Having instantiated a udp resolver on the previous line, we resolve() on it for the same IP protocol of the server - here I used version six - specifying its host and port. Since resolve() returns at least one endpoint or fails, we could safely access the first one dereferencing its begin() iterator.
2. We need an ASIO upd socket. Having created it on the previous line, passing the current ASIO I/O control, we open it for the required UDP version.
3. We start the communication with the server, sending an empty message - as nothing more is expected from it.
4. We'd better have enough room for the message coming from the server, the actual size of it is returned by the call to receive_from().
5. Let's see what we get, outputting it to the console.

Client and server are two free functions in the same C++ file that I pushed to GitHub. Passing no parameter to its main you run it as server, otherwise is a client.

This example is based on Daytime.4 and Daytime.5 from the official Boost ASIO tutorial.

Go to the full post

Boost ASIO TCP/IP asynchronous server

Having seen how simple is creating a synchronous ASIO TCP/IP server, let's see now how to create an asynchronous one.

Main

The code for this example is divided in two classes, Server and Connection, described below. The example main function instantiates an ASIO io_context, uses it to instantiate a Server object, and then run() the I/O context.
namespace ba = boost::asio;
// ...

ba::io_context io;
Server server(io);
io.run();
Connection

The lower details of our code are here. Connection has as private data member an ASIO TCP/IP socket object on which we are going to write data to the client. Since we want to perform the write asynchronously, we use the ASIO async_write() function. This leads us to ensure that the current connection object is still alive when the write would actually be performed. To do that we'll pass to async_write() an instance of the connection object itself. To avoid a nightmarish memory management, we'll wrap it in a shared_ptr. However, to to that, we need to create a shared smart pointer from this, and to do that we have to enable the feature explicitly, deriving our class from the standard enable_shared_from_this:
class Connection : public std::enable_shared_from_this<Connection>
{
private:
 tcp::socket socket_;
 std::string message_{ "Async hello from ASIO " };
 static int counter_;

// ...
The Connection ctor creates its member socket using the passed ASIO I/O context, and sets the message that we'll send to the client. Notice that the message has to be a Connection data member because we have to guarantee its liveliness until the asynchronous write is performed.
Connection(ba::io_context& io) : socket_(io)
{
 message_ += std::to_string(counter_++);
}
However, the ctor is private. The only way we want to let a Connection user to create an instance of this class is by wrapping it in a smart pointer, for the reason we described above, so, we have this static public method:
static std::shared_ptr<Connection> create(ba::io_context& io)
{
 return std::shared_ptr<Connection>(new Connection(io));
}
The writing is performed by this public method:
void start()
{
 ba::async_write(socket_, ba::buffer(message_),
  std::bind(&Connection::write, shared_from_this(), std::placeholders::_1, std::placeholders::_2));
}
ASIO async_write() requires an AsyncWriteStream, our socket, a ConstBufferSequence, that we create on the fly from our message, and a WriteHandler. This last parameter represent a function in which we can perform any further action after the normal write to socket as been done and before the connection to the client is closed. A free function with two parameters, a constant reference to a Boost error_code and a size_t, is expected, but bind() is here a helpful friend. I use both parameters, but we could easily get rid of them. More importantly, notice the use of shared_from_this(). Even if we don't want do anything in the WriteHandler, it is vital that the connection is kept alive till the end of writing. Keeping the "this" reference active here does the trick.

Server

In the core of our server there is an ASIO TCP/IP acceptor, that is initialized by the ctor, and used by the Server start() function to accept - asynchronously - a connection from a client on a Connection object.
using ba::ip::tcp;
const int HELLO_PORT = 50013;
// ...

class Server
{
private:
 tcp::acceptor acceptor_;
// ...
public:
 Server(ba::io_context& io) : acceptor_(io, tcp::endpoint(tcp::v4(), HELLO_PORT))
 {
  start();
 }
// ...
The ctor calls the Server private method start(), that creates a new connection on the ASIO I/O context received from the main and stored in the acceptor. The socket owned by the connection is used in the async_accept() call on the acceptor, so that the server would wait for a client connection on it.
void start()
{
 ba::io_context& io = acceptor_.get_executor().context();
 std::shared_ptr<Connection> connection = Connection::create(io);
 tcp::socket& socket = connection->socket();
 acceptor_.async_accept(socket, std::bind(&Server::handle, this, connection, std::placeholders::_1));
}
As second parameter, async_accept() expects an ASIO AcceptHandler, a void free function that gets in input a constant reference to a boost system error code, we bind it to call the following Server private method:
void handle(std::shared_ptr<Connection> connection, const bs::error_code& ec)
{
 if (!ec)
 {
  connection->start();
 }
 start();
}
If the handshake with the client worked fine, we use the connection to write - asynchronously - through the socket. Then we call again Server start(), to prepare the server to accept another connection.

This is more or less all. You could see the full C++ code on GitHub.

I have tested this server using the client described in the previous post. I found interesting adding here and there sleeps and printing to console to better observe how the process work. For more fun I'd suggest you to run more clients and let ASIO I/O control to run on a few threads, as shown in the strand example. The code is based on the official ASIO tutorial, Daytime.3 example.

Go to the full post

Boost ASIO synchronous exchange on TCP/IP

Let's build a simple synchronous client-server application based on the TCP/IP protocol using the Boost ASIO ip tcp socket. The server waits a connection on a port, as it comes, it writes a message and then terminate. The client connects to the server, reads its message from the socket, outputs it, and then it terminates too.

Main

I have put both client and server in a single app, if no parameter is passed to the main, the process acts as server, otherwise as a client.
namespace ba = boost::asio;
// ...
const std::string HOSTNAME{ "localhost" };  // 1
// ...

int main(int argc, char* argv[])
{
 ba::io_context io;  // 1

 if (argc > 1)
  client(io, HOSTNAME);
 else
  server(io);
}
1. Used by the client, when connecting to the server. In this simple example both server and client live on the same machine.
2. An io_context (also known as io_service, but that name is now deprecated) is the first requirement for almost anything in ASIO, so I create it as first thing, then is passed by reference to the client or server function, accordingly to the number of parameters passed to the program.

Server

The following code in the server function throws exceptions deriving from std::exception to signal problems. Being this just an example, we just wrap it in a try-catch and output the relative message.
using ba::ip::tcp;
// ...
const int HELLO_PORT = 50013;
// ...

tcp::acceptor acceptor{ io, tcp::endpoint(tcp::v6(), HELLO_PORT) };  // 1

{   // 2
 tcp::socket socket{ io };  // 3
 std::cout << "Server ready" << std::endl;
 acceptor.accept(socket);  // 4

 std::string message{ "Hello from ASIO" };  // 5
 bs::error_code ec; // 6
 ba::write(socket, ba::buffer(message), ec);  // 7
}
1. Instantiate an object of the ASIO TCP/IP acceptor, so that we can listen for connections. We pass to it the ASIO io_context and a TCP endpoint, created specifying the version of the TCP/IP protocol to use (4 or 6) and the port to be used.
2. Here this block is executed just once. Convert it to a for loop for a more common behavior.
3. Each client connection requires a dedicated ASIO TCP/IP socket to be managed. Here it is created and, at the end of the block, exiting the scope, the socket dtor would clean it up.
4. The server sits down, waiting for a client to be served.
5. When the acceptor has accepted a client on the socket, the server wakes up and builds a message.
6. The ASIO write call in the next line requires an error code, to be set in case something goes wrong. We won't even check it here, usually this is not a good idea.
7. The message is converted to an ASIO buffer, so that it could be consumed by the ASIO write() to be written to the socket.

Client

It mirrors the server, with the part of the acceptor taken by a resolver.
tcp::resolver resolver{ io };
tcp::resolver::results_type endpoints = resolver.resolve(host, HELLO_PORT_STR);  // 1

tcp::socket socket{ io };
ba::connect(socket, endpoints);  // 2

for (;;)  // 3
{
 std::array<char, 4> buf;  // 4
 bs::error_code error;
 size_t len = socket.read_some(ba::buffer(buf), error);  // 5

 if (error == ba::error::eof)  // 6
  break; // Connection closed cleanly by peer
 else if (error)
  throw bs::system_error(error);  // 7

 std::cout.write(buf.data(), len);  // 8
 std::cout << '|';  // 9
}
std::cout << std::endl;
1. The resolver is resolved on the required host and port, returning a list of valid endpoints on them.
2. We call the ASIO connect() on the socket created in the line above, specifying the endpoints resolved in (1).
3. Let's loop until the full message is received from the server.
4. I have set the buffer size to a ridiculously low size, just for see it better at work.
5. read_some() data from the socket in the buffer.
6. If we reach end of file, the connection has been correctly completed.
7. Otherwise we interrupt the operation throwing the Boost system error we got.
8. Use the partial data received from the server.
9. This pipe character is put at the end of each chunk of data read only for seeing the effect on the read message.

Full C++ code is on GitHub. It is based on the Daytime.1 and Daytime.2 example from the official Boost ASIO tutorial.

Go to the full post

HackerRank DP: Coin Change

We are given in input a list of integers, and another integer that represents a total we should reach adding up how many elements from the passed list, each of them used 0+ times, with no upper limit. The name of the problem is slightly misleading, since the list could contain any positive integer, and we could not have almost any expectation them, besides being positive.
I'd say that is a version of the Partitions of Integers problem where a special condition is imposed on the integers that we can use.

You can find and solve this problem on HackerRank, section Cracking the Coding Interview.

First thing, I have taken a non completely trivial example and I studied it on paper.

Given in input [2, 5, 3, 6] and 10, it easy to see how the solution is 5:
2 + 2 + 2 + 2 + 2
5 + 5
2 + 3 + 5
3 + 3 + 2 + 2
2 + 2 + 6
The fact that it is marked as DP, should put me on the way of looking for a Dynamic Programming solution. So I create a table, reasoning how to fill it up coherently. Each column represents the totals I could get, ranging from 0 to the passed value. I have a column for each number passed in the input list, plus the topmost one, that represents the "no value" case.

Cell in position (0, 0) is set to 1, since I could get 0 from no value in just one way. The other values in the first row are set to zero, since I can't get that total having nothing to add up. We don't care much what it is in the other cells, since we are about to get the right value by construction.

We'll move in the usual way for a dynamic programming problem requiring a bidimensional table, row by row, skipping the zeroth one, from top to bottom, moving from left to right. We could have filled the first column before starting the procedure, since it is immediate to see that there is only one way to get a total of zero, whichever number I have at hand. Still, in this case it doesn't help to make the code simpler, so I just keep it in the normal table filling part.

For each cell what I have to do is:
  • copy the value from the cell above
  • if "cur", the current value associated to the row, is not greater than the current column index, add the value in the cell on the same row but "cur" times to the left
The first point should be clear. Maybe having a new number at hand would give us a new way to get the total, surely it won't reduce the alternatives we have already calculated.
The second point refers to the contribution of the new element. I guess the picture will help understand it.

The arrow pointing down from (0, 0) to (1, 0) means that since having no values leads to have one way to get a sum of zero, this implies that having no value and 2 still gives at least one way to get a sum of zero.
The other arrow pointing down, from (2, 8) to (3, 8) means that having one way to get 8 from no value and [2, 5] implies we still have at least one way to get it from no value and [2, 5, 3].
The arrow pointing left from (1, 0) to (1, 2) means that since we have a way to get zero having a 2, if we add a 2, we have a way to get 2 as a total.
The arrow pointing left from (3, 5) to (3, 8) means that having two ways of getting 5 using [2, 5, 3] implies that we still have two way of getting 5 + 3 = 8. Added with the one coming from the cell above, it explains why we put 3 in this cell.

Following the algorithm, I have written this python code here below:
def solution_full(denominations, total):  # 1
    table = [[0] * (total + 1) for _ in range(len(denominations) + 1)]  # 2
    table[0][0] = 1

    for i in range(1, len(denominations) + 1):  # 3
        for j in range(total+1):
            table[i][j] += table[i - 1][j]  # 4
            cur = denominations[i-1]
            if cur <= j:
                table[i][j] += table[i][j-cur]  # 5

    return table[-1][-1]  # 6
1. In the example, denominations is [2, 5, 3, 6] and total is 10.
2. Table has total + 1 columns and a row for each denomination, plus one. Its values are all set to zero, but the left-topmost one, set to 1.
3. Loop on all the "real" cells, meaning that I skip just the first row. I move in the usual way. Left to right, from the upper row downward.
4. The current cell value is initialized copying the value from the immediate upper one.
5. If there are enough cell to the left, go get the value of the one found shifting for the value of the current denomination, and add it to the one calculated in the previous step.
6. Return the value in the bottom right cell, that represents our solution.

How to save some memory

Writing the code, I have seen how there is no use in keeping all the rows. The only point where I use the values in the rows above the current one is in (4), and there I use just the value in the cell immediately above the current one. So I refactored the solution in this way:
def solution(denominations, total):
    cache = [0] * (total + 1)  # 1
    cache[0] = 1

    for denomination in denominations:  # 2
        for j in range(denomination, total+1):  # 3
            cache[j] += cache[j-denomination]
    return cache[-1]
1. The memoization here is done just in one row. Initialized as in the previous version.
2. Since I don't care anymore of the row index, I can just work directly on the denominations.
3. Instead of checking explicitly for the column index, I can start the internal loop from the first good position.

I pushed my python script with both solutions and a slim test case to GitHub.

Go to the full post