HackerRank Deque-STL

Given an array of integers, find the max value for each contiguous subarray in it sized k. This HackerRank problem is meant to be solved in C++ and, as its name suggests, using a deque.

For instance, if we are given the array {3, 4, 6, 3, 4} and k is 2, we have to consider four subarrays of size 2:
{3,4} {4,6} {6,3} {3,4} 
And the expected solution is
{4, 6, 6, 4}
An adapter

The original HackerRank problem asks us to write a function that outputs its result to standard output. I didn't much like this requirement. As a TDD developer, I'm used to letting tests drive the code development, and having to check standard output to verify a function's behavior is no fun. So I slightly changed the function signature, asking it to return a vector containing the results, and I used the original function as a simple adapter to the original problem. Something like this:
std::vector<int> maxInSubs(int data[], int n, int k) {
    // ...
}

// ...
void printKMax(int arr[], int n, int k) {
    auto data = maxInSubs(arr, n, k);
    std::copy(data.begin(), data.end(), std::ostream_iterator<int>(std::cout, " "));
    std::cout << '\n';
}
First (naive) attempt

Just do what we are asked to do. For each subarray, find its maximum value and push it to the result vector.
std::vector<int> maxInSubs(int data[], int n, int k) {
    std::vector<int> results;
    for (int i = 0; i < n - k + 1; ++i)
        results.push_back(*std::max_element(data + i, data + i + k));
    return results;
}
Clean and simple and, when k and n are small, not even too slow. However, each call to max_element() scans k elements, for about (n - k + 1) * k comparisons overall. When both n and k are large, we can say bye bye to performance.

Patched naive attempt

We could be tempted to save the algorithm explained above, observing that the slowness is due to the repeated calls to max_element(), one for each window, each of them scanning k elements. We could avoid calling it a substantial number of times by checking the value of the elements exiting and entering the current window, for instance in this way:
std::vector<int> results{ *std::max_element(data, data + k) };  // 1

for (size_t beg = 1, end = k + 1; end <= n; ++beg, ++end)  // 2
    if (data[end - 1] > results[results.size() - 1])  // 3
        results.push_back(data[end - 1]);
    else if (data[beg - 1] < results[results.size() - 1])  // 4
        results.push_back(results[results.size() - 1]);
    else  // 5
        results.push_back(*std::max_element(data + beg, data + end));
1. Initialize the result vector with the max element for the first interval.
2. Keep beg and end as loop variables, describing the current window to check.
3. The new right element of the window is bigger than the max for the previous window. Surely it is the max for this one.
4. The element that has just left the window is smaller than the previous max. Surely the max is still in the window.
5. Otherwise, we'd better check which is the current max.

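A smartly designed input array could beat this simple algorithm. For instance, on a strictly decreasing array the element leaving the window is always the previous max, so we fall back to max_element() at each step. However, on HackerRank they didn't spend too much time on this matter, and this solution is accepted with full marks.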

Solution with a deque

In a more elegant solution, we should minimize the multiple checks we perform on the data elements. Right, but how? Until this moment, I haven't paid attention to the huge hint HackerRank gave us. "Use a deque!", they shout from the name of the problem itself.

The point is that I want to perform a cheap cleanup on each window, so that I could just pick a given element in it, without scanning the entire interval.

Let's use the deque as a buffer to store only the reasonable candidates as max. Since we want to remove from this buffer the candidates that are no longer valid when the window is moved, instead of their values we keep in it their indices in the original data array.

Here is how I initialize it:
std::deque<int> candidates{ 0 };  // 1
for (int i = 1; i < k; ++i)
    pushBack(candidates, data, i);  // 2
1. We could safely say that the first element in data is a good candidate as max for its first subarray.
2. Push back to candidates the "i" index from data, but first ensure the previous candidates are useful.

Since the code in pushBack() is going to be used also afterward, I made a function for it:
void pushBack(std::deque<int>& candidates, int data[], int i) {
    while (!candidates.empty() && data[i] >= data[candidates.back()])  // 1
        candidates.pop_back();
    candidates.push_back(i);
}
1. There is no use in a candidate if the newcomer is bigger than it, or equal to it, so remove it.

Now candidates contains the indices of the elements in the first window that could still be the max for this window or for a following one, with the index of the current max at the front. Possibly just one element, but for sure the deque is not empty.

We are ready for the main loop:
for (int i = k; i < n; ++i) {
    results.push_back(data[candidates.front()]);  // 1

    if (candidates.front() <= i - k)  // 2
        candidates.pop_front();

    pushBack(candidates, data, i);  // 3
}
results.push_back(data[candidates.front()]);  // 4
1. As said above, we know that candidates is not empty and its front is the index of a max value in the current window. Good. Push it to results.
2. Now we prepare for the next window. If the front candidate is out, we remove it.
3. Push back the new element index among the candidates, following the algorithm described above. It would kill the candidates that are not bigger than it, ending up with a deque where the biggest element is surely on front.
4. Remember to push the last candidate in the results, and then the job is done.
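
For comparison, here is a minimal Python sketch of the same idea, based on collections.deque. It is not the C++ code from the repository: the name max_in_subs is made up for this example, and the initialization and the main loop are folded in a single pass over the data.

```python
from collections import deque


def max_in_subs(data, k):
    """Return the maximum of every contiguous window of size k in data."""
    candidates = deque()  # indices into data, their values decrease from front to back
    results = []
    for i, value in enumerate(data):
        # Drop the candidates that the newcomer makes useless.
        while candidates and value >= data[candidates[-1]]:
            candidates.pop()
        candidates.append(i)
        # Drop the front candidate once it has slid out of the window.
        if candidates[0] <= i - k:
            candidates.popleft()
        # The first full window ends at index k - 1.
        if i >= k - 1:
            results.append(data[candidates[0]])
    return results


print(max_in_subs([3, 4, 6, 3, 4], 2))  # [4, 6, 6, 4]
```

Each index is pushed and popped at most once, so the whole scan is linear in the size of the input.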

Does this solution look more convincing to you? Full C++ code and test case on GitHub.


HackerRank Equal

We have a list of integers, and we want to know in how many rounds we could make them all equal. In each round we add to all the items in the list but one the same number chosen among 1, 2, and 5. Pay special attention when you solve this problem on HackerRank. Currently (April 2018) its statement says each increase should be of one, three, or five. However, this is not what the solution is tested for.
It looks like someone decided to change 2 for 3 a few months ago, edited the description and then forgot to modify the testing part. Who knows what is going to happen next. Be ready to be surprised.
Besides, it is inserted in the Algorithms - Dynamic Programming section, but I don't think that is the right approach to follow.

Simplifying the problem

Instead of applying the weird addition process stated in the problem, we could decrease each time a single item. For instance, given in input [2, 2, 3, 7], a solution to the original problem is:
2, 2, 3, 7
+5 +5 +5  = (1)
 7, 7, 8, 7
+1 +1  = +1 (2)
 8, 8, 8, 8
We could solve it in this way instead:
2, 2, 3, 7
 =  =  = -5 (1)
 2, 2, 3, 2
 =  = -1  = (2)
 2, 2, 2, 2
Since we are asked to calculate the number of steps to get to the equal state, in both ways we get the same result.

Base cases

A sort of rough algorithm is already emerging. We get the lowest value in the list, and decrease all the other elements using the available alternatives until we reach it. We start from the biggest value (5) and only when we are forced to, we fall back to the shorter steps.

To better understand how we should manage the last steps, I put the base cases on paper.
If x is at level 1, 2 or 5, we could get to zero at cost 1. But if x is at level 3 or 4, it costs us 2. Notice that if, instead of getting to level zero, we move both the lowest element and the x element to level -2, we get to the result in the same number of moves. For instance, if our input list is [10, 13]
10, 13
    -2 (1)
    -1 (2)
10, 10
10, 13
-2     (1)
    -5 (2)
 8,  8
If the input list is [10, 13, 13], getting down to the bottom from 13 in just one move gives us an advantage, three moves instead of four:
10, 13, 13
    -2      (1)
    -1      (2)
        -2  (3)
        -1  (4)
10, 10, 10
10, 13, 13
-2          (1)
    -5      (2)
        -5  (3)
 8,  8,  8
The algorithm

I think I've got it. Firstly I find the minimum value, M, in the input list. That is a possible target for all the items to reach. However, I have to check two other alternatives, M-1 and M-2.
I loop on all the items in the list. For all the three possible targets, I calculate the difference between it and the current value, count the number of steps to get there, and add it to the total number of steps required for getting to that target.
And then I choose as a result the cheapest target to reach.

The code

Using Python as implementation language, I started with a couple of test cases, then added a few more along the way when I bumped into trouble, and I ended up with this code.
SHIFT = [0, 1, 2]  # 1

def solution(data):
    lowest = min(data)  # 2

    results = [0] * len(SHIFT)  # 3
    for item in data:
        for i in SHIFT:
            gap = item - lowest + i  # 4
            results[i] += gap // 5 + SHIFT[(gap%5 + 1) // 2]  # 5
    return min(results)  # 6
1. Represents the three possible targets, from the minimal value in the list down to the minimal value minus 2.
2. Get the minimal value in the list.
3. Buffer for the results when using the different targets.
4. Distance that has to be split.
5. Add the current number of steps to the current buffer. Firstly I get the number of long moves by dividing gap by 5. Now I am in the base case, as discussed above. Notice that the cost of moving from x to the target is [0, 1, 1, 2, 2] for a gap in [0..4]. If we take gap, apply modulo five, increase it by one and then divide by two, we get the index in SHIFT of the number of steps actually needed. Admittedly an obscure way to get there; if this was production code, I would probably have resisted the temptation to use it.
6. Get the minimal result and return it.
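
To convince myself that the index trick in (5) is sound, a quick check like the following could help. It is just a sketch: steps(), shortcut() and equal() are names made up for this example. The first one counts the moves greedily, biggest move first, the second one uses the formula from the solution.

```python
def steps(gap):
    """Count the moves of size 5, 2, 1 needed to cover gap, biggest first."""
    count = 0
    for move in (5, 2, 1):
        count += gap // move
        gap %= move
    return count


def shortcut(gap):
    """The same count, as computed by the index trick in the solution."""
    return gap // 5 + (gap % 5 + 1) // 2


def equal(data):
    """Cheapest among the three targets: minimum, minimum - 1, minimum - 2."""
    lowest = min(data)
    return min(sum(steps(item - lowest + shift) for item in data) for shift in range(3))


print(all(steps(gap) == shortcut(gap) for gap in range(1000)))  # True
print(equal([2, 2, 3, 7]))   # 2
print(equal([10, 13, 13]))   # 3
```

The two examples are the ones worked out by hand above.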

All tests passed on HackerRank, Python script and test case pushed to GitHub.


HackerRank Roads and Libraries

We are given n nodes and a (possibly huge) number of edges. We are also given the cost of building a library in a city (i.e. a node) and a road (i.e. an edge). Based on these data we want to minimize the cost of creating a forest of graphs from the given nodes and edges, with the requirement that each graph should have a library on one of its nodes. This is a HackerRank problem on Graph Theory algorithms, and I am about to describe my python solution to it.

If a library is not more expensive than a road, the solution is immediate. Build a library on every node.
def solution(n, library, road, edges):
    if road >= library:
        return n * library

    # ...
Otherwise, we want to create a minimum spanning forest, so as to minimize the number of roads, keeping track of the number of edges used and of how many graphs are actually generated. I found it natural to use an adapted form of Kruskal's MST (Minimum Spanning Tree) algorithm, which looks very close to our needs.

Kruskal needs a union-find to work, and this ADT is not commonly available out of the box. So, I first implemented a Python UnionFind class, see the previous post for details.
Then, while working on this problem, I made a slight change to it. My algorithm was simpler and faster if its union() method returned False when nothing was actually done in it, and True only if it joined two existing graphs.

Using such refactored UnionFind.union(), I wrote this piece of code based on Kruskal algorithm:
uf = UnionFind(n)
road_count = 0  # 1

for edge in edges:
    if uf.union(edge[0] - 1, edge[1] - 1):  # 2
        road_count += 1  # 3
        if uf.count == 1:  # 4
            break
1. The union-find object keeps track of the number of disjoint graphs in the forest, but not of the edges. This extra variable does.
2. I need to convert the edges from the 1-based to the 0-based convention before using them. If the two nodes get connected by this call to union(), I have some extra work to do.
3. An edge has been used by union(), keep track of it.
4. If union() connected all the nodes in a single graph, there is no use in going on looping.

Now it is just a matter of adding the cost for roads and libraries to get the result.
return road_count * road + uf.count * library
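
Putting the pieces together, the whole function could look like the following sketch. To keep it self-contained, I assume here a bare-bones union-find inlined in the function (a parent list and a find() helper, both made up for this example) instead of the UnionFind class mentioned above.

```python
def solution(n, library, road, edges):
    if road >= library:
        return n * library

    parent = list(range(n))  # minimal union-find, only for this sketch

    def find(p):
        while p != parent[p]:
            parent[p] = parent[parent[p]]  # path halving keeps the trees flat
            p = parent[p]
        return p

    components = n
    road_count = 0
    for a, b in edges:
        i, j = find(a - 1), find(b - 1)  # from 1-based to 0-based
        if i != j:
            parent[i] = j
            components -= 1
            road_count += 1
            if components == 1:
                break
    return road_count * road + components * library


print(solution(3, 2, 1, [(1, 2), (3, 1), (2, 3)]))  # 4
print(solution(5, 3, 1, [(1, 2), (4, 5)]))          # 11
```

In the first call two roads connect everything and a single library is enough. In the second one we end up with three components, so three libraries and two roads.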

Complete python code for problem, union-find, and test case on GitHub.


Union Find

I have a (possibly huge) bunch of edges describing a forest of graphs, and I'd like to know how many components it actually has. This problem has a simple solution if we shovel the edges in a union-find data structure, and then just ask it for that piece of information.

Besides the number of components, our structure keeps track of the id associated to each node, and the size of each component. Here is how the constructor for my python implementation looks:
class UnionFind:
    def __init__(self, n):  # 1
        self.count = n
        self.id = [i for i in range(n)]  # 2
        self.sz = [1 for i in range(n)]  # 3
1. If the union-find is created for n nodes, initially the number of components, named count, is n itself.
2. All the nodes in a component have the same id, initially the id is simply the index of each node.
3. In the beginning, each node is a component on its own, so the size is initialized to one for each of them.

This simple ADT has two operations, union and find, hence its name. The first gets in input an edge and, if the two nodes are in different components, joins them. The latter returns the id of the passed node.

Besides, the client code would check the count data member to see how many components there are. Pythonically, this exposure of internal status is not perceived as horrifying. A more conservative implementation would mediate this access with a getter.

Moreover, a utility method is provided to check if two nodes are connected. This is not a strict necessity, still it makes the user code more readable:
def connected(self, p, q):
    return self.find(p) == self.find(q)
The meaning of this method is crystal clear. Two nodes are connected only if they have the same id.

In this implementation, we connect two nodes by making them share the same id. So, if we call union() on p and q, we change the id at the root of one of the two components so that it refers to the other one. Given this approach, we implement find() in this way:
def find(self, p):
    while p != self.id[p]:
        p = self.id[p]
    return p
If the passed p has an id different from its default value, we move on to the node it refers to, until we find one that still has its original value. That is the id of the component.

We could implement union() by randomly picking which of the two ids to keep, but we want to keep the operational costs low. So we work it out so as to keep low the height of the tree representing the nodes in a component, leading to O(log n) find() complexity.
def union(self, p, q):
    i = self.find(p)
    j = self.find(q)
    if i != j:  # 1
        self.count -= 1  # 2
        if self.sz[i] < self.sz[j]:  # 3
            self.id[i] = j
            self.sz[j] += self.sz[i]
        else:
            self.id[j] = i
            self.sz[i] += self.sz[j]
1. If the two nodes are already in the same component, there is no need of doing anything more.
2. We are joining two components, so their total number in the union-find decreases.
3. This is the smart trick to keep low the cost of find(). We decide which id to keep as representative for the component according to the sizes of the two merging ones, hanging the smaller one under the bigger one.

As an example, consider this:
uf = UnionFind(10)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
I created a union-find for nodes in [0..9], specifying eight edges among them, from (4, 3) to (6, 1).
As a result, I expect two components and, for instance, to see that node 2 and node 6 are connected, whilst 4 and 5 not.
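
Here is a compact, runnable sketch that puts the snippets above together and checks these expectations. The union() in it swaps i and j instead of using two branches, a small variation of mine that should behave in the same way.

```python
class UnionFind:
    def __init__(self, n):
        self.count = n
        self.id = list(range(n))
        self.sz = [1] * n

    def find(self, p):
        while p != self.id[p]:
            p = self.id[p]
        return p

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def union(self, p, q):
        i, j = self.find(p), self.find(q)
        if i == j:
            return
        self.count -= 1
        if self.sz[i] < self.sz[j]:  # hang the smaller tree under the bigger one
            i, j = j, i
        self.id[j] = i
        self.sz[i] += self.sz[j]


uf = UnionFind(10)
for p, q in [(4, 3), (3, 8), (6, 5), (9, 4), (2, 1), (5, 0), (7, 2), (6, 1)]:
    uf.union(p, q)

print(uf.count)            # 2
print(uf.connected(2, 6))  # True
print(uf.connected(4, 5))  # False
```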

I based my python code on the Java implementation provided by Robert Sedgewick and Kevin Wayne in their excellent Algorithms, 4th Edition, following the weighted quick-union variant. Check it out also for a better algorithm description.

I pushed to GitHub full code for the python class, and a test case for the example described above.


HackerRank Climbing the Leaderboard

We are given two lists of integers. The first one is sorted in decreasing order, possibly with duplicates, and represents the scores of the topmost players in a leaderboard. The second one is monotonically increasing and contains the score history of Alice, a player who rapidly climbed the board.
Following the dense ranking convention, we want to get back a list containing the history of rank positions for Alice.
This is a HackerRank Algorithm Implementation problem, and I am going to show you how I solved it, using Python as implementation language.

I noticed that the first list, scores, is already sorted. We just have to get rid of duplicates to have a matching between each position and the score Alice has to get to achieve that ranking.

Bisect solution

I just have to do the matching. The first idea that jumped to my mind was performing a binary search on scores. It helps that Python provides a well-known library for the job, bisect. There's just a tiny issue: bisect expects the underlying list to be sorted in natural order, so we need to reverse our scores.

It looks promising, let's implement it.

A pythonic way to get our ranking would be this:
ranking = sorted(list(set(scores)))
I get the original list, convert it to a set to get rid of duplicates, then back to a list, so that I can sort it in natural order. Nice, but in this problem we are kind of interested in performance, since we could have up to 20 thousand items in both lists. So we want to take advantage of the fact that the list is already sorted.

So, I ended up using this rather uncool code:
ranking = [scores[-1]]
for i in range(len(scores)-2, -1, -1):
    if scores[i] > ranking[-1]:
        ranking.append(scores[i])
I initialize the ranking list with the last item in scores, then I range on all the other indices in the list from right to left. If the current item is bigger than the latest one pushed in ranking, I push it too.

Now I can just rely on the bisect() function in the bisect Python module, which finds the position where the current item should be inserted in the list. With a caveat: I have reversed the order, so I have to adjust the bisect() result to get the rank I'm looking for:
from bisect import bisect

results = []
last = len(ranking) + 1
for score in alice:
    results.append(last - bisect(ranking, score))
This code passes all the tests, job done.

However. Do I really need to pay for the bisect() search for each element of alice?

Double scan solution

Well, actually, we don't. Since we know that both lists are sorted, we can also use the ordering in alice to move linearly in ranking.

Since we are not using bisect anymore, we don't need to reverse the sorting order in ranking, and the duplicate cleanup gets a bit simpler:
ranking = [scores[0]]
for score in scores[1:]:
    if score < ranking[-1]:
        ranking.append(score)

Now we compare each item in alice against the items in ranking moving linearly from bottom to head:
results = []
for score in alice:
    while ranking and score >= ranking[-1]:
        ranking.pop()
    results.append(len(ranking) + 1)
We don't have to be alarmed by the nested loops, they don't have a multiplicative effect on the time complexity. Each item in ranking is popped at most once while we move forward on alice, so the result is an O(M + N) time complexity.
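
To compare the two approaches side by side, here is a sketch that wraps them in two functions and runs them on the same input. The names ranks_bisect() and ranks_scan() and the sample data are made up for this example.

```python
from bisect import bisect


def ranks_bisect(scores, alice):
    ranking = [scores[-1]]  # unique scores, in increasing order
    for i in range(len(scores) - 2, -1, -1):
        if scores[i] > ranking[-1]:
            ranking.append(scores[i])
    last = len(ranking) + 1
    return [last - bisect(ranking, score) for score in alice]


def ranks_scan(scores, alice):
    ranking = [scores[0]]  # unique scores, in decreasing order
    for score in scores[1:]:
        if score < ranking[-1]:
            ranking.append(score)
    results = []
    for score in alice:
        # Alice overtakes every player whose score she matches or beats.
        while ranking and score >= ranking[-1]:
            ranking.pop()
        results.append(len(ranking) + 1)
    return results


scores = [100, 100, 50, 40, 40, 20, 10]
alice = [5, 25, 50, 120]
print(ranks_bisect(scores, alice))  # [6, 4, 2, 1]
print(ranks_scan(scores, alice))    # [6, 4, 2, 1]
```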

Is this a better solution than the first one? Well, it depends. We should know more on the expected input. However, for large and close values of N and M, it looks so.

I pushed the python script for both solutions and a test case to GitHub.


HackerRank Divisible Sum Pairs

Given a list of integers, we want to know how many couples of them, when summed, are divisible by a given integer k.

So, for instance, given [1, 2, 3, 4, 5, 6], we have five couples of items with sum divisible by 3:
(1, 2), (1, 5), (2, 4), (3, 6), (4, 5)
This is a HackerRank algorithm problem, implementation section.

Naive solution

Test all the couples, avoiding duplicates. If we check (a1, a2), we don't have to check (a2, a1).

The code I have written for this solution should be of immediate comprehension, even if you are not that much into Python:
result = 0
for i in range(len(values) - 1):  # 1
    for j in range(i+1, len(values)):  # 2
        if (values[i] + values[j]) % k == 0:  # 3
            result += 1
1. Loops on all the indices in the list but the last one.
2. Loops from the next index to the current "i" to the last one.
3. Check the current couple, and increase the result if compliant.

Even if this is what HackerRank was expecting from us (the problem is marked as easy), we can do better than this, considering its disappointing O(N^2) time complexity.

Linear solution

The problem could be restated as counting the couples that, added up, are equal to zero modulo k. Following this insight, let's partition the items according to their own modulo.
remainders = [0] * k
for i in range(len(values)):
    remainders[values[i] % k] += 1
Each element in the "remainders" list represents the number of items in the original list having as modulo the index of the element.

For the example shown above we'll get these remainders:
[2, 2, 2]
Now, if we add an element having remainder x to an element with remainder k - x, we get a number equal to zero modulo k. We want all the possible pairings of the x elements with the k - x ones, that is the Cartesian product of the two sets, whose size is the product of their sizes.

There are a couple of exceptions to this rule. The elements having modulo zero have to be added among themselves, and the same happens to the elements having as modulo half k, if k is even. The number of couples we can pick from a set of N elements could be expressed as N * (N-1) / 2.

Putting all together we have this code:
result = remainders[0] * (remainders[0] - 1) // 2  # 1

pivot = k // 2  # 2
if k % 2:
    pivot += 1  # 3
else:
    result += remainders[pivot] * (remainders[pivot] - 1) // 2  # 4

for i in range(1, pivot):  # 5
    result += remainders[i] * remainders[k-i]
1. Initialize "result" using the above described formula for the modulo zero items.
2. Let's calculate the central element in the list, where we have to stop looping to sum up.
3. If k is odd, we won't have a lonely central element, and the pivot should be moved a step to the right.
4. When k is even, the elements having half-k modulo are managed as the zero modulo ones.
5. Normal cases.

After the for-loop, result contains the answer to the original question.
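
A simple way to gain confidence in the remainder based approach is checking it against the naive one. In the following sketch both are wrapped in functions, pairs_naive() and pairs_by_remainder(), names made up for this example.

```python
def pairs_naive(values, k):
    result = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            if (values[i] + values[j]) % k == 0:
                result += 1
    return result


def pairs_by_remainder(values, k):
    remainders = [0] * k
    for value in values:
        remainders[value % k] += 1

    # Items with remainder zero pair up among themselves.
    result = remainders[0] * (remainders[0] - 1) // 2

    pivot = k // 2
    if k % 2:
        pivot += 1
    else:
        # For even k, so do the items with remainder k / 2.
        result += remainders[pivot] * (remainders[pivot] - 1) // 2

    for i in range(1, pivot):
        result += remainders[i] * remainders[k - i]
    return result


values = [1, 2, 3, 4, 5, 6]
print(pairs_naive(values, 3))         # 5
print(pairs_by_remainder(values, 3))  # 5
```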

I pushed a python script with both solutions, naive and remainder based, to GitHub, along with a few tests.


Boost ASIO echo UDP asynchronous server

A change request for the echo UDP client-server app discussed before. We want to keep the client as is, but we need the server to be asynchronous.

Instead of using the synchronous calls receive_from() and send_to() on a UDP socket, we are going to use their asynchronous versions async_receive_from() and async_send_to().

The asynchronicity leads naturally to implementing a class, having a socket as its private data member, so that we can make our asynchronous calls on it.
const int MAX_LEN = 1'024;
const uint16_t ECHO_PORT = 50'015;

class Server {
    udp::socket socket_;  // 1
    udp::endpoint endpoint_;  // 2
    char data_[MAX_LEN];  // 3

    // ...
};
1. Our ASIO UDP socket.
2. The endpoint we use to keep track of the client currently connected to the server.
3. Data buffer, used to keep track of the message received from the client.

The constructor gets the app ASIO I/O context by reference from the caller and uses it to instantiate its member socket. Then it calls its private method receive() to start its endless job.
Server(ba::io_context& io) : socket_(io, udp::endpoint(udp::v4(), ECHO_PORT)) {  // 1
    receive();
}

void receive() {
    socket_.async_receive_from(ba::buffer(data_), endpoint_, [this](bs::error_code ec, std::size_t len) {  // 2
        if (!ec && len > 0)  // 3
            send(len);
        else
            receive();  // 4
    });
}
1. The socket requires also an endpoint describing the associated protocol and port. We create it on the fly.
2. Call asynchronously receive_from on the socket. ASIO would put in the data buffer what the client sends and store its network information in the passed endpoint. When the socket receive is completed, ASIO would call the handler passed as third parameter, here a lambda that captures "this" and honors the expected parameters.
3. If the receiving worked fine - no error_code reported - and the message is not empty, we'll call our Server send() method, to echo the message.
4. Otherwise - error or empty message - we don't have to send anything back, so we call the enclosing receive() method, to serve a new client.

When a "good" message is received from a client, our server sends it back to it as is:
void send(std::size_t len) {
    socket_.async_send_to(ba::buffer(data_, len), endpoint_, std::bind(&Server::receive, this));
}
The socket gets the job of asynchronously sending the data stored in the Server member variable, for the length passed in as parameter, to the endpoint saved as Server member variable. When the data transfer is completed, ASIO would call the handler passed as third argument. Here we don't want to do anything in case of error, not even logging something, so we can simply bind to the handler a call to "this" receive(), ignoring the error code and the length of the transferred data.

I pushed the complete C++ source file to GitHub. The code is based on the UDP asynchronous echo example in the official Boost ASIO documentation.


Boost ASIO echo TCP asynchronous server

Let's refactor the echo TCP server to achieve asynchrony. It's going to be a lot of fun. If you feel that it is too much fun, you could maybe have first a look at the similar but a bit simpler asynchronous server discussed in a previous post.


This server works with the same clients as seen for the synchronous server, so here we deal just with the server. All the work is delegated to the Server class, whose constructor gets a reference to the application ASIO I/O context.
namespace ba = boost::asio;
// ...

Server server(io);

The server ctor initializes its own acceptor on the ASIO I/O context, on an endpoint specifying the TCP IP protocol and the chosen port, then it calls its private member method accept():
using ba::ip::tcp;
// ...

const uint16_t ECHO_PORT = 50'014;
// ...

class Server {
    tcp::acceptor acceptor_;

    void accept() {
        acceptor_.async_accept([this](bs::error_code ec, tcp::socket socket) {  // 1
            if (!ec)
                std::make_shared<Session>(std::move(socket))->read();  // 2
            accept();  // 3
        });
    }
public:
    Server(ba::io_context& io) : acceptor_(io, tcp::endpoint(tcp::v4(), ECHO_PORT)) {
        accept();
    }
};
1. The handler passed to async_accept() is a lambda that gets as parameters an error code, which lets us know if the connection from the client has been accepted correctly, and the socket eventually created to support the connection itself.
2. A beautiful and perplexing line. We create a shared_ptr smart pointer to a Session created from an rvalue reference to the socket received as parameter, and call its read() method. However, this anonymous variable exits its definition block on the next line, so its life is in danger - better see what is going on in read(). Actually, we are in danger, too. If something weird happens in this session object, we don't have any way to do anything about it.
3. As soon as a Session object is created, a new call to accept() is issued, and so the server puts itself in wait for a new client connection.


As we have seen just above, we should expect some clever trick from Session, especially in its read() method. Thinking better about it, it is not a big surprise seeing that its superclass is enable_shared_from_this:
class Session : public std::enable_shared_from_this<Session> {
    tcp::socket socket_;
    char data_[MAX_LEN];

// ...
public:
    Session(tcp::socket socket) : socket_(std::move(socket)) {}  // 1

    void read() {  // 2
        std::shared_ptr<Session> self{ shared_from_this() };  // 3
        socket_.async_read_some(ba::buffer(data_), [self](bs::error_code ec, std::size_t len) {  // 4
            if (!ec)
                self->write(len);
        });
    }
    // ...
};
1. The ctor gets the socket that, as we have seen, was created by the acceptor and moved in. In its turn, the constructor moves it to its data member.
2. The apparently short lived Session object created by the handler of async_accept() calls this method.
3. A new shared_ptr is created from this! Actually, it shares ownership with the shared_ptr that we have seen in the calling handler, just its use counter is increased. However, our object is still not safe, we need to keep it alive until the complete read-write cycle between client and server is completed.
4. We read asynchronously some bytes from the client. To better see the effect, I have set the size of the data buffer to a silly low value. But the more interesting part here is the handler passed to async_read_some(). Notice that in the capture clause of the lambda we pass self, the shared pointer from this. So our object is safe till the end of the read.

So far so good. Just remember to ensure the object doesn't get invalidated during the writing process:
void write(std::size_t len) {
    std::shared_ptr<Session> self{ shared_from_this() };
    ba::async_write(socket_, ba::buffer(data_, len), [self](bs::error_code ec, std::size_t) {
        if (!ec)
            self->read();
    });
}
Same as in read(), we ensure "this" stays alive creating a shared pointer from it, and passing it to the async_write() handler.

As required, as the read-write terminates, "this" has no more live references. Bye, bye, session.

I have pushed my C++ source file to GitHub. And here is the link to the original example from Boost ASIO.


Boost ASIO echo UDP synchronous client-server

Close to the previous post. The main difference is that there we have seen a TCP-based data exchange, while here we see a UDP echo.


This server is simpler than the previous one. Just one connection is served at a time.
udp::socket socket(io, udp::endpoint(udp::v4(), ECHO_PORT));  // 1

for (;;) {  // 2
    char data[MAX_LEN];
    udp::endpoint client;
    size_t len = socket.receive_from(ba::buffer(data), client);  // 3

    // ...
    socket.send_to(ba::buffer(data, len), client);  // 4
}
1. Create an ASIO UDP socket on the app io_context, on a UDP endpoint created on the fly where the UDP IP protocol and the port to be used are specified.
2. Forever loop to serve, in strict sequential order, all the requests coming from clients.
3. ASIO blocks here, waiting for the socket to receive a datagram from a client. Make sure that the data buffer is big enough.
4. Since this is an echo server, nothing exciting happens between receiving and sending. Here we send the data, as received, to the endpoint as set by receive_from().

The client sends its request to the server and waits for the echo:
char request[MAX_LEN];
// ...

udp::socket socket{ io, udp::endpoint(udp::v4(), 0) };  // 1
udp::resolver resolver{ io };
udp::endpoint destination = *resolver.resolve(udp::v4(), host, ECHO_PORT_STR).begin();  // 2
socket.send_to(ba::buffer(request, std::strlen(request)), destination);  // 3

char reply[MAX_LEN];
udp::endpoint sender;
size_t len = socket.receive_from(ba::buffer(reply), sender);  // 4
// ...
1. Create a UDP socket on the ASIO I/O context. Notice that the UDP endpoint passed specifies the IP protocol but, its port being zero, it leaves the choice of the actual port to the system.
2. The destination endpoint, which refers to the server, is generated by the resolver created on the line above, which resolves the specified host and port for the given UDP IP protocol. Then the first result is taken (by getting the begin iterator and then dereferencing it). In case of any trouble we have the guarantee that an exception is thrown by resolve().
3. Send the data through buffer to the socket, that mediates the connection to the server.
4. Once send_to() has ended its job (notice that it is a blocking function), we get the reply from the server calling receive_from(). The socket knows where to go and get the data, and will fill the passed endpoint (sender) with this information.

I pushed the full C++ code - both client and server in the same source file - to GitHub. I based them on blocking_udp_echo_server.cpp and blocking_udp_echo_client.cpp from the official Boost ASIO Tutorial.


Boost ASIO echo TCP synchronous client-server

I think this echo client-server application is a good introduction to ASIO. The server creates a new TCP socket each time it receives a request from a client, and runs it in a new thread, where the read-write activity is accomplished in a synchronous way. The client sends some data to the server, gets it back, and then terminates.
The structure is simple, still a few interesting points are touched.


Given io, the app ASIO io_context, and the server hostname as a string, the client tries this block and, in case of failure, just outputs the exception to the console.
namespace ba = boost::asio;
using ba::ip::tcp;
// ...

tcp::socket socket{ io };  // 1
tcp::resolver resolver{ io };
ba::connect(socket, resolver.resolve(host, ECHO_PORT_STR));  // 2

// ...
ba::write(socket, ba::buffer(request, reqLen));  // 3

char reply[CLIENT_MAX_LEN];  // 4
size_t repLen = ba::read(socket, ba::buffer(reply, reqLen));  // 5
// ...
1. Create an ASIO TCP socket and a resolver on the current io_context.
2. Then call resolve() on the resolver for the host and port of the echo server (in my case, localhost:50014), and use the resulting endpoints to establish a connection on the socket.
3. If the connection holds, write to the socket the data we previously put in the char buffer named request, for a size of reqLen.
4. We reserve a confidently large buffer to store the server reply. Since we are writing an echo application, we know that the size of the data we are about to get from the server should be the same as the size we have just sent. This simplifies our code to the point that we can do a single read for the complete data block.
5. Use the socket for reading from the server. We use the buffer, and the size of the data we sent, for the reason given in (4).

At this point we could do whatever we want with the data we read in reply with size repLen.
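To make clearer what "a single read for the complete data block" implies, here is a minimal sketch of a loop that keeps reading until the requested size is reached. It is not ASIO code: ReadSome and readExactly() are hypothetical names I made up for the illustration, and the partial read is simulated by whatever callable is passed in.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical signature for a partial read: it stores up to `capacity` bytes
// and returns how many it stored, 0 meaning that no more data will come.
using ReadSome = std::function<std::size_t(char*, std::size_t)>;

// Keeps calling readSome until exactly `size` bytes are stored in `buffer`,
// or the source runs dry. Returns the number of bytes actually read.
std::size_t readExactly(const ReadSome& readSome, char* buffer, std::size_t size)
{
    assert(buffer != nullptr);
    std::size_t total = 0;
    while (total < size)
    {
        std::size_t len = readSome(buffer + total, size - total);
        if (len == 0)
            break;  // the source closed before the whole block arrived
        total += len;
    }
    return total;
}
```

The caller only sees one call, whatever the number of partial reads needed to fill the block.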

Server loop

Once we create an acceptor on the ASIO io_context, specifying as endpoint the IP protocol we want (here I used version 4) and the port number, we loop forever, creating a new socket through a call to accept() on the acceptor each time a request comes from a client, passing it to the session() function that is going to run in a new thread.
tcp::acceptor acceptor{ io, tcp::endpoint(tcp::v4(), ECHO_PORT) };

for (;;)
 std::thread(session, acceptor.accept()).detach();
Notice that each thread created in the loop survives the exiting of the block only because it is detached. This is both handy and frightening. In production code, I would probably push them in a collection instead, so that I could explicitly kill any of them that stops behaving properly.
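A minimal sketch of the collection-based alternative, using only the standard library. Here fakeSession() and runSessions() are hypothetical stand-ins for session() and for the acceptor loop, and the threads are joined, since standard C++ gives no way to kill a thread from outside.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for session(): a real one would receive the accepted socket.
void fakeSession(int id, std::atomic<int>& served)
{
    assert(id >= 0);
    ++served;
}

// Runs `count` sessions, each one in its own thread, and waits for all of them.
int runSessions(int count)
{
    std::atomic<int> served{ 0 };
    std::vector<std::thread> workers;  // owning the threads lets us join them later

    for (int i = 0; i < count; ++i)
        workers.emplace_back(fakeSession, i, std::ref(served));

    for (auto& worker : workers)
        worker.join();  // no thread outlives this function

    return served.load();
}
```

With a real acceptor the loop never ends by itself, so the join would have to be triggered by some shutdown condition that I have left out here.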

Server session

Since we don't know the size of the data sent by the client, we should be ready to split it and read it in chunks.
for (;;)
{
    char data[SERVER_MAX_LEN];  // 1

    bs::error_code error;
    size_t len = socket.read_some(ba::buffer(data), error);  // 2
    if (error == ba::error::eof)
        break;  // nothing more to read, the session is over
    else if (error)
        throw bs::system_error(error);

    ba::write(socket, ba::buffer(data, len)); // 3
}
1. To better see the effect, I have chosen a ridiculously small size for the server data buffer.
2. The data coming from the client is split in chunks by read_some() on the socket created by the acceptor. When the client closes the connection and there is nothing more to read, read_some() sets the passed boost system error to eof. When we detect it, we know that we can terminate the session. Any other error says that something went wrong.
3. If read_some() set no error, we use the current chunk of data to do what the server should do. In this case, we just echo it back to the client.
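The steps above can be tried out without a network with the following sketch. It is only a simulation under my own assumptions: FakeSocket, readSome() and echoSession() are hypothetical names, not ASIO ones, and the end of data is signalled by a plain bool instead of a boost system error code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical in-memory stand-in for a socket: readSome() hands out the
// pending input a chunk at a time and reports the end of data through `eof`.
struct FakeSocket
{
    std::string pending;  // what the "client" has sent
    std::string echoed;   // what the "server" has written back

    std::size_t readSome(char* buffer, std::size_t capacity, bool& eof)
    {
        assert(capacity > 0);
        std::size_t len = std::min(capacity, pending.size());
        std::copy(pending.begin(), pending.begin() + len, buffer);
        pending.erase(0, len);
        eof = (len == 0);  // nothing left: the peer is done
        return len;
    }

    void write(const char* buffer, std::size_t len) { echoed.append(buffer, len); }
};

// Echoes everything pending on the socket; returns the number of chunks read.
int echoSession(FakeSocket& socket)
{
    int chunks = 0;
    for (;;)
    {
        char data[4];  // deliberately tiny, to force several reads
        bool eof = false;
        std::size_t len = socket.readSome(data, sizeof(data), eof);
        if (eof)
            break;  // the session ends cleanly
        socket.write(data, len);
        ++chunks;
    }
    return chunks;
}
```

An eleven character message with a four byte buffer is echoed in three chunks, and a fourth read is needed just to find out that the data is over.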

Full C++ code on GitHub. The original source is the official Boost ASIO tutorial, divided in two parts, client and server.


Boost ASIO UDP asynchronous server

Having already seen how to establish an ASIO UDP synchronous connection and how to create an ASIO TCP asynchronous server, we sort of put them together to write an ASIO UDP asynchronous server.


As client we could happily recycle the one written for the UDP synchronous connection; only, be sure to use the same IP protocol for both. So in the main function we just instantiate an ASIO io_context (also known as io_service), pass it by reference to the ctor of a Server object, and then call run() on it.

Later, while running a number of clients to play with the app, you may want to run io in other threads too. Be sure to do that between the server creation and the io run on the current thread.

Server class

The server sits on port 50013 and always sends the clients the same message, concatenated with a counter. To work, it needs an ASIO UDP socket and a UDP endpoint that identifies the current client.
// ...
const int HELLO_PORT = 50'013;
const std::string MSG("Async UDP hello from ASIO ");

class Server
 udp::socket socket_;
 udp::endpoint endpoint_;
 uint16_t counter_ = 0;
// ...
 Server(ba::io_context& io) : socket_(io, udp::endpoint(udp::v6(), HELLO_PORT))
The server ctor sets the socket data member up using the reference to the ASIO io context received from the instantiator and a UDP endpoint created on the fly, specifying the required IP protocol (here version 6) and the server port.

Then the server start() private method is called:
void start()
{
    std::array<char, 0> buffer;  // 1
    socket_.async_receive_from(ba::buffer(buffer), endpoint_,
        std::bind(&Server::recvHandler, this, std::placeholders::_1));  // 2
    std::cout << "Server ready" << std::endl;
}
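1. The client is expected to send an empty message, so the receive buffer could be zero sized. A buffer that actually receives data should instead outlive this call, for instance as a data member, since ASIO fills it after start() has returned.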
2. We call async_receive_from() to asynchronously receive a message from the client in buffer. The client endpoint information is stored in the endpoint_ data member and, on receive completion, ASIO calls another Server private method, recvHandler(), passing to it the first parameter that ASIO is expected to send, namely a reference to the boost system error_code describing how async_receive_from() completed.
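The binding on (2) is the part I find less obvious, so here is a minimal sketch of it in isolation. It uses std::error_code in place of the boost one, and MiniServer, makeHandler() and fakeAsyncReceive() are hypothetical names: the last one just calls the handler at once, where ASIO would do it on completion.

```cpp
#include <cassert>
#include <functional>
#include <system_error>

using Handler = std::function<void(const std::error_code&)>;

// Hypothetical miniature of the Server: only the handler binding is shown.
class MiniServer
{
    int received_ = 0;

    void recvHandler(const std::error_code& error)
    {
        if (!error)
            ++received_;
    }

public:
    // Builds the callable that an asynchronous operation would invoke on completion.
    Handler makeHandler()
    {
        // `this` is bound now, _1 is filled with the error code at call time
        return std::bind(&MiniServer::recvHandler, this, std::placeholders::_1);
    }

    int received() const { return received_; }
};

// Stand-in for the asynchronous receive: it completes immediately with `result`.
void fakeAsyncReceive(const Handler& handler, const std::error_code& result)
{
    assert(handler != nullptr);  // an empty std::function would throw when called
    handler(result);
}
```

Binding this lets a private method act as the callback, as long as the object is still alive when the handler is called.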

If no error was detected in async_receive_from(), the recvHandler() creates a message and sends it to the client:
void recvHandler(const bs::error_code& error)
{
    if (!error)
    {
        std::shared_ptr<std::string> data(new std::string(MSG + std::to_string(++counter_)));  // 1

        socket_.async_send_to(ba::buffer(*data), endpoint_,
            std::bind(&Server::sendHandler, this, data, std::placeholders::_1, std::placeholders::_2));  // 2
    }
}
1. This piece of code is a bit convoluted. We create on the heap a string containing the data to be sent to the client, and we wrap it in a shared pointer. In this way we can keep it alive in a multithreading environment for as long as we need it, that is, until the end of the sendHandler() method invoked by async_send_to() at the end of its operation.
2. async_send_to() uses the endpoint set by async_receive_from() to know where to send the data. At the end, sendHandler() is called.

From the ASIO point of view, sendHandler() could be an empty method. The only important thing is that the data created in recvHandler() gets here in the shared smart pointer, which ensures that it is not destroyed while still required.
void sendHandler(std::shared_ptr<std::string> data, const bs::error_code& error, std::size_t size)
{
    if (!error)
        std::cout << size << " byte sent from [" << *data << "]" << std::endl;
}
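To see why the shared pointer is enough to keep the message alive, consider this minimal sketch based only on the standard library. The vector of handlers is a hypothetical stand-in for the ASIO io_context, and scheduleSend() plays the role of recvHandler(): when it returns, its local pointer is gone, but the copy stored by std::bind still owns the string.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for pending ASIO handlers: queued now, called later.
using Handler = std::function<std::size_t()>;

// Plays the role of sendHandler(): it only needs the data to still be there.
std::size_t sendHandler(std::shared_ptr<std::string> data)
{
    assert(data != nullptr);
    return data->size();
}

// Plays the role of recvHandler(): creates the message and schedules the send.
void scheduleSend(std::vector<Handler>& queue, const std::string& text)
{
    auto data = std::make_shared<std::string>(text);
    queue.push_back(std::bind(sendHandler, data));  // the bind stores a copy of the pointer
}   // the local `data` dies here, the string does not
```

The string is released only when the queued handler itself is destroyed, which is the behavior we rely on in the server.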
I pushed the full C++ source code on GitHub. It is based on the Daytime.6 example from the official Boost ASIO tutorial.


Boost ASIO synchronous UDP client/server

If you know how to write an app that uses an ASIO TCP connection, you are close to knowing how to do it on UDP too.

Most of the differences are taken care of for us by ASIO, and we just have to use the socket as defined in boost::asio::ip::udp instead of its tcp counterpart.


First thing, we create a udp socket, which requires the ASIO I/O context and a udp endpoint, which in turn needs as parameters the IP protocol to be used (version 4 or 6) and the port; here I picked 50013.
namespace ba = boost::asio;
namespace bs = boost::system;
using ba::ip::udp;
// ...

const unsigned short HELLO_PORT = 50'013;
// ...

void server(ba::io_context& io)
    udp::socket socket{ io, udp::endpoint{udp::v6(), HELLO_PORT} };
 // ...
Then we repeat this block as many times as we like. In my tester I did it just once:
std::array<char, 0> recvData;  // 1
udp::endpoint endpoint;  // 2
bs::error_code error;  // 3
socket.receive_from(ba::buffer(recvData), endpoint, 0, error);  // 4
if (error)
 throw bs::system_error(error);  // 5

std::string message{ "UDP hello from ASIO" };

bs::error_code ec;
socket.send_to(boost::asio::buffer(message), endpoint, 0, ec);  // 6
1. In this buffer we store the message sent from the client. It has no use here, so it could even be zero sized.
2. The endpoint, which will be used to send the message to the client, is set by the socket receive_from() method, two lines below.
3. This error code is set by receive_from(), in case of problems.
4. The server waits synchronously here for the client. Three of the parameters are output ones. When a message arrives, the data coming from the client is put in the first parameter (here, an empty message is expected), the second parameter is filled with the client endpoint, and the last one stores the possible error in the operation. The third parameter, the message flags, is left to zero.
5. If receive_from() fails, throw the exception wrapping the boost system error code; it is a subclass of the standard runtime_error.
6. Send the message to the client, using the endpoint as set by receive_from() and not specifying any flag. Any possible error code returned is disregarded.
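The error handling on (3) to (5) follows a pattern that can be shown with the standard library twins of the boost system classes. This is just a sketch: fakeReceive() and receiveOrThrow() are hypothetical functions, and the failure is triggered by a flag instead of a real network problem.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical operation that reports a failure through an output error code,
// in the style of the receive_from() overload used above.
void fakeReceive(bool fail, std::error_code& error)
{
    error = fail ? std::make_error_code(std::errc::connection_refused) : std::error_code{};
}

// Converts the error code into an exception, as the server does on (5).
std::string receiveOrThrow(bool fail)
{
    std::error_code error;
    fakeReceive(fail, error);
    if (error)
        throw std::system_error(error);  // std::system_error is a std::runtime_error

    assert(!error);  // from here on the operation is known to have succeeded
    return "received";
}
```

A caller can then catch std::runtime_error, or even std::exception, without knowing anything about error codes.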


The client function tries this code:
udp::resolver resolver{ io };
udp::endpoint destination = *resolver.resolve(udp::v6(), host, HELLO_PORT_STR).begin();  // 1

udp::socket socket{ io };
socket.open(udp::v6());  // 2

std::array<char, 0> sendData;
socket.send_to(ba::buffer(sendData), destination);  // 3

std::array<char, 128> recvData;  // 4
udp::endpoint sender;
size_t len = socket.receive_from(ba::buffer(recvData), sender);

std::cout.write(recvData.data(), len);  // 5
1. Having instantiated a udp resolver on the previous line, we call resolve() on it for the same IP protocol as the server (here I used version six), specifying its host and port. Since resolve() either returns at least one endpoint or fails, we can safely access the first one by dereferencing its begin() iterator.
2. We need an ASIO UDP socket. Having created it on the previous line, passing the current ASIO I/O context, we open it for the required IP version.
3. We start the communication with the server, sending an empty message, as nothing more is expected from it.
4. We'd better have enough room for the message coming from the server. Its actual size is returned by the call to receive_from().
5. Let's see what we get, outputting it to the console.
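Points (4) and (5) matter because the received bytes are not '\0' terminated, so only the first len characters of the buffer are meaningful. A minimal sketch of the idea, where fakeReceive() and show() are hypothetical helpers and a string stream takes the place of the console:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical stand-in for receive_from(): copies a message into the buffer,
// without any terminating '\0', and returns how many bytes were stored.
std::size_t fakeReceive(std::array<char, 128>& buffer, const std::string& message)
{
    assert(message.size() <= buffer.size());
    std::memcpy(buffer.data(), message.data(), message.size());
    return message.size();
}

// Outputs exactly `len` bytes of the buffer, as std::cout.write() does in the client.
std::string show(const std::array<char, 128>& buffer, std::size_t len)
{
    std::ostringstream out;
    out.write(buffer.data(), static_cast<std::streamsize>(len));
    return out.str();
}
```

Whatever garbage is in the rest of the buffer, it never reaches the output.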

Client and server are two free functions in the same C++ file that I pushed to GitHub. Passing no parameter to its main you run it as a server, otherwise as a client.

This example is based on Daytime.4 and Daytime.5 from the official Boost ASIO tutorial.
