Showing posts with label C++. Show all posts
Showing posts with label C++. Show all posts

Codility CountDiv

We are asked to write a function that tells how many integers in a given interval are divisible by a given number.

The solution is very easy, if you don't get confused by the fact that this problem is in the Codility Prefix Sum section.

You could actually devise an implementation that makes use of partial sum, but it is not the right direction. An hint is that the expected complexity should be constant in time and size.

Let's write a couple of test cases (C++ for GoogleTest) while thinking to a better solution:
TEST(CountDiv, Given) // 1
{
    ASSERT_EQ(3, solution(6, 11, 2));
}

TEST(CountDiv, Minimal) // 2
{
    ASSERT_EQ(1, solution(0, 1, 3));
}

TEST(CountDiv, BothSides)
{
    ASSERT_EQ(7, solution(0, 12, 2));
}
1. Codility provided test case. In the interval [6, 11] there are 3 numbers divisible by 2: 6, 8, 10.
2. Remember that you can divide zero by any numbers, so in [0, 1] there is one number (zero) divisible by three.
3. A simple case, we want to check all the integer in [0 .. n].

And here is a possible solution:
int solution(int first, int last, int div)
{
    return (last / div) - (first / div) + !(first % div);
}
Firstly I get how many integers up to the end are divisible by div, then subtract the ones to the left. But remember to add the left limit, if it is needed.

***

You could think of saving some processor seeing that "div" is the dividend for both "last" and "first", and rewriting it as a single division. After all, we know that:
(A / C) - (B / C) = (A - B) / C
Right?

Yes, if we worked on floating point numbers. However, here we have integers, so we are applying the integer division operator, discarding each time the remaind, and having (sometimes) a different result from the floating point division operator applied to the same values.

For instance consider this test case:
TEST(CountDiv, Single)
{
    ASSERT_EQ(1, solution(6, 11, 7));
}
Here we are looking for the divisors of 7 in the interval [6, 11]. It is easy to spot that the solution is one, value 7 should be accepted. Applying the integer division to "last" and "first", we are actually answering to the question "how many values have seven as divisor in the interval [1, X]?"

In this context we can't apply the distributive property of division, because we are not working on the realm of real numbers.

We can't apply subtraction on "last" and "first" and then divide the result by 7, because this would mean that we are checking the size of the interval from "first" to "last" against 7.

If I try to be smarter that required, calculating (11 - 6) / 7, I get a zero. I'm not checking if there is a number in the interval that could be divided by 7, I'm checking how big is the interval compared to 7.

Go to the full post

Process name by procfs

In a UNIX(-like) environment, the init process is the parent all the other processes. It is usually created first at startup, with a 1 as pid (process identifier). However, there is no guarantee that is true in a specific environment. In my case, I found out that Ubuntu 13.10, Saucy Salamander, follows a different convention. We have an init process with pid 1, but our user processes are managed by a "user" init.Here I show how we could get on Linux the name of the executable associated to a specific process.

We get Linux system information checking the /proc (pseudo) file system. There we could also find a subdirectory for each process currently running on the system, identified by its pid. Here we are interested in checking the "comm" file. Notice that "comm" is available only from recent versions of the Linux kernel, in case it is not available on you environment, you could fallback to "cmdline". Besides, the "comm" stored value is truncated to TASK_COMM_LEN, that should be currently set to 16, characters.

It is easy to write a C++ function that does the trick, a bit more job is required for its C little brother. Here are their prototypes:
std::string getProcessName(int pid); // 1
bool getProcessName(int pid, char* name, int size); // 2
1. C++ let me define a clean interface. I pass in input the pid, I get back the name. If the pid has no associated process, I expect an empty string as output.
2. The C version is a bit clumsier. I need to pass as input parameter the buffer where to write the name, and its size. To simplify its usage I return also a boolean (you can change it to int, for older C compilers support) reporting for success of failure.

As usual, I write a few test cases (for Google Test) to let them drive me in the development:
TEST(ProcessName, GetInitPlus)
{
  ASSERT_STREQ("init", getProcessName(1).c_str()); // 1
  ASSERT_STREQ("init", getProcessName(1680).c_str()); // 2
}

TEST(ProcessName, GetOnePlus)
{
  ASSERT_STREQ("bash", getProcessName(2292).c_str()); // 3
}

TEST(ProcessName, GetMissingPlus)
{
  ASSERT_TRUE(getProcessName(8799).empty()); // 4
}
1. Pid 1 should refer to the init process.
2. My "user" init had that specific pid when I tested my code. You should change it as required.
3. Same as for (2), the pid of my bash shell was 2292, do not expect this test to succeed without proper editing.
4. I had no process with such id, so I expected an empty string as result.

TEST(ProcessName, GetInit)
{
  const int size = 16; // 1
  char buffer[size];
  ASSERT_TRUE(getProcessName(1, buffer, size)); // 2
  ASSERT_STREQ("init", buffer);
}

TEST(ProcessName, GetUserInit)
{
  const int size = 16;
  char buffer[size];
  ASSERT_TRUE(getProcessName(1680, buffer, size)); // 3
  ASSERT_STREQ("init", buffer);
}

TEST(ProcessName, GetOne)
{
  const int size = 16;
  char buffer[size];
  ASSERT_TRUE(getProcessName(2292, buffer, size));
  ASSERT_STREQ("bash", buffer);
}

TEST(ProcessName, GetMissing)
{
  const int size = 16;
  char buffer[size];
  ASSERT_FALSE(getProcessName(8799, buffer, size));
}
1. Accordingly to the official documentation, the constant TASK_COMM_LEN should be defined in "linux/sched.h". For some reason, I couldn't find it there or in any linux include. Maybe I have outdated includes on my machine. I didn't investigate much, and I used a local constant instead.
2. I assert that a process should be found and, in the next line, that the name should be "init".
3. As for the C++ version, check the actual pid for the "user" init on your current environment before testing.

Here is the C++11 version of my function:
std::string getProcessName(int pid)
{
  std::ifstream ifs { ("/proc/" + std::to_string(pid) + "/comm").c_str() }; // 1
  if(ifs.bad()) // 2
    return {};

  std::string command;
  std::getline(ifs, command);
  return command;
}
1. I try to open an input file stream for the proc/{pid}/comm file. The C++11 to_string() function converts an integer value to a string, so that its result could make use of the operator+ std::string overload to join it to the plain C-strings on its left and right. However ifstream requires a plain C-string in input, so a c_str() conversion on the result is required.
2. If the file can't be opened, I return an empty string. Otherwise the file content is extracted and returned.

And this is my C99 version:
bool getProcessName(int pid, char* name, int size)
{
  char filename[80];
  sprintf(filename, "/proc/%d/comm", pid);

  FILE* file = fopen(filename, "r"); // 1
  if(!file)
    return false;

  fgets(name, size, file); // 2
  name[strlen(name) - 1] = '\0';
  return true;
}
1. If I can't open the "comm" file for reading, I return an error, without caring about setting the buffer.
2. Otherwise I use fgets() to read the file content, then I get rid of backslash-n at the end of the string.

Go to the full post

CURLOPT_WRITEFUNCTION and C++

Even though I am developing for C++, I don't use the curlpp wrapper to libcurl, the well known file transfer library, preferring to access its bare C interface. I feel more comfortable in this way, still there are a few low level details that require to be explicitly considered. For instance, the CURLOPT_WRITEFUNCTION option accepts as parameter only addresses to free functions (or static member functions). When we want to use a (non static) member function, we have to be prepared to deal with a certain amount of ugliness.

I'd like Curl to put the data it fetches (for more details, please have a look to the previous post where I talked about the plain Curl setup) in a (non static) member variable of the same class from which curl_easy_perform() is called. Something like that:
class CurledClass
{
  // ...

private:
  std::string data_; // 1

  CURLcode curling(/* ... */)
  {
    CURL* curl = curl_easy_init();

    // ...

    CURLcode code = curl_easy_perform(curl); // 2
    curl_easy_cleanup(curl);

    return code;
  }
};
1. I want to store the answer I get from Curl in this data member.
2. Calling Curl to perform the job.

As I said before, we can't pass a pointer to a non-static member function to CURLOPT_WRITEFUNCTION, but we pass a pointer to a static function, but we can ask Curl to pass an extra parameter to that function. We can let that parameter, CURLOPT_WRITEDATA, to be the pointer to this object, so that our static function could actually call a non-static member function:
CURL* curl = curl_easy_init();

// ...
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &CurledClass::write); // 1
curl_easy_setopt(curl, CURLOPT_WRITEDATA, this); // 2
// ...

data_.clear(); // 3
// ...
curl_easy_perform(curl);
1. I specify the static member function I want to be called.
2. I put in the CURLOPT_WRITEDATA option the pointer to the current CurledClass object
3. It is usually a good idea to cleanup the buffer before using it.

The static function specified as CURLOPT_WRITEFUNCTION now just acts like an adapter to the member function I want to use instead:
class CurledClass
{
  // ...
  static size_t write(void* buf, size_t size, size_t nr, void* self) // 1
  {
    return static_cast<LbsTest*>(self)->mWrite(static_cast<char*>(buf), size, nr); // 2
  }

  size_t mWrite(char* buf, size_t size, size_t nr) // 3
  {
    data_.append(buf, size * nr); // 4
    return size * nr;
  }
};
1. Notice the last parameter, that I called "self". It stores the value we have put to CURLOPT_WRITEDATA.
2. The static function write() simply calls the member function mWrite(). To do that it (unsafely) casts the self parameter to this, and the first parameter to a pointer to char..
3. Here is the member function that does the job of copying the data from the Curl buffer to the local one.
4. Remember that Curl could call many times the CURLOPT_WRITEFUNCTION, and we are expected to splice the passed data to build the actual result. I am using a standard STL string as buffer, so I just append the new data as I get it.

Go to the full post

Bidimensional array annihilator

We have in input a bidimensional array of integers. We should modify it so that any row and column that has originally at least a single element set to zero would have all of them set to zero.

Conceptually it is an easy problem, its main issue is a technical one, since that working with multidimensional arrays in C/C++ has never been exactly a piece of cake.

The idea is to store in two boolean vectors the status of each row and column, setting to true anyone that needs a reset.

Then using that flags to set to zero the elements that need it.

Firstly, let's see how to implement a solution that won't use any STL container.

The C harder way

All is done through C-arrays and raw pointers.

Here is the prototype of the function I want to define:
void cAnnihilator(int* data, const unsigned nRows, const unsigned nCols);
The problem does not specify the array size, so we need to keep it open in our call. For this reason I am forced to pass to the function just the pointer to the first element.

A couple of test case:
TEST(cAnnihilator, CaseSimple) // 1
{
  const int nRows = 5;
  const int nCols = 3;
  int data[nRows][nCols] = {
    { 1, 2, 3 },
    { 4, 5, 6 },
    { 7, 8, 9 },
    { 10, 11, 0 },
    { 13, 14, 15 }
  };

  cAnnihilator(&data[0][0], nRows, nCols); // 2

  for(int i = 0; i < nRows; ++i)
    ASSERT_EQ(data[i][2], 0);
  for(int i = 0; i < nCols; ++i)
    ASSERT_EQ(data[3][i], 0);
}

TEST(cAnnihilator, CaseNothing) // 3
{
  const int nRows = 2;
  const int nCols = 3;
  int data[2][3] = {
    { 1, 2, 3 },
    { 4, 5, 6 }
  };

  cAnnihilator(&data[0][0], nRows, nCols);

  for(int i = 0; i < nRows; ++i)
    for(int j = 0; j < nCols; ++j)
    ASSERT_NE(data[i][j], 0);
}
1. The simple case creates a 5x3 array with only a zero element, after the call to the annihilator, all the elements in column 2 and row 3 are expected to be set to zero.
2. I pass to the annihilator the pointer to the first element in the array and its dimensions.
3. This "nothing" test case gives in input a 3x2 array with no zero element. No change is expected in output.

Here is how I implemented the function:
void cAnnihilator(int* data, const unsigned nRows, const unsigned nCols)
{
  bool rows[nRows]; // 1
  for(unsigned i = 0; i < nRows; ++i)
    rows[i] = false;
  bool cols[nCols];
  for(unsigned i = 0; i < nCols; ++i)
    cols[i] = false;

  for(unsigned i = 0; i < nRows; ++i)
    for(unsigned j = 0; j < nCols; ++j)
      if(*(data + (i * nCols) + j) == 0) // 2
        rows[i] = cols[j] = true;

  for(unsigned i = 0; i < nRows; ++i)
    for(unsigned j = 0; j < nCols; ++j)
      if(rows[i] || cols[j])
        *(data + (i * nCols) + j) = 0; // 3
}
1. The rows and cols boolean array are initialized to "all false". When I'll find a zero element I'll set the relative row and column flag to true.
2. Convert the current element position in bidimensional array notation to its plain distance from the first array element.
3. Set to zero all the elements in rows and columns that require it.

The C++ way

There are a couple of improvement I'd like to to on the function parameter. Firstly, I'd like to give more work to the compiler, let it the burden of deducing the array's size, instead of requiring the client to pass the number of rows and columns. And secondly, I'd prefer to use a STL container instead the raw C-array.

As a result of these consideration, I wrote this function declaration:
template<size_t N, size_t M>
void annihilator(std::array<std::array<int, N>, M>& data);
Now I am passing to the annihilator a reference to an STL array of arrays. A M-sized array of N N-sized int arrays. N and M are template parameters that are deduced from the passed bidimensional array, so that the user doesn't have to specify them in the call.

The test cases for this version of my function look like this:
TEST(Annihilator, CaseSimple)
{
  std::array<std::array<int, 3>, 5> data {{ // 1
    {{ 1, 2, 3 }},
    {{ 4, 5, 6 }},
    {{ 7, 8, 9 }},
    {{ 10, 11, 12 }},
    {{ 13, 14, 15 }}
  }};

  unsigned row = 3; // 2
  unsigned col = 2;
  data[row][col] = 0;

  annihilator(data); // 3

  for(unsigned i = 0; i < data.size(); ++i) // 4
    ASSERT_EQ(data[i][col], 0);
  for(unsigned i = 0; i < data[3].size(); ++i)
    ASSERT_EQ(data[row][i], 0);
}

TEST(Annihilator, CaseNothing)
{
  std::array<std::array<int, 3>, 2> data {{
    {{ 1, 2, 3 }},
    {{ 4, 5, 6 }}
  }};

  annihilator(data);

  for(unsigned i = 0; i < data.size(); ++i)
    for(unsigned j = 0; j < data[i].size(); ++j)
      ASSERT_NE(data[i][j], 0);
}
1. This bracket surplus is a bit annoying. It is not required by the C++11 standard but the current GCC implementation. Your compiler could understand a single bracket.
2. Set a value to zero.
3. The annihilator call is so much cleaner.
4. Ensure the annihilator works as expected.

And finally, here is my C++11 annihilator implementation:
template<size_t N, size_t M>
void annihilator(std::array<std::array<int, N>, M>& data)
{
  std::array<bool, N> cols {}; // 1
  std::array<bool, M> rows {};

  for(unsigned i = 0; i < M; ++i)
    for(unsigned j = 0; j < N; ++j)
      if(data[i][j] == 0) // 2
        rows[i] = cols[j] = true;

  for(unsigned i = 0; i < M; ++i)
    for(unsigned j = 0; j < N; ++j)
      if(rows[i] || cols[j])
        data[i][j] = 0;
}
1. The initializer list is much more expressive than a for-loop. All the elements in cols and rows are initialized to false.
2. Accessing STL-array elements as if they were C-array one (and they actually are, after all).

The source code is on github.

Go to the full post

Hello libcurl world

Reading a resource on the Internet through http via libcurl is very easy. Here I get the google news feed and I output it to the standard console.

Getting curl

There a number of ways to get the libcurl on your machine, accordingly to your development platform. You would probably want to check the official documentation on curl.haxx.se to get the right solution for you.

In my case, developing on a Ubuntu box for GCC C++, I could rely on the Debian repository to install the curl 4 development package for GnuTLS, simplifying a bit this step:
sudo apt-get install libcurl4-gnutls-dev
After that, I can see a new usr/include/curl directory containing all the .h files I need. There is also a libcurl.so, that I need to link to my C++ project. In my case, I find it in the /usr/lib/x86_64-linux-gnu directory (I have a 64 bit distribution).

Libcurl is written in C-language, and exposes an API in C. There is C++ wrapper, curlpp, that aims to provide a simpler access to them. I am not especially fond of it, what I usually do instead is creating my own thin C++ layer around the C-API. In any case, here I show the bare C access to the library.

Initialization

Before using the curl functionality in the code, we have to initialize the library. And, symmetrically, we should also cleanup when we are done. So, typically, we'll have something like:
#include <curl/curl.h>
#include <iostream>

// ...

CURLcode result = curl_global_init(CURL_GLOBAL_NOTHING); // 1
if(result != CURLE_OK) // 2
{
  std::cout << "Curl initialization failure" << result << std::endl;
  return result;
}

// ...

curl_global_cleanup(); // 3
1. There are a few option we could pass to the curl global initialization routine. Here I am happy with the plain vanilla setup, so I pass a nothing (more than the usual stuff) flag.
2. In case of error, we can't use curl.
3. We are done with curl.

CURL object

Once we have the curl library correctly initialized, we ask to it a CURL handler on which we will perform our request. Again, when we are done with it, we should cleanup.

The usual way to perform a call with curl would follow this pattern:
CURL* curl = curl_easy_init(); // 1
if(!curl)
  return CURLE_FAILED_INIT; // 2

// ...

CURLcode code = curl_easy_perform(curl); // 3
curl_easy_cleanup(curl); // 4

return code; // 5
1. The curl library exposes two interfaces. The classic one, and the easy one. Let's keep it simple.
2. If curl-easy can't provide us a CURL handler, we can't do anything more than returning an error code.
3. After setting up our request, we ask curl-easy to perform it. We get back a status code.
4. We are done with the session, clean it up.
5. The user could be interested in knowing what happened to his job. For instance, if the resource can't be fetched because a timeout occurred, a 28, defined as CURLE_OPERATION_TIMEDOUT in curl.h, is returned.

Setting options on CURL

We tell to libcurl what we want to do setting options on the CURL object we get back from the easy_init function. An example clarifies the matter:
curl_easy_setopt(curl, CURLOPT_URL, url.c_str()); // 1
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, &curlWriter); // 2
curl_easy_setopt(curl, CURLOPT_TIMEOUT, timeout); // 3
if(follow)
  curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); // 4
1. The URL I want it to access. I had stored it in a STL string, so I have to extract a C-string from it.
2. The address of a function that it has to use to give my back what it gets calling that URL. Be patient, this callback function is showed below.
3. I am very impatient. I want the call to fail if libcurl is not able to get result in a specific time. Be aware that the timeout is specified in seconds.
4. If the resource I request has changed is URL, libcurl could find just a redirect instruction there. Do we want to follow it or not? Setting this option to one (as long) means that, yes, we want to follow it.

Write function

As promised, here is the callback function I passed to libcurl to let it give me back what it got:
size_t curlWriter(void* buf, size_t size, size_t nmemb) // 1
{
  if(std::cout.write(static_cast<char*>(buf), size * nmemb)) // 2
    return size * nmemb; // 3
  return 0;
}
1. The minimal interface for the callback function used by libcurl expects three parameters, buf is the pointer to the block of memory where it put a chunk of the answer, size is the block size in which that chunk is organized, and nmemb is the number of members we have there.
2. I just get that block of memory, and send it to the standard output console, assuming it is text resource.
3. Here I am saying to libcurl that I have correctly managed all the stuff it passed me, please send me some more - if you have it.

Full source code on github.

Go to the full post

Palindrome string

You know what a palindrome is, right? For instance, SOS by ABBA, is a song with a palindromic title played by a pop group with a palindromic name.

It is easy to write a function that detects if a string is a palindrome, so an interviewer sometimes asks it as a first question, just to break the ice.

Here is a no-frills implementation in C:
int cPalindrome(const char* input)
{
    if(input == NULL) // 1
        return 1;

    int len = strlen(input); // 2
    for(int i = 0; i < len / 2; ++i) // 3
    {
        if(input[i] != input[len - i - 1]) // 4
            return 0;
    }

    return 1; // 5
}
1. If the caller pass a NULL, should we consider it a palindrome or not? It is questionable, at it should be a matter of a little chat to decide what the expected behavior should actually be. I voted for the most permissive option.
2. Painfully, we have to loop on the string once, just to get its length. But we have no choice, we can't even start checking the string if we don't know where is its end.
3. We check each couple of elements, moving from the extremes to the center.
4. As soon as we detect a mismatch, we return a "No, sir, this is not a palindrome!".
5. All string checked (notice that it could be even an empty one), no difference spotted, we claim it is a palindrome.

We can rewrite it in C++, and the result is an amazing one-liner:
bool palindrome(const std::string& input)
{
    return std::equal(input.begin(), input.begin() + input.length() / 2, input.rbegin());
}
Right, one line, but an intense one. There are a few things to notice.

No check on NULL string. It is not possible to convert a NULL to a string object, so such test, when required, should be moved outside the function.

To compare the elements in the string, I used the STL equal() function (defined in the algorithm include file). First two parameters are the delimiter for the first sequence to be checked, third one is the begin of the second sequence. Notice that the third parameter is rbegin(), the "reverse begin" of the string, the iterator to the last valid element of the container (or, in this case, string) that, incremented, moves to the left, and not to the right, as we usually expect in an iterator.

A variation to this classical question, is asking to write it as a recursive function. Here is a possible C++ implementation:
bool rPalindrome(const std::string& input)
{
    if(input.length() < 2) // 1
        return true;
    if(input.front() == input.back()) // 2
        return rPalindrome(input.substr(1, input.length() - 2)); // 3
    return false; // 4
}
1. Empty strings, or single-character ones, are palindrome, without need of any other test.
2. The current string has more that two characters, let's check the one to the extreme left with the one to the extreme right. If they are the same, the string is a palindrome if and only if the internal string is (recursively) a palindrome.
3. So we check the internal string, that is extracted from the current one using the STL string::substr() method.
4. Otherwise the current string is not a palindrome.

Go to the full post

Reading about ZeroMQ

I have just got this book, ZeroMQ - Use ZeroMQ and learn how to apply different message patterns, written by Faruk Akgul for Packt Publishing, and I about to read it. Some more substantial feedback in the near future.

It is a bit under my level, since it is thought for C developers at their first experience with ZeroMQ, but it is a while I don't actually work with this fine framework, and I felt it would be nice to have a refresh of the basic concepts starting from a different perspective.

Browsing the index, you could see how the book is structured to provide an introduction to ZeroMQ, letting you know how to write a simple C client-server application, describing how it works. Chapter two is mainly dedicated to a couple of messaging patterns, pub-sub and pipeline, and there is also a section dedicated to Valgrind, and how to use it to detect memory leaks on 0MQ. Chapter three gives more details on what a ZeroMQ socket is, and how to use it. In its second part, an introduction to CZMQ, the high(er) level C wrapper to the standard ZeroMQ library, is given. Chapter four delves a bit more on some more advanced topics.

The code in the book has been written for ZMQ version 3.2, and CZMQ 1.3.1; for what I have seen, the building instruction are provided only for the GCC compiler (version 4.7.2 is cited, I quite confident the newest 4.8 would be alright).

On Windows, Visual C++ is the suggested compiler. I couldn't see any detail on how to set MSVC for ZeroMQ in the book, still I could assure you that it is very easy to build a ZMQ solution in that environment too. If you need an hint, have a look at this ancient post of mine, that should be still valid.

Go to the full post

Matching bracket

Think of a string (it could be very long) composed exclusively by round brackets (aka parentheses). We want to check if it is correct, in the sense that each open bracket should have a matching close one. So, a string starting with a ")" is considered wrong, while the "()()()" string is a good one. Nested brackets are allowed, so "((()())(()()))" is a good sequence.

It is not difficult to think to a solution, still, you should not forget that unharmful looking hint on the possible large input size.

I bumped into this problem when having a coffee-break with colleagues. We were talking about interview questions, and someone remembered this one. In a few minutes we envisioned a couple of possible solutions, neither of them looked satisfactory. After a bit more of talking, and we found out a third candidate, that looked quite good.

First attempt

Naively, we could iteratively remove each "()" from the input string, until we could not find anymore such pattern. After that, we simply check if the string is empty. The good thing about this solution is that is easy to describe and implement. Less convincing is that we should copy the input to a buffer, and then perform on it a lot of splitting and splicing of substrings.

Let's see a possible C++ implementation of this idea:
bool simpleChecker(const std::string& input)
{
  std::string buffer(input);

  if(buffer.empty()) // 1
    return true;

  if(buffer.length() % 2) // 2
    return false;

  while(true)
  {
    std::string::size_type pos = buffer.find("()"); // 3
    if(pos == std::string::npos)
      break;

    buffer.replace(pos, 2, ""); // 4
  }

  return buffer.empty(); // 5
}
1. The requisites do not say if an empty string should be considered valid or not. Let's assume it should.
2. If we have an odd number of elements, the string is obviously invalid.
3. We stop looping if we can't find a "()" element.
4. When we find a "()", we replace it with an empty string (i.e., we remove it).
5. If the string is empty, the validation succeeded.

This code works fine, even for moderately large strings. In the best case, a monotonous sequence of "()()()()", the standard string replace() function is so nicely written that on my machine it takes a handful of millisecs to check an input of thousands characters. On the other side, it is easy to spot a sequence that gives troubles to replace(). The test case here below takes me something around one second on a 50K string:
TEST(CPPSimple, FiftyKWorstCase)
{
  const int SIZE = 50 * 1024;

  char buffer[SIZE + 1];
  for(int i = 0; i < SIZE / 2; ++i)
    buffer[i] = '(';
  for(int i = SIZE / 2; i < SIZE; ++i)
    buffer[i] = ')';
  buffer[SIZE] = '\0';

  EXPECT_TRUE(simpleChecker(buffer));
}
And his bigger brother, 250K sized, takes something like half a minute.

Recursive approach

The main issue of the first try is that it modifies the string. A better approach would be checking the string for each matching parenthesis, without doing any change. We read the first character in the passed string, it should be an '(', if the next one is a ')', cool, we have completed a sequence, we can move to the third character. Otherwise, we call recursively the same function, to extract a subsequence. It should return the length of the found sequence, or zero, if the subsequence is not valid. The caller would check the returned value, and move in the string accordingly, to see if the next character closes its sequence, or if it has to call again itself to check if we have another subsequence.

As you can see, this solution is more contrived, but promises to be more performant. Still, it has a big issue. In the same worst case as seen above, we have such a huge number of recursive function calls that it could easily lead to corruption in the memory stack.

Just count them

Yes, just count the brackets. After all, what we want is having the same number of open and close parenthesis.

Actually, we have also to take care that we don't want to see a close bracket when there is nothing to close, but it is easy to take care of it. Each time we have an open bracket, we put it on a stack, and we pop out one of them as we get a close bracket. If we get a closing bracket before any opening one has been scanned, the stack is empty, and so we know the string is not correct.

Here is how I have implemented this solution:
bool stackChecker(const char* input)
{
  int tracker = 0; // 1

  for(int i = 0; input[i] != '\0'; ++i)
  {
    switch(input[i])
    {
    case '(': // 2
      ++tracker;
      break;
    case ')': // 3
      if(tracker)
        --tracker;
      else
        return false;
    } // 4
  }

  return tracker ? false : true; // 5
}
1. I use this integer as the stack I talked above. Since I am not interested in the actual position of the brackets, I don't need a real stack, I just need to keep track of how many open brackets have been read.
2. I see an open bracket in the string, put it in the stack, and go to the next character.
3. Close bracket. If the stack is not empty, it matches with the last open bracket in the stack, I pop it and I am ready to check the next one. If the stack is empty, the input string is not valid.
4. What if the string contains anything else? I assumed a "don't care" behavior, but possibly the string could be considered invalid. If that is the case, a default section should be added to the switch, to always return false.
5. The complete string has been scanned, if no open bracket is left in the stack, we consider it valid.

Go to the full post

Redis hello world

Once your environment is set up for Redis, it is easy to use the hiredis C interface to work with it. Here I write and use a minimal, shamelessly almost-C, bunch of functions to connect, set, get, and finally disconnect from a Redis server.

I am writing in C++, but here you won't see much of a difference from a plain C implementation. This is to keep the example focused on the hiredis features. I plan to refactor this code to have some C++ fun in a next post.

What I want to do, is connecting to a Redis server (assuming it runs locally on the default port), store on it a key/value pair, and immediately fetch it back. Something like this:
redisContext* ctx = TTR::connect("localhost", 6379); // 1
if(ctx)
{
    std::string key("key"); // 2
    std::string value("value");

    TTR::set(ctx, key, value); // 3
    std::string cache = TTR::get(ctx, key); // 4
    if(value == cache)
    {
        std::cout << "Value stored and fetched correctly" << std::endl;
    }
    else
    {
        std::cout << "Something weird happened" << std::endl;
    }

    TTR::disconnect(ctx); // 5
}
1. redisContext is the hiredis structure that keeps the context for a connection to Redis. Namespace is one of the few C++ features I'm using here, all my wrapper functions are in a namespace named TTR, so to avoid any name clash. The TTR::connect() function is one of them and, as you should expect, it establish a connection to the specified Redis server, or returns NULL in case of failure.
2. The key/value pair that I want to store on Redis.
3. Push a key/value to the server.
4. Fetch the value from the Radis server for the same key I have already used for the set(). I wouldn't ever expect the fetched value being different from the original one.
5. Do not forget to disconnect!

Writing code like that in the real world is asking for trouble. Converting my bunch of free functions in a class is easy, straightforward and immediately pays off, saving the nuisance of passing around the Redis context and be forced of taking care of its disposal. But I'll do it another time.

Let's now see how I have implemented my functions - remember that all of them are in the TTR namespace.

The most interesting one is connect():
redisContext* connect(const std::string& server, int port)
{
    std::string sport = server + ":" + boost::lexical_cast<std::string>(port); // 1

    if(!port || server.empty()) // 2
    {
        std::cout << "Can't connect to Redis [" + sport + "]" << std::endl;
        return NULL;
    }

    redisContext* context = redisConnect(server.c_str(), port); // 3
    if(!context) // paranoid
    {
        std::cout << "No memory for Redis on " + sport << std::endl;
        return NULL;
    }

    if(context->err) // 4
    {
        std::cout << "Can't connect to Redis on " + sport + " - " + context->errstr << std::endl;

        redisFree(context); // 5
        return NULL;
    }

    std::cout << "Connected to Redis [" + sport + "]" << std::endl;
    return context;
}
1. I concatenate the server name to the port number for debugging purpose, notice the usage of the handy boost lexical cast to convert an integer to a C++ string.
2. Test for valid user input. It could, and probably should, be more strict. But you get the idea.
3. Call the hiredis connection function. It would fail, returning NULL, for an out of memory problem. This is quite improbable. Still, better safe than sorry.
4. When there is a trouble connecting to the Redis server, we have it reported in the fields err and errstr on the Redis context object returned. If this is the case, I log a message, release the context, and return a fat NULL to the caller.
5. Remember, any context returned by redisConnect() has to be cleaned up calling redisFree().

The disconnect() is so boring that one would happily hide it in a class destructor:
void disconnect(redisContext* context)
{
    if(context) // 1
    {
        std::cout << "Disconnecting from Redis" << std::endl;
        redisFree(context);
    }
}
1. Passing a NULL to redisFree could result in a disaster (AKA a segmentation fault), so, I'd better check for it.

The real stuff, setting and getting a value on Redis, is done by these functions:
void set(redisContext* context, const std::string& key, const std::string& value) // 1
{
    if(!context) // 2
        return;

    void* reply = // 3
        redisCommand(context, "SET %b %b", key.c_str(), key.length(), value.c_str(), value.length());
    if(reply)
    {
        freeReplyObject(reply);
        return;
    }

    // unexpected
    std::cout << "No reply from Redis" << std::endl;
}

std::string get(redisContext* context, const std::string& key)
{
    if(!context)
        return "";

    redisReply* reply = static_cast<redisReply*>(redisCommand(context, "GET %b", key.c_str(), key.length()));
    if(!reply)
        return "";

    std::string result = reply->str ? reply->str : ""; // 4
    freeReplyObject(reply);
    return result;
}
1. We don't care about the Redis reply, any failure in setting a value for a specified key for the passed Redis context is (almost) silently ignored.
2. As we have already seen, Redis doesn't check if the context we pass to it is good or not. To avoid an unpleasent segmentation fault, it's better to check it ourself.
3. Here I don't care of what is the actual reply of the Redis server, I only check if it actually emits an answer and, if so, I clean it up.
As you can see, redisCommand() is designed to be similar to the standard C fprintf() function. First argument is the Redis context on which the call is performed, Than we have a string, containing the name of the actual operation (here is SET) and any required parameter, identified by a percent flag. If you know that you are about to send plain strings with no special character in it, you can use the "%s" placeholder. By I want to play safe, so I use the "%b", that allows binary strings to be sent. In this case I have to double the number of subsequent arguments, passing both the relative C-string and its size.
4. When I GET from Redis, I am much more interested in the reply object, so I cast the redisCommand() return value (a void pointer) to a pointer to its actual type, redisReply. I need its str field, that contains the value that Redis stores for the key passed as GET parameter, so firstly I check if the reply is not NULL, then I copy its str content in a C++ string, so that I can safely clean the reply object up before returning.

Go to the full post

Preparing to write a Redis client

I have to add some caching capabilities to a C++ module in my current project. There are a few viable alternatives, but we already have a Redis server available, and we don't have any compelling reason to look to anything else. So, it is almost done, I have just to download Redis source code, and preparing the environment to write a tiny test client.

There are a few ways to install Redis, I downloaded it from the official site, redis.io, following the instructions you could find there. In a matter of minutes I had a plain Redis server up and running and I checked it through redis-cli (Redis command line interface utility).

There are plenty of available Redis clients for different programming languages and you typically want to get an existing one matching your elected language. But in C++ case, I see only one alternative in the list proposed by Redis, and there are a few reasons not to peek it up. The code in the repository is quite old (2/3 years), and in the list it is not marked as recommended. So I fell back to hiredis, the official C client, included in the distribution (you could find it under deps/hiredis).

All I need is at hand, I just have to write a Makefile (you can download it from github), and then I am ready to write some code:
#
# Makefile for hiredis client
#

MY_NAME := hrc
MY_SRCS := $(wildcard *.cpp)
MY_OBJS := ${MY_SRCS:.cpp=.o}

MY_INCLUDE_DIRS := /usr/local/include/hiredis
MY_LIBRARY_DIRS := /usr/local/lib
MY_LIBRARIES := hiredis

CXXFLAGS += $(foreach includedir,$(MY_INCLUDE_DIRS),-I$(includedir))
CXXFLAGS += -Wall -g -std=c++11
LDFLAGS += $(foreach librarydir,$(MY_LIBRARY_DIRS),-L$(librarydir))
LDLIBS += $(foreach library,$(MY_LIBRARIES),-l$(library))

.PHONY: all clean

all: $(MY_NAME)

$(MY_NAME): $(MY_OBJS)
    $(LINK.cc) -o $(MY_NAME) $(MY_OBJS) $(LDLIBS)

clean:
    @- rm -rf $(MY_OBJS) $(MY_NAME)
My client is going to be named hrc (as MY_NAME shows), it is written in C++, and the code is contained in .cpp files (MY_SRCS). Any cpp file would have a matching object file identified by the "o" extension (MY_OBJS).

The only custom directory I want to include now is the one for the hiredis client (MY_INCLUDE_DIRS), and I want the make tool to check just in one custom directory for non standard library (MY_LIBRARY_DIRS), and the only custom library I currently want (MY_LIBRARIES) is hiredis. You would probably ask makefile to look in other directories. Remember that the actual library names are decorated versions for the bare name you should put in the makefile. In this case, Redis compilation has generated an archive named libhiredis.a, for static linkage, and a shareable object named libhiredis.so.

Besides, the linker would search for the actual versioned file, as maintained by ldd. In the worst case scenario, it could happens that the linker would complain for missing a versioned so. You could fix this by ldconfig, or even by hand, creating a reference to the missing file through symlink.

I add to the CXXFLAGS firstly the dependencies for include directories and then a bunch of option - I want all the warnings up (-Wall), the executable filled with debug information (-g), and the code generated to support the latest C++ standard (-std=c++11).

In the LDFLAGS I add the dependencies for library directories, and in LDLIBS the actual requested library names.

An then I defines a few targets. A couple of phony ones, "all" and "clean", as tradition wants, and the one that actually creates my target.

Notice that I don't specify explicitly the name of the C++ compiler, but I rely on the make tool to be smart enough to deduce it from the environment. For this reason I use LINK.cc, that in my case contains the expected g++.

In the next post, I am going to write some C++ code meant to be compiled with this Makefile.

Go to the full post

Httpd virtual hosts

I wanted to manage a couple of web sites, let's call them one.dd and two.dd, with my Apache Web Server, and I wanted them to live on the same machine, sharing the same IP address. We know that to do that in the real life, I could not choose randomly a fancy name, like I have just said, but I have to register a proper name under well known limitations. But if I play just on my local machine(s), I can forget about that, and being free and foolish. Still I have to follow a few basic rules.

I am working on a Debian box, on a Apache httpd 2.2 built from scratch, downloading the package from the official Apache site. I reckon you can adapt very easily what I have done to your current setup.

Setting the hosts

The operating system should be aware of the names I want to use on the current machine. This is done in a text file, typically (for *x environments) named /etc/hosts. There we see, among the other things, the standard mapping between 127.0.0.1 and localhost, and we are about to extend it to add our two host names:
127.0.0.1 localhost one.dd two.dd
Setting the httpd configuration

Apache has to know how to manage our virtual hosts, too. The standard http configuration file, conf/httpd.conf, has a commented line that, when activated, includes the specific configuration file for virtual hosts.
# Virtual hosts
#Include conf/extra/httpd-vhosts.conf
It is usually considered a better idea to let the provided example alone, and work on a different file.

This is my virtual host configuration file:
# Virtual Hosts
NameVirtualHost *

<VirtualHost *>
    DocumentRoot /site/www/one.dd
    ServerName www.one.dd
</VirtualHost>

<VirtualHost *>
    DocumentRoot /site/www/two.dd
    ServerName www.two.dd
</VirtualHost>
I guess this is the simplest configuration file one could conceive.

The directive NameVirtualHost says to Apache that we want to attach one or more virtual hosts to the specified address/port. Here I passed a star to it, meaning "anything you get to this point". Usually you want to be more choosy. Besides, I didn't specify any port number. In this case, Apache assumes I expect it to use the one specified in the Listen directive.

Then I have a VirtualHost block for each host I want to define. If anything not matching with the ServerName's specified is getting here, the first one is considered as the default one.

The DocumentRoot says to Apache which directory to use as root for the site. I have created the specified document root directories, and put in both of them an index.html file.

Looks easy, doesn't it? Still, even at this basic level, there are a few thing that could go wrong. And the resulting error messages could look cryptical.

Wrong!

If NameVirtualHost is not matching with any VirtualHost (a different port number is enough) Apache doesn't know what to do of that directive, and a "NameVirtualHost has no VirtualHosts" warning is issued at startup.

I have already noted that if the NameVirtualHost port is not explicitly given, the one specified in the Listen directive is used. But you should ensure to keep the same convention for the associated VirtualHost, too. Otherwise you could get a "VirtualHost mixing * ports and non-* ports with a NameVirtualHost address is not supported, proceeding with undefined results".

Go to the full post

Simple Apache module

My first Apache module needs to be improved in many ways. Here I am addressing to a few basic requirements, answering only to one specific request; logging using the Apache built-in facility; and generating an HTML document as answer.

We want to set our Apache web server so that it would answer with an HTML page containing a few information on the actual received request, and we want it to give this feedback only when we ask for "hello" on it.

Request-module connection

Apache should know about how to associate a user request to the handler that our module is designed to manage. To do that we add a Location directive in the Apache httpd configuration file.

You'll find this file in the Apache conf directory, named httpd.conf, we open it and we add this section:
<Location /hello>
    SetHandler hello
</Location>
We ask Apache to set the handler "hello" so that it is invoked when is issued a request to the hello page directly under the root of my server.

Secondly, we change the C source code, so that we check the passed request, and we ensure the passed handler matches our expectation:
// ...

static const char* MOD_HANDLER = "hello"; // 1

static int handler(request_rec* r)
{
    if(r->handler == NULL || strcmp(r->handler, MOD_HANDLER)) // 2
    {
        // ...
        return DECLINED; // 3
    }

    // ...

    return OK; // 4
}
1. The handler for this module, it's value is used in httpd.conf, as shown above.
2. Check the handler as passed by Apache.
3. If our module has nothing to say here we return DECLINED, to let know to Apache that it has to look elsewhere.
4. Otherwise, after we did our job, we return OK, saying in this way that Apache could considered the request accomplished.

Apache logging

It is often a good idea to log what is going on in our code, both for debugging and administrative purpose. In this module, it would be nice to have a feedback also when a request is discarded. We get this effect adding this line before "return DECLINED":
ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, "handling declined: %s", r->handler);
The Apache ap_log_rerror() lets us writing to the Apache error log file, you'll find it in the logs folder, in your Apache httpd installation directory, named "error_log".
APLOG_MARK is just a way to combine the standard __FILE__ and __LINE__ defines in a single symbol, to save a few keystrokes.
After it, we specify the log level, that ranges from APLOG_EMERG down to APLOG_DEBUG (not to mention the TRACE defines, not commonly used, as far as I know). In the httpd.conf file we configure which level of logging we actually want to print to the log file, setting the log level:
LogLevel debug
As you can imagine, in production it is usually a smart move to set the configured log level higher than debug.
Next parameter, here set to 0, is not very interesting here, and it is followed by the pointer to the request, as we get it from Apache.
Finally we have a string, representing what we want to log, and that could include printf-style parameters.

Building a HTML response

Admittedly, an Apache module is not the most programmer-friendly tool to create a HTML page. But it still make sense in such a simple case:
ap_set_content_type(r, "text/html"); // 1
ap_rputs("<html><head><title>Hello Module</title></head><body>", r); // 2
ap_rprintf(r, "handler: %s<br />\n", r->handler); // 3
ap_rprintf(r, "filename: %s<br />\n", r->filename);
ap_rprintf(r, "the_request: %s<br />\n", r->the_request);
ap_rprintf(r, "header_only: %d<br />\n", r->header_only);
ap_rprintf(r, "hostname: %s<br />\n", r->hostname);
ap_rputs("</body></html>", r);
1. Firstly, set the reply content type.
2. To put a static string, as this one, ap_rputs() does an excellent job.
3. When we need to send parametrized stuff, we'd better using ap_rprintf().

As you can see, I generated the answer from the request, extracting a few (more or less) interesting values as received from Apache.

The complete C code for this module is on github. To compile it and add the generated .so to the server, you can use the apxs utility (more details in the previous post). Remember to set the Apache configuration file with the Location/SetHandler directives.

Go to the full post

A minimal Hello World Apache module

After you have set up the Apache development environment, it is time to create a first module.

I tried to create a minimal module, following the K&R's "Hello World" spirit, that I reckon is so rewarding when you are approaching new stuff.

This module is going to be very impolite, trying to answer to all the user requests to the Apache server in the same way. It would even override the standard index.html page on root.

It is made of a module declaration, logically the first thing we are interested in, but traditionally placed at the bottom of the file, and a couple of static functions (I forgot to mention it, but I am developing in plain old C language).

One of the two functions, hooks(), is passed to the module declaration, and it sets a connection between Apache and the other function, that I named handler(), that it is going to be called to handle the user's jobs.

Here is the three guys above mentioned in detail:
static int handler(request_rec* r) // 1
{
    ap_set_content_type(r, "text/plain"); // 2
    ap_rputs("Hello Apache httpd module", r); // 3
    return OK; // 4
}

static void hooks(apr_pool_t* p) // 5
{
    ap_hook_handler(handler, NULL, NULL, APR_HOOK_REALLY_FIRST); // 6
}

module AP_MODULE_DECLARE_DATA hello_module = // 7
{
    STANDARD20_MODULE_STUFF, NULL, NULL, NULL, NULL, NULL, hooks // 8
};
1. This function is called by Apache so that we can provide a reply to a client request. As we can see, as parameter we get a pointer to a structure that actually represents the user request.
2. I am not doing any check here, I always prepare a plain text reply, by calling the Apache function that sets the content type, ...
3. ... and I fill it with a puny string, with the Apache version of the well known puts() function, but reinterpreted for working on a request_rec.
4. Finally I return OK, meaning that my module has been able to fulfill the user request, and Apache could happily consider this job as done.
5. Here I set the hooks that let Apache know which are the functions in my module it can call.
6. The first parameter to ap_hook_handler() is a function that Apache could call to reply to a user request, and the last one is the priority this hook should have in the collection of hooks owned by Apache. Here I am saying that I want full priority.
7. Here is my module declaration. It is known to Apache by the suggestive name of hello_module.
8. And this are a bunch of information we are passing to Apache about our module. The STANDARD20_MODULE_STUFF define is an aggregate of constants that are saying to Apache this is a standard module version 2, and there is not much more to say about it. We'll say something more on the subsequent five NULLs, but more interesting is the last parameter, the function name Apache needs to know to perform the module initialization.

This is almost everything about it. There are a couple of header inclusions you have to perform to let the compiler knowing what the heck are all those ap_... things, namely httpd.h and http_config.h, but you can see the full code on github.

And, well, you have to compile and register this code before you can actually use it on Apache. To do that, there is a nifty Apache utility, apxs, that basically does all the job for us.

In this case, I would call it something like this:
/path/to/apache/bin/apxs -a -i -c mod_hello.c
Then I stop and start Apache, run it, submit any request whatsoever to it through a we browser, and I should always get back the same reply.

Go to the full post

Installing Apache Httpd

In my current project I am also developing some stuff on the Apache HTTP Server, also known as Apache httpd, or even just Apache, in a Debian box. The environment setup is not complicated, but it does have a couple of twists. So I reckoned it was worthy to put down a few notes about the process.

Currently, on the official Apache httpd download page, we have access to three versions (2.0, 2.2, and 2.4) and a number of different formats, ready to be grabbed.

Since I want a developer install, I should avoid the lean standard cuts provided by apt-get (for Debian) or packaged in a msi (for Windows), and I should go for the "Source" releases. So in my case, I go for a 2.2 (this is the version used at work) Unix Source download.

If you check your chosen link, you would see a different provider accordingly to your geographical position, but the archive name should end like ".../httpd/httpd-2.2.22.tar.gz" (being 2.2.22 the current 2.2 version). I downloaded it through wget, and then I have extracted it by tar xvfz (that is why I have picked up the tar.gz flavor), getting all the raw stuff in a subfolder. I changed directory to there and, before starting the real installation process, I decided where to put the thing, let's call it $APACHE2, that would usually be something like $HOME/apache2.

Firstly I have to prepare the configuration, this is usually done by calling the command
./configure --prefix=$APACHE2
In my case, I want to enable the Dynamic Shared Object (DSO) Support, so that I could install shared modules in a next step. To do that, configure has to be called passing the enable-so option too:
./configure --prefix=$APACHE2 --enable-module=so
Once configure has run, it is time to make Apache. This needs two steps, a first call to "make", and a second one, to "make install".

Almost done. Still I had to set the Apache endpoint in its configuration file: I went to $APACHE2/conf, I edited httpd.conf, setting the property Listen to localhost:8080 - a common setup.

Now I can start and stop my Apache HTTP server, going to $APACHE2/bin, and running:
./apachectl start
./apachectl stop
After starting, and before stopping, I should be able to connect from my web browser to the Apache local home page, by accessing the page on localhost:8080 - if this is not the case, that means I have some trouble.

Building Apache 1.3 on a modern environment

If you need to install a legacy Apache httpd server, you would follow more or less the same procedure, but you could bump in a couple of issues.

Firstly, running configure could lead to errors and a garbled output. This is easy to solve, just run it through a new bash shell, like this:
bash ./configure --prefix=$APACHE13 --enable-module=so
Secondly, you could get a failure in making Apache, due to the use of the symbol getline, that now is part of the C standard library.
The solution here is editing the offending files, htdigest.c htpasswd.c logresolve.c, to rename the local getline function.

Go to the full post

Building czmq on MSVC

CZMQ, the high level C binding for ØMQ, is available for download from the zeromq.org official page, currently in version 1.1.0. The problem is that, if you want to use it for Visual Studio, there are a few issues.

Firstly, the zip file currently available misses some stuff, most notably, the solution files for MSVC. You can overcome this, getting the latest czmq version from github:
git clone git://github.com/zeromq/czmq

Next step is compiling it, and, as one should expect, the czmq source code includes references to zeromq. You can get it from here:
git clone git://github.com/zeromq/zeromq2-x

Now you are ready to open the czmq solution in the win32 subfolder. Ensure in the project - properties that the relation to zeromq is carried on correctly.

Compile it. If no changes occurred from when I have written this post, compiling for MSVC2010 you should get a few errors:
1>..\src\zframe.c(461): error C2440: 'initializing' : cannot convert from 'void *' to 'char *'
1>..\src\zmsg.c(194): error C2440: 'initializing' : cannot convert from 'void *' to 'zframe_t *'
1>..\src\zmsg.c(503): error C2440: '=' : cannot convert from 'void *' to 'byte *'
1>..\src\zmsg.c(772): error C2440: 'initializing' : cannot convert from 'void *' to 'byte *'
1>..\src\zsockopt.c(138): error C2440: 'initializing' : cannot convert from 'void *' to 'char *'
Explicit cast are required, patch the code like this:
// zframe.c:461
char *buffer = static_cast<char*>(malloc(1024));

// zmsg.c:194
zframe_t *frame = static_cast<zframe_t *>(zlist_pop (self->frames));

// zmsg.c:503
*buffer = static_cast<byte *>(malloc (buffer_size));

// zmsg.c:772
byte *blank = static_cast<byte *>(zmalloc (100000));

// zsockopt.c:138
char *identity = static_cast<char*>(zmalloc (option_len));
Once corrected these errors, you should complete correctly the czmq project, producing czmq.lib in the win32 folder.

Create a project that uses zmq and czmq, setting properly include and library directories, and a reference to the actual lib files among the linker additional dependencies.

Finally you can write a first tiny hello czmq application that just instantiate and destroy a zmq context:
#include <czmq.h>

// ...
{
    zctx_t* ctx = zctx_new();
    assert(ctx);
    zctx_destroy(&ctx);
    assert(ctx == NULL);
}

Go to the full post

Hello Windows Sockets 2 - client

The Winsock API is a collection of quite low level functions. As we have seen in the previous post, writing an echo server, substantially a simple procedure (wait for data, send them back, shutdown when the connection closes), it is not complicated but longish, we have access to the tiniest detail in the process. The most tedious part of the job is that we have to specify any single step by hand, even the ones that could be easily automatized. A slender C++ wrapper would be enough to make the job much more simpler, but till now I haven't found around anything like that. Any suggestion is welcomed.

Let's now complete our echo application, thinking about the client side.

Initializing Winsock

This is the same stuff that we have already seen for the server side. We need to enclose out code in a pair of function call to WSAStartup() and WSACleanup(), the Ws2_32.lib should be linked to our project, and we have to include a couple of header files, winsock2.h and ws2tcpip.h, in our C/C++ source file.

Open/close socket

Almost the same as for the server:
void client(const char* host, const char* port) // 1
{
    ADDRINFO hints;
    wsa::set(hints, 0, AF_UNSPEC, SOCK_STREAM, IPPROTO_TCP); // 2

    ADDRINFO* ai = wsa::get(host, port, &hints);
    if(ai)
    {
        wsa::list(ai); // 3

        SOCKET sk = wsa::createClientSocket(ai);
        freeaddrinfo(ai);

        if(sk != INVALID_SOCKET)
        {
            coreClient(sk); // 4
            closesocket(sk);
        }
    }
}
1. The user should provide us both machine address an port, so that we can connect to the server socket.
2. Tiny variation, we specify AF_UNSPEC as address family, so that both TCP/IP v4 and v6 are accepted, when available. This does not make much sense here, since we already know that the server is using a TCP/IP v4, but in a real case scenario could be a useful trick.
3. We asked to winsock for all the address info matching our requirements. If its current version supports both TCP/IP v4 and v6, we should get both of them. This tiny utility function, see it below, checks all the retrieved address info and dump to the console some of their settings.
4. The rest of the code is just the same of what we have already seen for the server, before calling the specific client code, we create the socket, and then we'll close it. Notice also that we had to free by hand the memory associated to the address info.

As promised, here is the address info listing utility function:
namespace wsa
{
    void list(ADDRINFO* ai)
    {
        while(ai)
        {
            std::cout << ai->ai_family << ' ' << ai->ai_socktype << ' ' << ai->ai_protocol << std::endl;
            ai = ai->ai_next; // 1
        }
    }
}
1. ADDRINFO is actually a linked list of records. Its ai_next field points to the next element, if any. So here we are looping on all the items available.

Send and receive

Here is the client specific code:
void coreClient(SOCKET sk)
{
    char* message = "Hello Winsock 2!";

    int size = send(sk, message, strlen(message), 0); // 1
    if(size == SOCKET_ERROR)
    {
        std::cout << "send failed: " << WSAGetLastError() << std::endl;
        return;
    }

    std::cout << size << " bytes sent" << std::endl;

    if(shutdown(sk, SD_SEND) == SOCKET_ERROR) // 2
    {
        std::cout << "shutdown failed: " << WSAGetLastError() << std::endl;
        return;
    }

    do { // 3
        char buffer[BUFLEN];
        size = recv(sk, buffer, BUFLEN, 0); // 4
        if(size > 0)
        {
            std::cout << "Buffer received: '";
            std::for_each(buffer, buffer + size, [](char c){ std::cout << c; });
            std::cout << '\'' << std::endl;
        }
        else if(!size)
            std::cout << "Connection closed." << std::endl;
        else
            std::cout << "recv failed: " << WSAGetLastError() << std::endl;
    } while(size > 0);
}
1. The message is sent through the socket, winsock returns the number of bytes actually sent, or an error code.
2. This simple application sends just one message, so here we can already close the send component of the socket, releasing some resources doing that, still keeping alive its receive component.
3. Loop until we receive data from the server
4. The received data is put in the buffer, its size is returned by the call. A zero size message is interpreted as a shutdown signal coming from the server. A negative value has to be intepreted as an error code.

If we run the client alone, it won't find any server socket to connect, and would return an error to the user.

The full client/server source code is on github.

Go to the full post

Hello Windows Sockets 2 - server

The Winsock API, also known as WSA, implements the socket paradigm not much differently from the BSD (Berkeley Software Distribution) concept. Here I am going to write a simple echo application that should help to familiarize with its basic functionality.

Initializing Winsock

To use Winsock 2 we need to link Ws2_32.lib (if the 32 bit version is OK for you), and we can do that specifying it in the project Properties - Linker - Input - Additional Dependencies field. Alternatively, we could use a pragma directive:
#pragma comment(lib, "Ws2_32.lib")
Then include files required are winsock2.h and, almost ever, ws2tcpip.h, the WinSock2 extension for TCP/IP protocols header.

The winsock code should be within two calls that take care of initializing and shutting down the socket support. Typically your code should look like:
WSADATA wsaData; // 1
int failCode = WSAStartup(MAKEWORD(2,2), &wsaData); // 2
if(failCode)
{
    std::cout << "WSAStartup failed: " << failCode << std::endl;
    return;
}

// 3
// ...

WSACleanup(); // 4
1. WSADATA is a structure containing a few data related to the current active Winsock API.
2. WSAStartup() try to initialize winsock for the passed version support (2.2 here) and sets the wsadata structure. If it works fine it returns 0, otherwise it returns an error code.
3. Your winsock code goes here.
4. Winsock shutdown.

Open/close socket

To create a socket we need to specify an ADDRINFO structure. Here is the how to for the server:
void server(const char* port) // 1
{
    ADDRINFO hints;
    wsa::set(hints, AI_PASSIVE, AF_INET, SOCK_STREAM, IPPROTO_TCP); // 2

    ADDRINFO* ai = wsa::get(NULL, port, &hints); // 3
    if(ai)
    {
        SOCKET sk = wsa::createServerSocket(ai); // 4
        freeaddrinfo(ai); // 5

        if(sk != INVALID_SOCKET)
        {
            coreServer(sk); // 6
            closesocket(sk);
        }
    }
}
1. The user tells on which port the socket should sit.
2. See below for this tiny utility function I have written thinking it makes the code clearer. The addrinfo is initialize to say to winsock that we want to create a server TCP v4 socket.
3. A second tiny utility function of mine, to wrap a call to getaddrinfo() that returns an available addrinfo matching our request, or NULL.
4. Another utility function, this one wrapping a call to socket(), that creates a winsock socket, and bind() it to the passed address.
5. Once we get the socket, we can get rid of the addrinfo.
6. If we have got a good socket, we can do the real job.
7. We are done with the socket, so we close it.

Let see the utility functions used above:
namespace wsa
{
    void set(ADDRINFO& that, int flags =0, int family =0, int type =0, int protocol =0, // 1
        size_t addrlen =0, char* name =0, SOCKADDR* addr =0, ADDRINFO* next =0)
    {
        that.ai_flags = flags;
        that.ai_family = family;
        that.ai_socktype = type;
        that.ai_protocol = protocol;
        that.ai_addrlen = addrlen;
        that.ai_canonname = name;
        that.ai_addr = addr;
        that.ai_next = next;
    }

    ADDRINFO* get(const char* host, const char* port, ADDRINFO* hints)
    {
        ADDRINFO* ai = NULL;
        int failCode = getaddrinfo(host, port, hints, &ai); // 2
        if(failCode)
        {
            std::cout << "getaddrinfo failed: " << failCode << std::endl;
            return NULL;
        }
        return ai;
    }

    SOCKET createSocket(ADDRINFO* ai) // 3
    {
        SOCKET sk = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if(sk == INVALID_SOCKET)
        {
            std::cout << "error creating socket: " << WSAGetLastError() << std::endl;
            return INVALID_SOCKET;
        }
        return sk;
    }

    SOCKET createServerSocket(ADDRINFO* ai)
    {
        SOCKET sk = createSocket(ai); // 4

        if(sk != INVALID_SOCKET && bind(sk, ai->ai_addr, ai->ai_addrlen) == SOCKET_ERROR) // 5
        {
            closesocket(sk); // 6

            std::cout << "Unable to bind: " << WSAGetLastError() << std::endl;
            return INVALID_SOCKET;
        }
        return sk;
    }
}
1. It ensures all the fields are set to zero/NULL or to a valid value.
2. Wraps a call to getaddrinfo().
3. First step in the creation of a server socket, it wraps a call to socket().
4. After creating a socket in (3) ...
5. ... we bind it to the passed address.
6. Some error handling.

Accepting a client

After all this setting up, we can finally do something with our socket:
void coreServer(SOCKET skServer)
{
    if(listen(skServer, SOMAXCONN) == SOCKET_ERROR) // 1
    {
        std::cout << "Listen failed with error: " << WSAGetLastError() << std::endl;
        return;
    }

    SOCKET skClient = accept(skServer, NULL, NULL); // 2
    if(skClient == INVALID_SOCKET)
    {
        std::cout << "accept failed: " << WSAGetLastError() << std::endl;
        return;
    }

    echoLoop(skClient); // 3
    closesocket(skClient);
}
1. We have a passive (server) socket, bound to a specific address. Before working on it, we have to prepare it to listen for incoming traffic.
2. Here we hangs till we receive an input from a client socket. In this simple example, there could be just one client accessing the server. Once the server accepts a client connection, it has no way to accept another one of them.
3. The real job is done here.
4. And finally we close the client socket.

Receive and send

We have the client socket counterpart. We can receive on it, and send back some data to it. Here we implements an echo procedure:
void echoLoop(SOCKET skClient)
{
    char buffer[BUFLEN];

    int rSize;
    do {
        rSize = recv(skClient, buffer, BUFLEN, 0); // 1
        if(rSize > 0)
        {
            std::cout << "Buffer received: '";
            std::for_each(buffer, buffer + rSize, [](char c){ std::cout << c; });
            std::cout << '\'' << std::endl;

            int sSize = send(skClient, buffer, rSize, 0); // 2
            if(sSize == SOCKET_ERROR)
            {
                std::cout << "send failed: " << WSAGetLastError() << std::endl;
                return;
            }
            std::cout << sSize << " bytes sent back" << std::endl;
        }
        else if(rSize == 0)
            std::cout << "closing connection ..." << std::endl;
        else
        {
            std::cout << "Error receiving: " << WSAGetLastError() << std::endl;
            return;
        }
    } while(rSize > 0);
}
1. Receive until the client shuts down the connection
2. Echo the buffer back to the sender

If we run the server alone, it will hang forever waiting for a client. We'll see it in the next post, but you can already go to github and get the full client/server source code. This example is based on the official Getting started with Winsock topic on MSDN.

Go to the full post

Pointer to function, functor, lambda function

Let's see an example where the use of pointer to function, functor, or lambda function is just a matter of personal taste. I'm assuming you are working with a modern C++ compiler and you don't have any special constrain.

We have an array of doubles, and we want to know how many values in there satisfy a given condition, that could vary. To do that, we want to call the STL function count_if(), using a predicate that we can select accordingly to the user request.

We'll do some test on this data:
double values[] = { 41.97, 12.34, 95.43, 9, 44.51 };
const int SIZE = sizeof(values) / sizeof(double);
And we want just to count how many elements are less than 42.

Pointer to function

The STL count_if() function expects as input the delimiters for the sequence it has to operate on, and a predicate accepting in input a value, matching the collection base type, and returning a boolean. We can think to implement a first rough solution like this:
bool isLessThan42(double value) { return value < 42; }
// ...
std::cout << std::count_if(values, values + SIZE, isLessThan42) << std::endl;
The most painful limitation of this implementation is that the comparator is too inflexible. But we can easily relax it using a binder.

Pointer to function with STL binder

We had a problem in the first implementation, count_if() asks for a one-parameter function, but we want to use a two-parameter one. No problem, we can adapt our function:
bool isLessThan(double value, double limit)
{
    return value < limit;
}
// ...
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, std::bind(isLessThan, std::placeholders::_1, LIMIT)) << std::endl;
The STL bind adapter connects our isLessThan() function to count_if, using as first parameter (first placeholder) the one provided by count_if and adding LIMIT in the second place.

Functor

The second pointer to function implementation works fine, but why passing each time the limit for comparison? Wouldn't be enough to pass it once, and use it for all the iteration? We can do it using a functor:
class LessThan : public std::unary_function<double, bool> // 1
{
private:
    argument_type limit_;
public:
    LessThan(argument_type limit) : limit_(limit) {}
    result_type operator() (argument_type value) { return value < limit_; }
};
// ...
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, LessThan(LIMIT)) << std::endl; // 2
1. unary_function is an STL utility class provided to help keeping consistence for the creating (unary) functors. It is not mandatory to use it, but it is a good idea, even just for documentation. Reading the class header is enough to see that LessThan is a functor that gets in input a double (argument_type) and returns a boolean (result_type).
2. A LessThan object is created passing LIMIT to its ctor, and then passed to count_if(). Each iteration of count_if() calls the LessThan operator() function.

C++11 lambda

If your compiler implements the 2011 C++ standard lambda function, you could simplify the code in this way:
const double LIMIT = 42;
std::cout << std::count_if(values, values + SIZE, [LIMIT](double value){ return value < LIMIT; }) << std::endl;
The access to the LIMIT constant is granted to the lambda function body putting it in the capture section.

Boost lambda

If your compiler does not implement the standard C++ lambda, you could still use it through the Boost Libraries support:
std::cout << "Boost lambda: " << std::count_if(values, values + SIZE, boost::lambda::_1 < LIMIT ) << std::endl;
STL Functor

In such a simple case, is probably not worthy to create a custom functor. We could use a standard one, and adapt it to do it what we want. In this case, we have a ready made less<> functor (better categorized as an adaptable binary predicate) that could do:
std::cout << std::count_if(values, values + SIZE, std::bind2nd(std::less<double>(), LIMIT)) << std::endl;

Go to the full post

Pointer to function

A C++ programmer could use almost indifferently a pointer to function, a functor, or a lambda function to implement the same design request. But pointer to functions are so contrived, difficult to read and maintain, that are commonly used only when one can't avoid to do it. Namely when developing in pure C.

I guess you know what a pointer to function is, and how it works, but let's have a tiny example to refresh the matter, before comparing them with functor and lambda functions.

Say that we want a free function to behave polimorfically, it should run an adder or a multiplier, accordingly to a specific situation, on a couple of doubles, and then output the result. We have some obscure constrain (legacy code that uses our free function, probably) that forces us to think in terms of pointer to functions.

These are the functions that have to be called:
double sum(double a, double b) { return a + b; }
double multiply(double a, double b) { return a * b; }
We define what is the type of a pointer to these functions:
typedef double (*PDF_DD) (double, double);
As one would expect, we defined PDF_DD as a pointer to a function that gets in input two doubles and gives back another double.

This is a possible implementation for our function:
void dispatcher(double x, double y, PDF_DD f) // 1
{
    if(f == NULL) // 2
    {
        std::cout << "Bad pointer to function!" << std::endl;
        return;
    }
    double result = (*f)(x, y); // 3
//  double result = f(x, y); // 4
    std::cout << "Result is: " << result << std::endl;
}
1. The third input parameter is a pointer to function, as defined above.
2. Being a pointer (to function or data doesn't matter), f could be NULL. So we should check it, to avoid the risk of a NULL dereferencing crash.
3. The pointer to function is dereferenced, and the actual function is called. It is not mandatory explicitly dereferencing a pointer to function, the compiler is smart enough to do it implicitly.
4. Same as (2), but relying on the compiler to convert the pointer to function, before calling the referenced function.

The function should be called like this:
dispatcher(3, 2, sum);
dispatcher(3, 2, multiply);
dispatcher(3, 2, NULL); // 1
1. PDF_DD is a pointer, so this code is legal. Hence we should remember to check this parameter value before dereferencing it.

Go to the full post

Frobenius Norm

The Frobenius norm is the same concept of the Euclidean norm, but applied to matrices.
It is easy to write a pure C function calculating the Frobenius norm:
double frobeniusNorm(double* matrix, int size1, int size2)
{
    double result = 0.0;
    for(int i = 0; i < size1; ++i)
    {
        for(int j = 0; j < size2; ++j)
        {
            double value = *(matrix + (i*size2) + j);
            result += value * value;
        }
    }
    return sqrt(result);
}
This piece of code would look a bit counterintuitive to someone not used to how multidimensional arrays are managed in C, but once you get the point that they are flatted and seen just like a one dimensional array, the things start looking clearer.

We could have written the same function in a more friendly way, using the array notation, ending up in something like this:
double frobeniusNorm(double matrix[][MAT_SIZE_2], int size1)
{
    double result = 0.0;
    for(int i = 0; i < size1; ++i)
    {
        for(int j = 0; j < MAT_SIZE_2; ++j)
        {
            double value = matrix[i][j];
            result += value * value;
        }
    }
    return sqrt(result);
}
But the compiler has to know one matrix dimension, so that it could converts the matrix notation in the actual address of the memory cell containing its value, so we are forced to pass it (here is used an int constant, MAT_SIZE_2, whose definition is showed below).

In C, a 3x3 matrix would be typically defined in one of this two ways:
double dm[] = { // 1
    1, 2, 3,
    4, 5, 6,
    7, 8, 9
};
const int MAT_SIZE_2 = 3;
double dm2[][MAT_SIZE_2] = { // 2
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
};
1. In this way we keep the representation close to what is the actual way the data are stored in memory, but the code could be a bit confusing: is this a matrix or a vector?
2. Here we are stating clearly that we are dealing with a two dimensional vector, but this is what we have to do to call our generic Frobenius norm calculator:
frobeniusNorm(&dm2[0][0], 3, 3)
We can't pass a bidimesional array when a bare pointer is expected, so we have to explicitly say to the compiler that it should get the address to the first element in the array.

We could save all this trouble using the Boost uBLAS matrix class:
// ...
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
namespace ublas = boost::numeric::ublas;
// ...
ublas::matrix<double> matrix(3, 3);
int k = 0;
for(unsigned int i = 0; i < matrix.size1(); ++i)
    for(unsigned int j = 0; j < matrix.size2(); ++j)
        matrix(i, j) = ++k;
// ...
       
double frobeniusNorm(const ublas::matrix<double>& matrix)
{
    double result = 0.0;
    for(unsigned int i = 0; i < matrix.size1(); ++i)
    {
        for(unsigned int j = 0; j < matrix.size2(); ++j)
        {
            double value = matrix(i, j);
            result += value * value;
        }
    }
    return sqrt(result);
}
Naturally, the above implementation of a Frobenius norm calculator is useful only to show how to work with an uBLAS matrix. In real code, it does not make much sense using uBLAS data structure and not its functions:
std::cout << ublas::norm_frobenius(matrix) << std::endl;

Go to the full post