Showing posts with label spirit. Show all posts

Generator Semantic Actions

Feeding Boost Spirit Karma through an attribute is the natural approach, but we can use semantic actions too.

Karma calls a semantic action before the generator is invoked, giving the action a way of providing data to the generator. So, for a double_ generator, the corresponding action should have the form
void F(double& n);
As we have seen for parser semantic actions, generator semantic actions are also designed to take two more parameters (a reference to a Context object and a reference to a boolean). When the action is implemented as a plain function we can simply leave them out, since Karma is smart enough to make that work. For a functor or a standard lambda, though, we have to mention them explicitly as unused_type parameters.

To use an action that generates an integer, like this one:
void readInt(int& i) { i = 42; }
We call karma::generate() in this way:
using boost::spirit::karma::generate;
using boost::spirit::karma::int_;

std::string generated;
std::back_insert_iterator<std::string> sink(generated);
generate(sink, '{' << int_[&readInt] << '}');
std::cout << "Simple function: " << generated << std::endl;

As a result, the generated string should contain 42 between curly braces.

Have a look at the original Boost Spirit Karma documentation for more details. Here is a C++ source file from there with more examples on generator semantic actions.


Generating text with Spirit Karma

Spirit Karma is the counterpart of Spirit Qi. Qi parses a string according to a specified grammar, extracting data into an internal data structure, while Karma accepts data in internal format and generates a string from it according to the specified grammar.

If you have no Karma at hand, you would probably use sprintf() or std::stringstream to perform the same task. But Karma gives you more flexibility while keeping the code easy to read and maintain, and the resulting code is usually faster. Even code size should not be an issue. And, if you already use Spirit Qi, Spirit Karma comes quite naturally. On the negative side, Spirit is a complex, heavily templated framework, so compiling it is not cheap, and this could slow down the project build a bit.

From double to string.

Using Karma to put a double in a string looks like a bit of overkill. But even in standard C++ the resulting code is not the simplest piece of code on Planet Earth. Typically I come up with something like this:
std::string d2str(double d)
{
  std::ostringstream oss;
  oss << d;
  return oss.str();
}
A standard stringstream object acts as mediator between the internal type (double in this case) and a string.

Here is a way to get the same result using Karma:
bool kGenerate1(double d, std::string& s)
{
  using boost::spirit::karma::double_;
  using boost::spirit::ascii::space;
  using boost::spirit::karma::generate;

  std::back_insert_iterator<std::string> sink(s);
  return generate(sink, double_, d);
}
Basically, the whole job boils down to calling karma::generate(), passing an insert iterator to the collection where we want to push the result (typically a string), the grammar to use (here just double_), and the value from which the result is generated.

Lots of namespaces, but the code itself is even clearer than the version that uses standard functionality. And it should be faster too.

From two double to string.

If there are two doubles in input, we have to face a couple of issues: how to pass them to Karma, and how to insert a delimiter between them. The first issue can be solved by explicitly passing both values to the generator [see the comments to this post for details], or by using an STL container that holds both of them. The second is solved by calling karma::generate_delimited(), which takes an extra argument, the delimiter generator (the Karma counterpart of the Qi skipper), whose output is emitted after each generated element:
bool kGenerate2(const double d1, const double d2, std::string& s) // 1
{
  using boost::spirit::karma::double_;
  using boost::spirit::ascii::space;
  using boost::spirit::karma::generate_delimited;

  std::back_insert_iterator<std::string> sink(s);
  return generate_delimited(sink, double_ << ',' << double_, space, d1, d2); // 2
}

bool kGenerate2(const std::deque<double>& dd, std::string& s) // 3
{
  using boost::spirit::karma::double_;
  using boost::spirit::ascii::space;
  using boost::spirit::karma::generate_delimited;

  std::back_insert_iterator<std::string> sink(s);
  return generate_delimited(sink, double_ << ',' << double_, space, dd);
}
1. First overload for our generating function: it explicitly requires two double values.
2. The grammar can be read as: a double followed by a comma, followed by a second double. Notice the use of the "put to" operator (<<), which differs from Qi, where the "get from" operator (>>) is used. The space is used as the delimiter.
3. Second overload, here I used std::deque as container for the values to be used.

One or more doubles to string.

Let's rewrite the previous case in a more generic way. Now we expect one or more doubles in input, and we leave the choice of container to the user. The grammar now specifies that we expect a double followed by zero or more blocks, each made of a comma followed by a double:
template <typename Container>
bool kGenerate(const Container& c, std::string& s)
{
  using boost::spirit::karma::double_;
  using boost::spirit::ascii::space;
  using boost::spirit::karma::generate_delimited;

  std::back_insert_iterator<std::string> sink(s);
  return generate_delimited(sink, double_ << *(',' << double_), space, c);
}

Output as double or as int

As Nikhil pointed out, my d2str() function above behaves differently from kGenerate1(). The standard stringstream class does not output the fractional part of a double if the passed value is actually integral, while Karma relies on the type the user specifies to decide which format to use. I told it to always use the double_ format, and so I always get a double in output.
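
To make the stream side of this difference concrete, here is a minimal sketch that uses only the standard library. The function name streamDouble is an assumption of mine; it simply repeats the body of d2str(). With default formatting the stream prints at most six significant digits and drops a zero fractional part.

```cpp
#include <sstream>
#include <string>

// Same shape as d2str(): stream the double using the default formatting.
std::string streamDouble(double d)
{
    std::ostringstream oss;
    oss << d;          // default: up to 6 significant digits, no trailing zeros
    return oss.str();
}
```

So streamDouble(42.0) gives "42" and streamDouble(1.5) gives "1.5", while a large value such as 1234567.0 switches to scientific notation, "1.23457e+06".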

If we want our code to mimic the default stringstream behaviour, we could check explicitly on our own the fractional part of the double value. Something like this:
bool kGenerate1b(double d, std::string& s)
{
  using boost::spirit::karma::double_;
  using boost::spirit::karma::int_;
  using boost::spirit::ascii::space;
  using boost::spirit::karma::generate;

  std::back_insert_iterator<std::string> sink(s);

  double asInt;
  if(std::modf(d, &asInt) == 0.0) // 1
    return generate(sink, int_, d); // 2
  else
    return generate(sink, double_, d);
}

1. The standard C math library provides the useful function modf(), which splits a double value into its integral and fractional parts. It is just what we need. Only if the fractional part, as returned by modf(), is zero do we dump the input value formatted as an integer; otherwise we format it as a double.
2. When the double in input is actually an integer, I use the karma::int_ format instead.
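
The modf() check can be tried on its own, without Spirit. The sketch below is standard library only, and both helper names (isIntegral, fitsInt) are my own assumptions. The second helper covers a case kGenerate1b() does not consider: a double such as 1e20 has a zero fractional part but does not fit in an int.

```cpp
#include <cmath>
#include <limits>

// True when d has no fractional part, e.g. 42.0 but not 42.5.
bool isIntegral(double d)
{
    double integralPart = 0.0;
    return std::modf(d, &integralPart) == 0.0;
}

// True when d is integral and also representable as an int.
bool fitsInt(double d)
{
    return isIntegral(d)
        && d >= std::numeric_limits<int>::min()
        && d <= std::numeric_limits<int>::max();
}
```

Using fitsInt() instead of the bare modf() test would be one way to fall back to double_ for integral values that are too large for int_.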

I based this post on a C++ source file provided by the original Boost Spirit Karma documentation.


Parser attribute

Each Boost Spirit parser has an attribute. The attribute of a simple parser is easy to spot: for the built-in double_ it is a variable of type double.

The attribute of a list parser is a std::vector whose element type is determined by the attribute type of its elements. So std::vector<double> is the attribute of
double_ % ','
The interesting consequence is that we can use attributes instead of semantic actions to work on the sequence elements, as we are going to see below.

The problem

We have a CSV list of floating point numbers in a string, we want to generate from it a std::vector<double> containing its values.

Solution by parser semantic action

Combining a semantic action (implemented as a standard lambda function) with the Boost Spirit shortcut syntax for lists, we should easily get code like this:
bool parse2vector(const std::string& input, std::vector<double>& vd)
{
  using boost::spirit::qi::phrase_parse;
  using boost::spirit::qi::unused_type;
  using boost::spirit::qi::double_;
  using boost::spirit::qi::ascii::space;

  auto pushBack = [&vd](double d, unused_type, unused_type){ vd.push_back(d); };

  std::string::const_iterator beg = input.begin();
  bool ret = phrase_parse(beg, input.end(), double_[ pushBack ] % ',', space);
  return beg != input.end() ? false : ret;
}

Piece of cake. For each double_ element the parser finds in the input string, the semantic action pushBack is called. The semantic action is a lambda function that has reference access to the vector we want to fill, and receives as input parameter the double value as parsed by the parser (plus two other parameters not used here); it then simply pushes the value back into the vector.

Could there be anything simpler than that? Actually, yes, there is.

Solution by parser attribute

We could rewrite the same function passing phrase_parse() an extra parameter, whose type should match the attribute of the parser. As we said above, a list of double_ has a std::vector<double> as attribute:
bool parse2vector(const std::string& input, std::vector<double>& vd)
{
  using boost::spirit::qi::phrase_parse;
  using boost::spirit::qi::double_;
  using boost::spirit::qi::ascii::space;

  std::string::const_iterator beg = input.begin();
  bool ret = phrase_parse(beg, input.end(), double_ % ',', space, vd);
  return beg != input.end() ? false : ret;
}

No action is specified anymore: we let Spirit do the dirty job for us.

I based this post on a C++ source file provided by the original Boost Spirit Qi documentation.


Spirit syntax for list

In the previous post we saw a function, csvSum(), that adds up all the elements of a CSV (Comma Separated Values) list and returns the result in a double passed as a reference parameter. A user could be surprised by the fact that the value the double holds on input is not considered at all. Besides, if we accept that the caller is free to initialize the returned value as they like, the resulting code is more linear, since we can manage all the elements in the list in the same way.

That's how the call to phrase_parse() was written in our function:
bool r = phrase_parse(beg, input.end(),
  double_[ref(sum) = _1] >> *(',' >> double_[ref(sum) += _1]), space);

The first element in the list is assigned to sum, overwriting whatever the caller has put in it. Instead, we now want to use the same parser semantic action for all the elements: ref(sum) += _1.

Having a CSV list (or, more generically speaking, a list of values separated by something) that requires each of its elements to be managed in the same way is such a common task that Spirit provides an elegant syntax just for this case.

So, a CSV list of floating point numbers could be expressed with the full notation that we already know:
double_ >> *(',' >> double_)
Or with a more compact syntax that uses the percent character to express an indefinite repetition with a separator:
double_ % ','
Usually the separator is just a character, and normally a comma. But the same notation could be used also if the separator is a string. For instance, we could have values separated by an uppercase triple X sequence:
double_ % "XXX"
That said, this is the change we are about to make in our code:
bool r = phrase_parse(beg, input.end(), double_[ref(sum) += _1] % ',', space);

And we are about to call the adding function like this:
std::string s("1, 3, 4, 5");
double sum = 1000; // 1.

if(csvSum(s, sum) == false)
  std::cout << "Bad CSV" << std::endl;

std::cout << "Sum is " << sum << std::endl;

1. Now we have to remember to initialize the starting value correctly, otherwise we would get an unexpected result back.


From CSV to sum

A common, albeit dated, way of storing information in a file is the CSV (Comma Separated Values) format. Say that you have lists of numbers stored in this way, and you have to provide a function that converts a string containing an unknown number of elements into their sum.

The first idea that jumped to my mind involved std::accumulate(). It works on an iterator range, typically over a standard container, so the input should be prepared by converting the original string (of characters) to a vector (of doubles).
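
For comparison, this is roughly what that first idea could look like using only the standard library. It is a sketch under my own assumptions: the helper names csvToVector and csvSumAccumulate are hypothetical, and the splitting relies on std::getline() and std::stod(), which is just one of several ways to do it.

```cpp
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split on commas and convert each field with std::stod().
// std::stod() throws std::invalid_argument if a field does not start with a number.
std::vector<double> csvToVector(const std::string& input)
{
    std::vector<double> values;
    std::istringstream iss(input);
    std::string field;
    while (std::getline(iss, field, ','))
        values.push_back(std::stod(field)); // leading blanks are skipped by stod
    return values;
}

// Sum the converted values, starting from 0.0.
double csvSumAccumulate(const std::string& input)
{
    const std::vector<double> values = csvToVector(input);
    return std::accumulate(values.begin(), values.end(), 0.0);
}
```

Called on "1, 3, 4, 5" it returns 13. Notice how weak the validation is: std::stod() happily accepts a field like "3abc" and ignores the trailing garbage.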

Not a bad try, but just by spelling out the idea I saw that the trickiest part of the job is not summing the values, but parsing the input string to extract the actual values to work with. And if parsing it has to be, better to use Spirit and Phoenix.

From a parser perspective, the focus is on the grammar that determines whether we should accept the input or not. In our case, we can think of the input as a sequence of numbers containing at least one element, with a comma as mandatory separator and white space as skip elements. The first element in the sequence triggers a semantic action that initializes the result; every other element triggers a semantic action that increases the sum.

So, the grammar should be something like:
double_[setSum] >> *(',' >> double_[incrSum])
Once we have worked out the grammar, a large part of the job is done.

Here is a possible implementation of a function that checks a string for values in the expected format, and returns a success flag and the sum of the values:
// ...
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>

bool csvSum(const std::string& input, double& sum)
{
  using boost::spirit::qi::phrase_parse;
  using boost::spirit::qi::double_;
  using boost::phoenix::ref;
  using boost::spirit::qi::_1;
  using boost::spirit::ascii::space;

  std::string::const_iterator beg = input.begin();
  bool r = phrase_parse(beg, input.end(),
    double_[ref(sum) = _1] >> *(',' >> double_[ref(sum) += _1]), space);

  if(beg != input.end())
    return false;
  return r;
}

If it is not clear to you what is going on in the code, I suggest you check the previous post on Spirit and Phoenix.

The first semantic action gets sum by reference, so that it can change it, and assigns to it the double value as parsed by Spirit. The second semantic action is called for each subsequent element found by Spirit in the sequence; the parsed double value is passed to it and added to sum.

I based this post on a C++ source file provided by the original Boost Spirit Qi documentation.


Semantic actions with Phoenix

In the Boost Spirit Qi documentation, Phoenix is described as a library defining a sort of "lambda on steroids", and its usage is recommended for implementing semantic actions that have a certain degree of complexity.

Here I'll show the difference between using a free function, a standard lambda function, and Phoenix as a semantic action.

What we want to write is a parser that generates a std::complex, or gives an error in case of malformed input. Our expected input is a string containing just a number, "42", or a complex representation with real and imaginary parts enclosed in round brackets, "(43, 12)". As a bonus, we also accept a bracketed number on its own, for the real part only: "(33)".

Our parser should look something like this:
double_[funR] | '(' >> double_[funR] >> -(',' >> double_[funI]) >> ')';

That reads: a double, to which the funR semantic action is applied, or an open round bracket followed by a double, to which the funR semantic action is applied, followed by zero or one block made of a comma followed by a double, to which the funI semantic action is applied; the (possibly missing) block is followed by a close round bracket.

Firstly, we are going to implement the semantic action with a free function. This one:
void setValue(double& lhs, const double rhs) { lhs = rhs; }

And we are going to use it in this way:

template <typename Iterator>
bool parseCpxFun(Iterator first, Iterator last, std::complex<double>& c)
{
  using boost::spirit::qi::double_;
  using boost::spirit::ascii::space;
  using boost::spirit::qi::phrase_parse;

  double cReal = 0.0;
  double cImg = 0.0; // 1.

  auto funR = std::bind(&setValue, std::ref(cReal), std::placeholders::_1); // 2.
  auto funI = std::bind(&setValue, std::ref(cImg), std::placeholders::_1);

  auto expr = double_[funR] | '(' >> double_[funR] >> -(',' >> double_[funI]) >> ')'; // 3.

  bool r = phrase_parse(first, last, expr, space); // 4.

  if (!r || first != last) // 5.
    return false;
  c = std::complex<double>(cReal, cImg); // 6.
  return r;
}

1. The parser sets the complex components defined here through the semantic actions. The real part always has to be specified, so initializing this variable is not so important, but the imaginary part could be missing, so it is crucial to have it defaulted to zero.
2. I used a few variables that are not strictly necessary to make the code more readable. funR and funI bind the semantic action, the function setValue(), to the variables that store the temporary real and imaginary parts. Notice the usage of std::ref() to let std::bind() know that the variable should be passed by reference. As third argument we have a placeholder for the parameter passed to the resulting function. Here I use std::bind() and std::ref(), but if your compiler does not support them yet you can substitute boost::bind() and boost::ref().
3. This is the parser described above, implemented by calling the binds to the free function that works as semantic action. Be aware that storing a Spirit expression in an auto variable can leave it holding references to temporaries that are already gone; outside of a short example it is safer to pass the expression directly to phrase_parse() or to copy it with boost::proto::deep_copy().
4. And here is how we call phrase_parse(), passing a couple of delimiting iterators, the expression to be used as parser, and what to use as skipper.
5. We want our parsing to be quite strict: it fails if the parser reports a failure or if some input is left unconsumed.
6. If the parsing succeeded, we create a complex using the components generated by the semantic actions.
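
The role of std::ref() in point 2 can be checked in isolation. The following sketch uses only the standard library; setValue() is repeated from above so that the snippet stands alone, while the two wrapper functions are hypothetical names of mine. It also shows that a call wrapper produced by std::bind() discards any argument not matched by a placeholder, which is consistent with these binds working as semantic actions without any unused_type in sight.

```cpp
#include <functional>

void setValue(double& lhs, const double rhs) { lhs = rhs; }

// With std::ref the bound call writes into the original variable.
double storeThroughBind(double value)
{
    double target = 0.0;
    auto setter = std::bind(&setValue, std::ref(target), std::placeholders::_1);
    // Arguments beyond the placeholders are accepted and discarded.
    setter(value, "ignored", true);
    return target;
}

// Without std::ref, bind stores a copy, so the original is left untouched.
double storeThroughBindCopy(double value)
{
    double target = 0.0;
    auto setter = std::bind(&setValue, target, std::placeholders::_1);
    setter(value);      // modifies the copy held inside the binder
    return target;      // still 0.0
}
```

storeThroughBind(3.5) returns 3.5, while storeThroughBindCopy(3.5) returns 0.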

We can simplify the code a bit by using lambda functions to implement the semantic actions.

We rearrange the code written for the free function, getting rid of the free function itself, and rewriting the parser and the functions it uses in this way:

// ...
using boost::spirit::qi::unused_type; // 1.

double cReal = 0.0;
double cImg = 0.0;

auto setCR = [&cReal](const double value, unused_type, unused_type) { cReal = value; }; // 2.
auto setCI = [&cImg](const double value, unused_type, unused_type) { cImg = value; };

auto expr = double_[setCR] | '(' >> double_[setCR] >> -(',' >> double_[setCI]) >> ')';

1. It's a real nuisance. As we have already seen, Spirit expects a lambda function, or a functor, used as parser semantic action to accept three parameters. When we just need the first one, we have to use this unused_type for the other two.
2. The lambda works on a variable in scope, cReal or cImg, accessed by reference; it accepts a value passed by the caller (and two other unused parameters), and then does the required assignment.

Using Phoenix the code gets even simpler, given that Phoenix, much like boost::lambda, has a looser syntax than standard lambdas:

using boost::spirit::qi::_1; // 1.
using boost::phoenix::ref; // 2.

double cReal = 0.0;
double cImg = 0.0;

auto setCR = (ref(cReal) = _1); // 3.
auto setCI = (ref(cImg) = _1);

auto expr = double_[setCR] | '(' >> double_[setCR] >> -(',' >> double_[setCI]) >> ')';

1. The Spirit Qi placeholders are used, since that version is designed explicitly for this usage.
2. Ditto for the Phoenix version of ref.
3. The resulting code is so clean that there is not much sense in putting it in a temporary. The only reason I left it here is to better show the difference from the previous implementations.

I based this post on a C++ source file provided by the original Boost Spirit Qi documentation.


Parser Semantic Actions

Usually you don't parse data just to see if it matches a given pattern: you want to take some action that involves the extracted information. This is done in Boost Spirit Qi by the so-called parser semantic actions.

An action is nothing more than a C++ function, functor, or lambda function that is called by the parser any time it finds a matching element. The action signature should match the prototype expected for the type of the parsed element.

For instance, if a double_ element is parsed, the corresponding function should be something like
void F(double n);
Actually, there are two more parameters. With a plain function we can safely ignore them, and even leave them out of the declaration, since Spirit Qi takes care of picking the correct overload. But, as we'll see in a moment, they have to be declared as unused when we take the functor or lambda function approach.

As an example, say that we want to parse a string that should contain an integer between curly brackets, something like "{42}" and, when the integer is detected by the parser, we want a function to be called.

A generic action manager for our problem could be written in this way:
template<typename Fun>
inline void bsGenericAction(const std::string& input, const Fun& fun) // 1.
{
  using boost::spirit::qi::int_;

  std::string::const_iterator beg = input.begin(); // parse() takes its first iterator by reference
  if(!boost::spirit::qi::parse(beg, input.end(), '{' >> int_[fun] >> '}')) // 2.
    std::cout << "Bad parsing for " << input << std::endl; // 3.
}

1. It expects in input the string to be parsed and the action that has to be called.
2. Here we are parsing the input using the Spirit Qi parse() function, passing to it the delimiting iterators to the sequence to be checked, and the parser. The function to be called is specified in square brackets.
3. If the parsing fails, we output a log message.

If we want to use as action this free function:
void print(int i) { std::cout << i << std::endl; }
We can use this utility function:
void bsActFFun(const std::string& input)
{
  bsGenericAction(input, &print);
}
That should be called like this:
std::string test("{99}");
bsActFFun(test);

If we want to call a member function, like this Writer::print()
class Writer
{
public:
  void print(int i) const { std::cout << i << std::endl; }
};
We could use:
void bsActMFun(const std::string& input)
{
  Writer w;
  auto fun = std::bind(&Writer::print, &w, std::placeholders::_1); // 1.
  bsGenericAction(input, fun);
}

1. Maybe it is worth spending a few words on the std::bind() usage. With it we tell the compiler to use the Writer::print() function on the w object, passing to it the first value that will be passed to the wrapper. If your compiler does not yet support std::bind(), a C++11 feature, you could rely on the Boost implementation.
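
Here is a small standard-library sketch of the same binding pattern, detached from Spirit. Recorder and recordThroughBind are hypothetical names of mine: Recorder plays the part of Writer, but stores the values it receives instead of printing them, so that the effect of the bound calls can be inspected.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for Writer: records values instead of printing them.
class Recorder
{
public:
    void record(int i) { seen.push_back(i); }
    std::vector<int> seen;
};

std::vector<int> recordThroughBind()
{
    Recorder r;
    // &r binds the object by address, so calls act on r itself, not on a copy.
    auto fun = std::bind(&Recorder::record, &r, std::placeholders::_1);
    fun(99);
    fun(7);
    return r.seen;
}
```

After the two calls the recorder holds 99 and 7, in that order.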

Let's consider a functor:
using boost::spirit::qi::unused_type;
class PrintAction
{
public:
  void operator()(int i, unused_type, unused_type) const
  {
    std::cout << i << std::endl;
  }
};

As we said above, when using a functor we should specify three parameters. Currently we are interested only in the first one, so we tell Spirit that we don't care about the other two in the peculiar way shown above.
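
To get a feeling for why the two extra parameters are needed, here is a toy model that does not use Spirit at all. Everything in it is an assumption made for illustration: Ignored mimics the idea of a type that can be built from anything, FakeContext stands for the parser context, and callAction() is a hypothetical caller that always passes three arguments. It is not how Spirit is actually implemented.

```cpp
// Hypothetical stand-in for unused_type: constructible from any value.
struct Ignored
{
    Ignored() {}
    template <typename T> Ignored(const T&) {}
};

struct FakeContext {}; // placeholder for the parser context

// Hypothetical caller that always passes three arguments to the action.
template <typename Action>
void callAction(const Action& action, int value)
{
    FakeContext context;
    bool pass = true;
    action(value, context, pass);
}

// Functor interested only in the first argument.
struct StoreAction
{
    explicit StoreAction(int& out) : out_(out) {}
    void operator()(int i, Ignored, Ignored) const { out_ = i; }
    int& out_;
};

int runStoreAction(int value)
{
    int stored = 0;
    callAction(StoreAction(stored), value);
    return stored;
}
```

A functor declaring only operator()(int) would not compile here, because the caller supplies three arguments; the two Ignored parameters swallow the ones we do not care about.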

We use it by calling this function:
void bsActFOb(const std::string& input)
{
  bsGenericAction(input, PrintAction());
}

And finally a lambda function:
void bsActLambda(const std::string& input)
{
  using boost::spirit::qi::unused_type;
  bsGenericAction(input, [](int i, unused_type, unused_type){ std::cout << i << std::endl; });
}

Like a functor, it requires all three parameters to be specified, even if we don't plan to use them. Your compiler could lack this feature too, since it is another C++11 goodie, and in that case you could use the Boost implementation instead. Actually, there Boost could even look smarter: since it doesn't explicitly ask for the parameters passed to the lambda, we don't have to rely on the unused_type declaration.

The original Boost Spirit Qi documentation provides a supporting C++ source file that I used as a base for this post.


Parsing with Spirit Qi

A cool thing about Spirit is that it has been designed with scalability in mind. That means we can have a limited knowledge of Spirit and still be able to work with it, as long as we don't need any fancy feature.

Spirit Qi provides a good number of built-in parsers that could be combined to create our own specific parser. The Spirit tutorial shows us how to start from a built-in parser (double_) to end up with a more complex parser that accepts a list of comma separated floating point numbers.

All the examples we are about to see in this post share the same structure: we call the Spirit Qi function phrase_parse() on a string containing our input, specifying the parser that has to be applied and the "skip-parser", which matches the elements that could be in the input sequence and should not interfere with the evaluation (typically, and in this case too, anything that is considered a space: blank, return, ...). What changes is the parser to be used, so I wrote a function that implements the generic behaviour and requires in input, besides the string containing the text to be evaluated, an expression that represents the parser to use:
#include <boost/spirit/include/qi.hpp>
#include <string>

// ...

template<typename Expr>
inline bool genericParse(const std::string& input, const Expr& expr)
{
  std::string::const_iterator first = input.begin();

  bool r = boost::spirit::qi::phrase_parse( // 1.
    first, // 2.
    input.end(),
    expr, // 3.
    boost::spirit::ascii::space // 4.
  );
  if(first != input.end()) // 5.
    return false;
  return r;
}

1. The phrase_parse() returns true if the input sequence is parsed correctly.
2. First two arguments: iterators delimiting the sequence.
3. Third argument: the parser.
4. Fourth argument: the skip-parser element
5. Here we implement a stricter parsing: we check that there is no trailing leftover.

Parsing a number

Now it is quite easy to implement a function that parses a floating point number:
bool bsParseDouble(const std::string& input)
{
  return genericParse(input, boost::spirit::qi::double_);
}

boost::spirit::qi::double_ is the built-in parser that is used to identify a number that could be stored in a double variable.

I find that test cases are very useful not only to verify the correctness of the code we produce, but also to understand better what existing code actually does. So I have written a bunch of test cases to verify how the above written code behaves. Here is just the first one I have written:
TEST(BSParseDouble, Double)
{
  std::string input("1.21");
  EXPECT_TRUE(bsParseDouble(input));
}

Parsing two numbers

For parsing two floating point numbers we have to create a custom parser:
bool bsParseTwoDouble(const std::string& input)
{
  auto expr = boost::spirit::qi::double_ >> boost::spirit::qi::double_;
  return genericParse(input, expr);
}

Spirit overloads the right shift operator (>>) to convey the meaning of "followed by". So we can read the custom parser we created as: a double followed by another double. And here is one of the tests I have written for this function:
TEST(BSParseTwoDouble, Double)
{
  std::string input("1.21");
  EXPECT_FALSE(bsParseTwoDouble(input));
}

Parsing zero or more numbers

A postfix star (known as Kleene Star) is the usual way a zero or more repetition of an expression is represented in regular expressions. The problem is that there is no postfix star operator in C++, so that was not a possible choice for the Spirit designers. That's the reason why a prefix star is used instead:
bool bsParseKSDouble(const std::string& input)
{
  return genericParse(input, *boost::spirit::qi::double_);
}
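
The star in front of double_ is nothing exotic: it is the ordinary unary operator* (the one used for dereferencing), which C++ lets us overload for our own types and which is always written before its operand. The toy below shows the mechanism; Pattern and describeStar are hypothetical names of mine, and Pattern is not a Spirit parser, it only records a description.

```cpp
#include <string>

// Hypothetical toy type, not a Spirit parser: it only records a description.
struct Pattern
{
    std::string text;
};

// Unary operator* is written before its operand, so it reads as a prefix star.
Pattern operator*(const Pattern& p)
{
    return Pattern{"zero or more of (" + p.text + ")"};
}

std::string describeStar()
{
    Pattern number{"double"};
    return (*number).text;
}
```

describeStar() returns "zero or more of (double)".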

A test I wrote for this function ensures that a sequence of three doubles is accepted; another one checks that a couple of ints surrounded by a few blanks is accepted too:
TEST(BSParseKSDouble, TrebleDouble)
{
  std::string input("1.21 7.44 8.03");
  EXPECT_TRUE(bsParseKSDouble(input));
}

TEST(BSParseKSDouble, BlankIntIntBlank)
{
  std::string input(" 42 33 ");
  EXPECT_TRUE(bsParseKSDouble(input));
}

Parsing a comma-delimited list of numbers

Finally, the big fish of this post. We expect at least one number, and a comma should be used as delimiter:
bool bsParseCSDList(const std::string& input)
{
  auto expr = boost::spirit::qi::double_ >>
    *(boost::spirit::qi::char_(',') >> boost::spirit::qi::double_);
  return genericParse(input, expr);
}

We can read the parser in this way: a double followed by zero or more repetitions of the expression made of a comma followed by a double.
Actually, we didn't have to explicitly convert the comma character to a parser, since the operator >>, having on its right an operand of parser type, is smart enough to perform the conversion on its own. So, we could have written:
auto expr = boost::spirit::qi::double_ >> *(',' >> boost::spirit::qi::double_);
But it has been a good way to show the built-in char_ parser.

Since this parser is a bit more interesting, I'd suggest you write a lot of test cases, to check whether your expectations match the actual parsing behaviour. Here are a few of them:
TEST(BSParseCSDList, Empty)
{
  std::string input;
  EXPECT_FALSE(bsParseCSDList(input));
}

TEST(BSParseCSDList, Double)
{
  std::string input("1.21");
  EXPECT_TRUE(bsParseCSDList(input));
}

TEST(BSParseCSDList, DoubleDouble)
{
  std::string input("1.21,7.44");
  EXPECT_TRUE(bsParseCSDList(input));
}

TEST(BSParseCSDList, DoubleDouble2)
{
  std::string input("1.21, 7.44");
  EXPECT_TRUE(bsParseCSDList(input));
}

TEST(BSParseCSDList, DoubleDoubleBad)
{
  std::string input("1.21 7.44");
  EXPECT_FALSE(bsParseCSDList(input));
}


Atoi with Spirit

The first example I found in the excellent Boost Spirit documentation is about using Qi (the Spirit parser) to implement something like the standard C atoi() function, and Karma (the Spirit generator) to do the same thing the other way round, as the non-standard itoa() function does.

I tend to be a TDD guy, so I have written a few test cases for a couple of functions derived from my reading, to see in action how close the behavior is to the expected one. I'll spare you the list of tests I wrote, showing just the first one (I use Google Test, if you wonder):
TEST(BoostSpiritAtoi, Simple)
{
   std::string s("42");
   EXPECT_EQ(std::atoi(s.c_str()), bsAtoi(s)); // 1
}
1. The expected result for a call to the Boost Spirit atoi should be the same as for a call to the standard atoi.

From what I have seen, the document's authors left out (as a simple chore for the reader) only the skipping of leading white space. So the code I have written for the atoi emulation is only a tad different from what you would find in the documentation:
int bsAtoi(const std::string& input)
{
   std::string::const_iterator beg = input.begin();
   while(beg != input.end() && std::isspace(static_cast<unsigned char>(*beg))) // 1
      ++beg;

   int value = 0; // 2
   boost::spirit::qi::parse(beg, input.end(), boost::spirit::int_, value); // 3
   return value;
}
1. Loop to skip all the possible leading white space. We don't care about trailing characters, since they are correctly managed by the Spirit parser (and I know what I am saying: I have written and checked a few specific test cases to verify it).
2. If we didn't initialize the returned value and the parser had nothing to do (an empty string in input, for instance), we would get back a dirty value.
3. Parse with Spirit.
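
For reference, these are the corner cases of the standard atoi() that the emulation is compared against: leading white space is skipped, conversion stops at the first character that cannot be part of the number, and an input with nothing to convert yields zero. The three function names are just labels of mine for the cases.

```cpp
#include <cstdlib>

// Leading white space is skipped before the digits are read.
int atoiLeading()  { return std::atoi("   42"); }

// Conversion stops at the first non-numeric character.
int atoiTrailing() { return std::atoi("42abc"); }

// Nothing to convert: the result is zero.
int atoiEmpty()    { return std::atoi(""); }
```

The first two return 42 and the third returns 0, which is why bsAtoi() skips leading blanks and initializes value to zero.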

The itoa() emulation is even simpler:
std::string bsItoa(int value)
{
   std::string output;
   boost::spirit::karma::generate(std::back_inserter(output), boost::spirit::int_, value); // 1
   return output; // 2
}
1. We pass to Karma the back insert iterator to a local string, so that it can put there the result of the generation from the input value we pass to generate().
2. Returning the local string by name lets the compiler elide the copy or, failing that, use the std::string move constructor; wrapping it in std::move() is not needed and would only get in the way of the elision.

A friendly reader wondered why I originally used in (1) the verbose but a bit more explicit formula
std::back_insert_iterator<std::string>(output)
instead of the sleeker helper function
std::back_inserter(output)
Actually, I don't remember. And I guess he is right: his suggestion makes the code more readable.
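
The two spellings are interchangeable, as this little standard-library sketch shows. The function names are hypothetical, and std::copy() is used here only as a simple consumer of the iterator, in the place where Karma would be.

```cpp
#include <algorithm>
#include <iterator>
#include <string>

// Explicit form: name the iterator type and construct it on the string.
std::string copyWithExplicitIterator(const std::string& src)
{
    std::string out;
    std::back_insert_iterator<std::string> sink(out);
    std::copy(src.begin(), src.end(), sink);
    return out;
}

// Helper form: std::back_inserter() deduces the container type for us.
std::string copyWithHelper(const std::string& src)
{
    std::string out;
    std::copy(src.begin(), src.end(), std::back_inserter(out));
    return out;
}
```

Both functions return a copy of their input, appended one character at a time through push_back().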

By the way, if it is not clear to you what an inserter is, you could search for other posts on this same blog talking about them. I guess you could start by reading this couple: "Using inserters" has a few examples involving back, front, and generic inserters; "std::copy_if" could give you some fun with an example that uses std::copy_if, a back inserter and even a lambda function in the same line (cool, isn't it?).
