Atoi with Spirit

The first example I found in the excellent Boost Spirit documentation uses Qi (the Spirit parser library) to implement something like the standard C atoi() function, and Karma (the Spirit generator library) to go the other way round, as the non-standard itoa() function does.

I tend to work the TDD way, so I wrote a few test cases for a couple of functions derived from my reading, to see how close their behavior is to what I expected. I'll spare you the full list of tests and show just the first one (I use Google Test, in case you wonder):
TEST(BoostSpiritAtoi, Simple)
{
   std::string s("42");
   EXPECT_EQ(std::atoi(s.c_str()), bsAtoi(s)); // 1
}
1. A call to the Boost Spirit atoi should return the same result as a call to the standard atoi.

As far as I can see, the documentation authors left out (as a simple exercise for the reader) only the skipping of leading whitespace. So the code I have written for the atoi emulation is only a tad different from what you would find in the documentation:
int bsAtoi(const std::string& input)
{
   std::string::const_iterator beg = input.begin();
   while(beg != input.end() && std::isspace(static_cast<unsigned char>(*beg))) // 1
      ++beg;

   int value = 0; // 2
   boost::spirit::qi::parse(beg, input.end(), boost::spirit::int_, value); // 3
   return value;
}
1. Loop to skip any leading whitespace. The cast to unsigned char matters because passing a negative char value to std::isspace() is undefined behavior. Trailing characters need no special care: the Spirit parser stops at the first character that cannot belong to an integer, and I have written and checked a few specific test cases to verify it.
2. If we didn't initialize the returned value and the parser had nothing to do (an empty input string, for instance), we would get back garbage.
3. Parse with Spirit.

The itoa() emulation is even simpler:
std::string bsIota(int value)
{
   std::string output;
   boost::spirit::karma::generate(std::back_inserter(output), boost::spirit::int_, value); // 1
   return output; // 2
}
1. We pass Karma a back insert iterator on a local string, so that it can store there the text generated from the value we pass to generate().
2. We return the local string by name. The compiler can then elide the copy or, failing that, pick the std::string move constructor on its own; wrapping the value in std::move() would only get in the way of the elision.

A friendly reader wondered why I originally used in (1) the more verbose, if slightly more explicit, form
std::back_insert_iterator<std::string>(output)
instead of the sleeker helper function
std::back_inserter(output)
Actually, I don't remember. I guess he is right: his suggestion makes the code more readable.
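
To convince myself that the two spellings really are interchangeable, here is a small standard-library-only sketch. The function names appendExplicit and appendWithHelper are made up for the occasion.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>

// std::back_inserter() just builds a std::back_insert_iterator for us,
// deducing the container type from its argument.
static_assert(
    std::is_same<decltype(std::back_inserter(std::declval<std::string&>())),
                 std::back_insert_iterator<std::string>>::value,
    "back_inserter returns a back_insert_iterator");

// Hypothetical helper: copies src using the explicit iterator type.
std::string appendExplicit(const std::string& src)
{
    std::string out;
    std::copy(src.begin(), src.end(), std::back_insert_iterator<std::string>(out));
    return out;
}

// Hypothetical helper: same copy, using the helper function.
std::string appendWithHelper(const std::string& src)
{
    std::string out;
    std::copy(src.begin(), src.end(), std::back_inserter(out));
    return out;
}
```

Both functions produce the same string, so the choice between the two forms is only a matter of readability.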

By the way, if it is not clear to you what an inserter is, you can search this blog for other posts about them. A good starting point is this pair: "Using inserters" has a few examples involving back, front, and generic inserters, while "std::copy_if" has some fun with an example that uses std::copy_if, a back inserter, and even a lambda function in the same line (cool, isn't it?).
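
If you don't feel like chasing those posts, this is roughly the kind of one-liner I mean. It is a sketch written for this page, not the code from the other post, and keepEven is a hypothetical name.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper: keeps only the even values of input.
std::vector<int> keepEven(const std::vector<int>& input)
{
    std::vector<int> output;
    // copy_if, a back inserter and a lambda, all in the same statement:
    // each accepted element is appended to output via push_back().
    std::copy_if(input.begin(), input.end(), std::back_inserter(output),
                 [](int x) { return x % 2 == 0; });
    return output;
}
```

The back inserter is what lets output start empty and grow as needed, instead of requiring it to be sized in advance.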

2 comments:

  1. std::back_insert_iterator < std::string > (output)

    could be written as

    std::back_inserter(output)

  2. Yeah, you are right. Using back_inserter we save some typing, and the resulting code is more readable. I modified the post using your suggestion, thanks!
