Pages

Euclidean norm

The Euclidean norm is defined on a sequence of numbers as the square root of sum of square of each element or, equivalently, as the square root of the inner product of a vector and itself. Simple, isn't it? Actually it is. In C++ we could get it in a whiff, but walking through all the details would make it much more interesting.

We have to apply the square root on a value that could be get through a couple of recipes. Let's go for the first one.

Sum of squares

An intuitive function to calculate the sum of squares is:
double sumSquares(const std::vector<double>& v)
{
    double result = 0.0;
    for(unsigned int i = 0; i < v.size(); ++i)
        result += (v[i] * v[i]);
    return result;
}
With some extra-thinking we should ends up producing a more generic solution like:
template<typename C> // 1
typename C::value_type sumSquares(const C& c) // 2
{
    return std::accumulate(c.begin(), c.end(), static_cast<C::value_type>(0), // 3
        [](C::value_type sum, C::value_type value){return sum + value * value; }); // 4
}
1. This template function should be called for containers, so I used C as typename.
2. The return value type is defined as the passed container underlying value_type. We have to stress the fact that C::value_type is a type name, otherwise the compiler wouldn't understand what we meant and would signal its perplexity with a series of errors that on VC++2010 starts with this warning:
warning C4346: 'C::value_type' : dependent name is not a type
3. I use the STL accumulate() function, specifying beginning and end of the sequence, the initializing value for the sum, and (next line) the predicate that we want to use to generate the accumulation. The third parameter is not just a simple zero, is a zero for the right underlying type. Using a "simple" zero could lead to truncating issues, given that the type of variable used by accumulate() the store the sum is deducted by the type of this value.
4. As predicate I used a lambda function that gets as input parameter the generating sum and the value of the current collection element in evaluation. It calculates the square of the value, add it to the sum, and returns it.

Here is a pre-C++11 (no lambda) implementation of the same functionality:
template<typename T>
T square(T sum, T value)
{
    return sum  + value * value;
}

template<typename C>
typename C::value_type sumSquares(const C& c)
{
    return std::accumulate(c.begin(), c.end(), static_cast<C::value_type>(0), square<C::value_type>);
}
Having available a sumSquares() function, it is easy to calculate the Euclidean mean for a vector:
std::vector<double> v;
// ...
double euclideanNorm = sqrt(sumSquares(v));

Inner product

The inner product, also known as dot or scalar product, is an algebraic operation that given two vectors of the same size returns the sum of the product of each corresponding element in the vectors.

A trivial C/C++ implementation for inner product is:
double innerProduct(double* x, double* y, unsigned int size)
{
    double result = 0.0;
    for(unsigned int i = 0; i < size; ++i)
        result += x[i] * y[i];
    return result;
}
Major weakness of this implementation is that it lacks genericity: it is tailored just for arrays of double, any other container, and any other data type, would require another function.

A rapid browsing in the STL C++ library would give as a more interesting algorithm, std::inner_product(), that comes in a couple of flavors, the plain one, and one that uses whichever functors we like instead of the standard plus and multiply for the current datatype.

The simpler inner_product() version requires in input three iterators, two for delimiting the first sequence, one for the beginning of the second - remember that the two sequences should have the same size. A fourth parameter is required as initializer for the returned value, and usually is set to zero.

The testing code below calls my trivial implementation above, and then the standard inner_product(), so we can compare the two solutions:
double d1[] = { 1, 3 };
double d2[] = { 6, 2 };
std::cout << innerProduct(d1, d2, 2) << std::endl;
typedef std::vector<double> DVector;
DVector v1(d1, d1 + 2);
DVector v2(d2, d2 + 2);
std::cout << std::inner_product(d1, d1 + 2, d2, 0.0) << std::endl;
std::cout << std::inner_product(v1.begin(), v1.end(), v2.begin(), 0.0) << std::endl;
Notice that I passed the initializing value to sum as an explicit double. If I used a plain 0, the compiled would have inferred that I wanted to have the result as an int, truncating it as we asked.

The amazing result of all this chatting is that we can calculate a vector Euclidean norm writing a single line of code:
std::vector<double> v;
// ...
double euclideanNorm = sqrt(std::inner_product(v.begin(), v.end(), v.begin(), 0.0));

Still, we can do even better, at least from the point of view of the expressiveness and maintainability of our code, using an algebraic library.

uBLAS

The Boost Numeric uBLAS library provides a number of common algebraic function, and among them there is also the vector Euclidean norm. We only have to pay attention to the fact that we have to work with uBLAS vector and not STL ones:
// ...
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/io.hpp>

namespace ublas = boost::numeric::ublas;

// ...
ublas::vector<double> uv(2);
uv[0] = 6;
uv[1] = 2;

std::cout << "Euclidean Norm: " << ublas::norm_2(uv) << std::endl;

No comments:

Post a Comment