Listing files in a directory

We have already used the Boost Filesystem library to write a function that prints to standard output the size of the file whose name is passed to it.

We have seen why using Boost Filesystem is a good idea for such a task: the resulting code compiles and runs on both Windows and Unix, since the differences in file system conventions are managed internally by the boost::filesystem::path class and by the provided functions. Unexpected failures are reported as exceptions of type boost::filesystem::filesystem_error, so they can be handled gracefully with a try-catch construct.

Now we use the library for a slightly more complicated function. It expects a directory name as input and prints to standard output the files contained in that directory, if any.

I have tested the code shown here on Windows / Visual C++ 2010 and on Linux RedHat / gcc. There is no difference at all in the source files; only the project files have to be set up differently, according to the requirements of each platform.

Actually, since I had forgotten to add the library names to the makefile, at first I got a list of "undefined reference to" errors for a number of functions, such as boost::system::generic_category().

The solution was, naturally, adding the missing library files to the makefile:
MY_LIBRARIES := boost_system boost_filesystem
If you have any doubt on this point, a previous post shows an example of a makefile for Boost.

A generic file name dump function

It is not strictly relevant here, but it is a bit of extra fun. Since I plan to put the filenames in a couple of different C++ standard containers, I also wrote a generic function that dumps all the elements of a container of boost::filesystem::path, using the standard algorithm for_each and a lambda function:
#include <algorithm>
#include <iostream>
#include <boost/filesystem.hpp>

template <typename Container>
void dumpFileNames(const Container& c)
{
    std::for_each(c.begin(), c.end(), [](boost::filesystem::path p) // 1.
    {
        std::cout << (boost::filesystem::is_directory(p) ? 'D' : ' '); // 2.
        std::cout << " " << p.filename() << std::endl; // 3.
    });
}

1. We go through all the elements in the container. The current element, which should be of type boost::filesystem::path, is passed to the lambda function given as the third parameter of the for_each() call.
2. We could print more information for each file, but here we just print a 'D' if the current filename refers to a directory. The function is_directory() gives us the answer using the appropriate method for the current operating system.
3. We use the boost::filesystem::path filename() method to extract the last component of the path, that is, the name of the file without its parent directories.
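
When we want to check the listing in a test, or send it somewhere other than the console, the same loop can take the output stream as a parameter. What follows is only a minimal sketch under a few assumptions: the names dumpFileNamesTo() and listingOf() are hypothetical, a C++17 compiler is required, and the alias fs points to std::filesystem (the standard-library descendant of Boost Filesystem) just to keep the example free of external dependencies. When building against Boost, the alias would be boost::filesystem instead.

```cpp
#include <cassert>
#include <filesystem>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Assumption: std::filesystem stands in for boost::filesystem here.
namespace fs = std::filesystem;

// Hypothetical variant of dumpFileNames() that writes to any output stream.
template <typename Container>
void dumpFileNamesTo(std::ostream& os, const Container& c)
{
    for (const fs::path& p : c)
    {
        // 'D' marks a directory; a path that does not exist is simply not a directory.
        os << (fs::is_directory(p) ? 'D' : ' ');
        // string() is used so that the name is written without surrounding quotes.
        os << " " << p.filename().string() << '\n';
    }
}

// Hypothetical convenience wrapper: returns the listing as a string.
template <typename Container>
std::string listingOf(const Container& c)
{
    std::ostringstream out;
    dumpFileNamesTo(out, c);
    return out.str();
}
```

Calling dumpFileNamesTo(std::cout, files) gives much the same console listing as before, except that the names are not quoted, while listingOf(files) makes the output easy to compare in a unit test.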

What if I try to misuse dumpFileNames() like this?
std::vector<int> vi;
dumpFileNames(vi);

Well, I get a number of compile errors that should help me understand what my mistake is. Visual Studio immediately tells me that it "cannot convert parameter 1 from 'const int' to 'boost::filesystem2::path'". A message that looks reasonable.
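
If we want the compiler to state the requirement in our own words, a static_assert at the top of the function is one option. This is just a sketch, not part of the original code: dumpFileNamesChecked() is a hypothetical name, and the alias fs points to std::filesystem (C++17) only to avoid external dependencies; with Boost it would be boost::filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <iostream>
#include <type_traits>
#include <vector>

// Assumption: std::filesystem stands in for boost::filesystem here.
namespace fs = std::filesystem;

// Hypothetical checked variant of dumpFileNames().
template <typename Container>
void dumpFileNamesChecked(const Container& c)
{
    // Rejects containers whose elements cannot be converted to a path,
    // reporting the message below among the compile errors.
    static_assert(
        std::is_convertible<typename Container::value_type, fs::path>::value,
        "dumpFileNamesChecked() requires a container of filesystem paths");

    for (const fs::path& p : c)
    {
        std::cout << (fs::is_directory(p) ? 'D' : ' ');
        std::cout << " " << p.filename() << std::endl;
    }
}
```

With this check, passing a std::vector<int> still fails to compile, but the error list includes our explicit message. Note that a container of std::string is accepted, since a string converts implicitly to a path.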

Listing files in a directory /1

If a boost::filesystem::path refers to a directory, we can construct from it a boost::filesystem::directory_iterator, and use it to navigate among all the files that the directory contains. This consideration leads to our first version of a listing function:
#include <iterator>
#include <vector>

void listDirAsIs(boost::filesystem::path filename)
{
    typedef std::vector<boost::filesystem::path> Files; // 1.
    Files files;

    boost::filesystem::directory_iterator beg(filename); // 2.
    boost::filesystem::directory_iterator end; // 3.
    std::copy(beg, end, std::back_inserter(files)); // 4.

    dumpFileNames(files); // 5.
}

1. When there is no special requirement for the container, the standard vector is a good default choice.
2. Constructing a directory_iterator from the filename gives us an iterator pointing to the first entry in the directory, if any.
3. The default ctor for directory_iterator creates the end iterator, which marks the position past the last entry.
4. We push back all the elements in the range from beg to end into our vector, using the well-known copy-inserter idiom.
5. And finally we dump the vector.
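
The begin/end pair described in the notes above works with other standard algorithms too. As a small illustration, here is a sketch that counts the entries of a directory with std::distance. The function name countEntries() is hypothetical, and the alias fs points to std::filesystem (C++17) as a dependency-free stand-in for boost::filesystem.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <iterator>

// Assumption: std::filesystem stands in for boost::filesystem here.
namespace fs = std::filesystem;

// Hypothetical helper: counts the entries directly contained in dir.
// Throws fs::filesystem_error if dir cannot be opened as a directory.
std::size_t countEntries(const fs::path& dir)
{
    fs::directory_iterator beg(dir); // first entry, or already equal to end if dir is empty
    fs::directory_iterator end;      // default-constructed: the end iterator
    return static_cast<std::size_t>(std::distance(beg, end));
}
```

For an empty directory, beg compares equal to end right away and the count is zero.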

Listing files in a directory /2

On Windows this function is just what we need, but on UNIX the result is a bit unsatisfactory. The fact is that Windows usually returns the filenames in a directory in alphabetical order, while UNIX does not impose any ordering rule, and the library itself leaves the iteration order unspecified. If we want our code to behave the same way on different operating systems, we have to change it. One idea would be to explicitly check whether we are on Windows or UNIX, probably using macros, and call platform-specific code. Alternatively, we could call std::sort() on the resulting vector of filenames before dumping it. But I chose a third approach, using std::set as the container, which keeps its items sorted as they are inserted:
#include <iterator>
#include <set>

void listDirOrd(boost::filesystem::path filename)
{
    typedef std::set<boost::filesystem::path> Files;
    Files files;

    boost::filesystem::directory_iterator beg(filename);
    boost::filesystem::directory_iterator end;
    std::copy(beg, end, std::inserter(files, files.begin())); // 1.

    dumpFileNames(files);
}

1. Since std::set does not support a back (or front) inserter, we use std::inserter instead.
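
For comparison, the std::sort() alternative mentioned above could look roughly like the following sketch. It is not the approach chosen in this post: sortedEntries() is a hypothetical name, and the alias fs points to std::filesystem (C++17) as a dependency-free stand-in for boost::filesystem.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <vector>

// Assumption: std::filesystem stands in for boost::filesystem here.
namespace fs = std::filesystem;

// Hypothetical helper: returns the entries of dir sorted by path.
std::vector<fs::path> sortedEntries(const fs::path& dir)
{
    std::vector<fs::path> files;
    for (fs::directory_iterator it(dir), end; it != end; ++it)
        files.push_back(it->path());

    // Sorts with the path's own operator<, so the result no longer
    // depends on the order in which the operating system returns entries.
    std::sort(files.begin(), files.end());
    return files;
}
```

Both approaches give a platform-independent order. The vector version sorts once at the end, while the set keeps the items ordered after every insertion. In both cases the comparison is case-sensitive, so the result may still differ from what a file manager shows.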

Adding some checks

Basically our job is done. We add some checking to avoid unexpected behaviour in case of errors and we get this wrapper:
void listDir(boost::filesystem::path filename)
{
    try
    {
        if(!boost::filesystem::exists(filename)) // 1.
        {
            std::cout << filename << " does not exist" << std::endl;
            return;
        }

        if(!boost::filesystem::is_directory(filename)) // 2.
        {
            std::cout << filename << " is not a directory" << std::endl;
            return;
        }

        std::cout << filename << " is a directory containing:" << std::endl;
        listDirAsIs(filename); // 3.
        std::cout << "---" << std::endl;
        listDirOrd(filename);
    }
    catch(const boost::filesystem::filesystem_error& ex) // 4.
    {
        std::cout << ex.what() << std::endl;
    }
}

1. Non-existing objects are detected here.
2. A file that is not a directory does not require more than a warning.
3. Just to see the result, we leave also the first implementation of our listing functionality.
4. Any file system error is trapped here.
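
As a side note, the checks in 1. and 2. can also be written without exceptions, since exists() and is_directory() have overloads that report failures through an error code. The sketch below shows the idea. The function name describe() is hypothetical, and the alias fs points to std::filesystem (C++17) as a dependency-free stand-in; with Boost the alias would be boost::filesystem and the error code type would be boost::system::error_code.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Assumption: std::filesystem stands in for boost::filesystem here.
namespace fs = std::filesystem;

// Hypothetical helper: classifies a path without throwing.
std::string describe(const fs::path& p)
{
    std::error_code ec;

    // A missing file is not an error: exists() returns false and ec stays clear.
    if (!fs::exists(p, ec))
        return ec ? "error: " + ec.message() : std::string("does not exist");

    if (!fs::is_directory(p, ec))
        return ec ? "error: " + ec.message() : std::string("not a directory");

    return "directory";
}
```

This style keeps the error handling next to each check, at the cost of testing the error code after every call. The try-catch wrapper shown above remains the simpler choice when any failure should just be reported.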

The code in this post is based on an example that you can find in the official Boost Filesystem tutorial. For more background information, have a look at the Boost Filesystem version 3 home page.
