Boost Application Development Cookbook
Online Examples
This site contains all the source code and introductions from the book.
You are free to experiment with the code: compile, modify, run, and reuse the examples.
- Show all recipes
- Hide all recipes
Chapters
Click on a chapter to view its recipes. Click on a recipe to see its sources.
- Chapter 01: Starting to Write Your Application
- Getting configuration options (part 1, part 2)
- Storing any value in a container/variable
- Storing multiple chosen types in a variable/container
- Using a safer way to work with a container that stores multiple chosen types (part 1, part 2)
- Returning a value or flag where there is no value
- Returning an array from a function
- Combining multiple values into one
- Reordering the parameters of a function
- Binding a value as a function parameter
- Using the C++11 move emulation
- Making a noncopyable class
- Making a noncopyable but movable class
- Chapter 02: Converting Data
- Chapter 03: Managing Resources
- Managing pointers to classes that do not leave scope
- Reference counting of pointers to classes used across methods
- Managing pointers to arrays that do not leave scope
- Reference counting pointers to arrays used across methods
- Storing any functional objects in a variable (part 1, part 2)
- Passing a function pointer in a variable
- Passing C++11 lambda functions in a variable
- Containers of pointers
- Doing something at scope exit
- Initializing the base class by a member of the derived class
- Chapter 04: Compile-time Tricks
- Chapter 05: Multithreading
- Chapter 06: Manipulating Tasks
- Registering a task for processing an arbitrary datatype (part 1, part 2)
- Making timers and processing timer events as tasks (part 1, part 2)
- Network communication as a task (part 1, part 2)
- Accepting incoming connections (part 1, part 2)
- Executing different tasks in parallel (part 1, part 2)
- Conveyor tasks processing
- Making a nonblocking barrier
- Storing an exception and making a task from it
- Getting and processing system signals as tasks (part 1, part 2)
- Chapter 07: Manipulating Strings
- Changing cases and case-insensitive comparison
- Matching strings using regular expressions
- Searching and replacing strings using regular expressions
- Formatting strings using safe printf-like functions
- Replacing and erasing strings
- Representing a string with two iterators
- Using a reference to string type
- Chapter 08: Metaprogramming
- Chapter 09: Containers
- Chapter 10: Gathering Platform and Compiler Information
- Detecting int128 support
- Detecting RTTI support
- Speeding up compilation using C++11 extern templates
- Writing metafunctions using simpler methods
- Reducing code size and increasing performance of user-defined types (UDTs) in C++11
- The portable way to export and import functions and classes (part 1, part 2, part 3)
- Detecting the Boost version and getting latest features
- Chapter 11: Working with the System
- Chapter 12: Scratching the Tip of the Iceberg
Recipe's Intro
Recipe: Getting configuration options
Take a look at some console programs, such as cp in Linux. They all have fancy help output,
their input parameters do not depend on position, and they have a human-readable syntax,
for example:
$ cp --help
Usage: cp [OPTION]... [-T] SOURCE DEST
-a, --archive same as -dR --preserve=all
-b like --backup but does not accept an argument
You can implement the same functionality for your program in 10 minutes. And all you need is the Boost.ProgramOptions library.
If you have been programming in Java, C#, or Delphi, you will definitely miss the ability to
create containers with the Object value type in C++. The Object class in those languages is
a base class for almost all types, so you are able to assign (almost) any value to it at any time.
Just imagine how great it would be to have such a feature in C++!
Are you aware of the concept of unrestricted unions in C++11? Let me tell you about it
briefly. C++03 unions can only hold extremely simple types of data called PODs (plain old data),
so in C++03 you cannot, for example, store std::string or std::vector in a union.
C++11 relaxes this requirement, but you have to manage the construction and destruction
of such types yourself, call in-place construction/destruction, and remember what type is
stored in the union. A huge amount of work, isn't it?
Imagine that you are creating a wrapper around some SQL database interface. You decided
that boost::any will perfectly match the requirements for a single cell of a database
table. Some other programmer will be using your classes, and their task would be to get a
row from the database and count the sum of the arithmetic types in that row.
Imagine that we have a function that does not throw exceptions and either returns a value or
indicates that an error has occurred. In Java or C#, such cases are
handled by comparing the return value of the function with a null pointer; if it is null,
an error has occurred. In C++, returning a pointer from a function confuses library users and
usually requires dynamic memory allocation (which is slow).
Let's play a guessing game! What can you tell about the following function?
char* vector_advance(char* val);
Should return values be deallocated by the programmer or not? Does the function attempt to deallocate the input parameter? Should the input parameter be zero-terminated, or should the function assume that the input parameter has a specified width? And now, let's make the task harder! Take a look at the following line:
char ( &vector_advance( char (&val)[4] ) )[4];
Please do not worry; I also scratched my head for half an hour before getting an idea
of what is happening here. vector_advance is a function that accepts and returns an array
of four elements. Is there a way to write such a function clearly?
There is a very nice present for those who like std::pair. Boost has a library called Boost.Tuple,
and it is just like std::pair, but it can also work with triples, quads, and even bigger
collections of types.
This recipe and the next one are devoted to a very interesting library, whose functionality at first glance looks like some kind of magic. This library is called Boost.Bind and it allows you to easily create new functional objects from functions, member functions, and functional objects, also allowing the reordering of the initial function's input parameters and binding some values or references as function parameters.
If you work with the STL a lot and use the <algorithm> header, you will definitely
write a lot of functional objects. You can construct them using a set of STL adapter functions
such as bind1st, bind2nd, ptr_fun, mem_fun, and mem_fun_ref, or you can write them
by hand (because adapter functions look scary). Here is some good news: Boost.Bind can
be used instead of all of those functions, and it provides a more human-readable syntax.
One of the greatest features of the C++11 standard is rvalue references. This feature allows us to modify temporary objects, "stealing" resources from them. As you can guess, the C++03 standard has no rvalue references, but using the Boost.Move library you can write portable code that uses them; what is more, the library emulates move semantics for you in C++03.
You have almost certainly encountered situations where providing a copy constructor and move assignment operator for a class would require too much work, or where a class owns some resources that must not be copied for technical reasons:
class descriptor_owner {
void* descriptor_;
public:
explicit descriptor_owner(const char* params);
~descriptor_owner() {
system_api_free_descriptor(descriptor_);
}
};
In the case of the previous example, the C++ compiler will generate a copy constructor and an assignment operator, so a potential user of the descriptor_owner class will be able to do the following awful things:
descriptor_owner d1("O_o");
descriptor_owner d2("^_^");
// Descriptor of d2 was not correctly freed
d2 = d1;
// destructor of d2 will free the descriptor
// destructor of d1 will try to free already freed descriptor
Now imagine the following situation: we have a resource that cannot be copied, which should be correctly freed in a destructor, and we want to return it from a function:
descriptor_owner construct_descriptor() {
return descriptor_owner("Construct using this string");
}
Actually, you can work around such situations using the swap method:
void construct_descriptor1(descriptor_owner& ret) {
descriptor_owner("Construct using this string").swap(ret);
}
But such a workaround won't allow us to use descriptor_owner in STL or Boost containers. And by the way, it looks awful!
Converting strings to numbers in C++ makes a lot of people depressed because of its
inefficiency and user unfriendliness. Let's see how the string "100" can be converted to an int:
#include <sstream>
std::istringstream iss("100");
int i;
iss >> i;
// And now, the 'iss' variable will get in the way
// till the end of the scope.
// It is better not to think how many unnecessary operations,
// virtual function calls, and memory allocations occurred
// during those operations
The C methods are not much better:
#include <cstdlib>
char * end;
int i = std::strtol("100", &end, 10);
// Did it convert the whole value to int, or did it stop somewhere
// in the middle?
// And now the 'end' variable will get in the way.
// By the way, we wanted an int, but strtol returns a long
// int... Did the converted value fit into an int?
In this recipe we will continue discussing lexical conversions, but now we will be converting
numbers to strings using Boost.LexicalCast. And as usual, boost::lexical_cast
will provide a very simple way to convert the data.
You might remember situations where you wrote something like the following code:
void some_function(unsigned short param);
int foo();
// Somewhere in code
// Some compilers may warn that int is being converted to
// unsigned short and that there is a possibility of losing
// data
some_function(foo());
Usually, programmers just suppress such warnings by explicitly casting to the unsigned short datatype, as demonstrated in the following code snippet:
// Warning suppressed. Looks like a correct code
some_function(
static_cast<unsigned short>(foo())
);
But this may make it extremely hard to detect errors. Such errors may exist in code for years before they get caught:
// Returns -1 if error occurred
int foo() {
if (some_extremely_rare_condition()) {
return -1;
} else if (another_extremely_rare_condition()) {
return 1000000;
}
return 65535;
}
There is a feature in Boost.LexicalCast that allows users to use their own types in
lexical_cast. This feature just requires the user to write the correct std::ostream
and std::istream operators for their types.
Imagine that some programmer designed an awful interface as follows (this is a good example of how interfaces should not be written):
struct object {
virtual ~object() {}
};
struct banana: public object {
void eat() const {}
virtual ~banana(){}
};
struct pidgin: public object {
void fly() const {}
virtual ~pidgin(){}
};
object* try_produce_banana();
Our task is to make a function that eats bananas and throws exceptions if something
other than a banana came along (eating pidgins is gross!). If we dereference a value returned
by the try_produce_banana() function, we are in danger of dereferencing
a null pointer.
It is a common task to parse a small text, and such situations are always a dilemma: shall we use some third-party professional tools for parsing, such as Bison or ANTLR, or shall we try to write the parser by hand using only C++ and the STL? The third-party tools are good for handling the parsing of complex texts, and it is easy to write parsers using them, but they require additional tools for creating C++ or C code from their grammar and add more dependencies to your project. Handwritten parsers are usually hard to maintain, but they require nothing except a C++ compiler.
Let's start with a very simple task: parsing a date in ISO format as follows:
YYYY-MM-DD
The following are examples of possible input:
2013-03-01
2012-12-31 // (woo-hoo, it's almost a new year!)
In the previous recipe we were writing a simple parser for dates. Imagine that some time has passed and the task has changed. Now we need to write a date-time parser that will support multiple input formats plus zone offsets. So now our parser should understand the following inputs:
2012-10-20T10:00:00Z // date time with zero zone offset
2012-10-20T10:00:00 // date time with unspecified zone offset
2012-10-20T10:00:00+09:15 // date time with zone offset
2012-10-20-09:15 // date time with zone offset
10:00:09+09:15 // time with zone offset
There are situations where we are required to dynamically allocate memory and construct a class in that memory. And, that's where the troubles start. Have a look at the following code:
bool foo1() {
foo_class* p = new foo_class("Some initialization data");
bool something_else_happened = some_function1(p);
if (something_else_happened) {
delete p;
return false;
}
some_function2(p);
delete p;
return true;
}
This code looks correct at first glance. But what if some_function1() or some_function2()
throws an exception? In that case, p won't be deleted. Let's fix it in the
following way:
bool foo2() {
foo_class* p = new foo_class("Some initialization data");
try {
bool something_else_happened = some_function1(p);
if (something_else_happened) {
delete p;
return false;
}
some_function2(p);
} catch (...) {
delete p;
throw;
}
delete p;
return true;
}
Now the code is ugly and hard to read, but it is correct. Maybe we can do better than this.
Imagine that you have some dynamically allocated structure containing data, and you want to process it in different execution threads. The code to do this is as follows:
#include <boost/thread.hpp>
#include <boost/bind.hpp>
void process1(const foo_class* p);
void process2(const foo_class* p);
void process3(const foo_class* p);
void foo1() {
while (foo_class* p = get_data()) // C way
{
// There will be too many threads soon, see
// recipe 'Executing different tasks in parallel'
// for a good way to avoid uncontrolled growth of threads
boost::thread(boost::bind(&process1, p))
.detach();
boost::thread(boost::bind(&process2, p))
.detach();
boost::thread(boost::bind(&process3, p))
.detach();
// delete p; Oops!!!!
}
}
We cannot deallocate p at the end of the while loop because it can still be used by the threads
that run the process functions. The process functions cannot delete p because they do not know
that the other threads are no longer using it.
We already saw how to manage pointers to a resource in the Managing pointers to classes
that do not leave scope recipe. But when we deal with arrays, we need to call delete[]
instead of a simple delete; otherwise there will be a memory leak. Have a look at the
following code:
void may_throw1(const char* buffer);
void may_throw2(const char* buffer);
void foo() {
// we cannot allocate 10MB of memory on stack,
// so we allocate it on heap
char* buffer = new char[1024 * 1024 * 10];
// Here comes some code that may throw
may_throw1(buffer);
may_throw2(buffer);
delete[] buffer;
}
We continue coping with pointers, and our next task is to reference count an array. Let's take a look at a program that gets some data from the stream and processes it in different threads. The code to do this is as follows:
#include <cstring>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
void do_process(const char* data, std::size_t size);
void do_process_in_background(const char* data, std::size_t size) {
// We need to copy the data, because we do not know
// when it will be deallocated by the caller
char* data_cpy = new char[size];
std::memcpy(data_cpy, data, size);
// Starting thread of execution to process data
boost::thread(boost::bind(&do_process, data_cpy, size))
.detach();
// We cannot delete[] data_cpy, because
// do_process may still be working with it
}
Just the same problem that occurred in the Reference counting of pointers to classes used across methods recipe.
C++ has a syntax to work with pointers to functions and pointers to member functions. And that is good! However, this mechanism is hard to use with functional objects. Consider the situation where you are developing a library that has its API declared in header files and its implementation in source files. This library shall have a function that accepts any functional object. How would you pass a functional object to it? Have a look at the following code:
// Required for std::unary_function<> template
#include <functional>
// making a typedef for function pointer accepting int
// and returning nothing
typedef void (*func_t)(int);
// Function that accepts pointer to function and
// calls accepted function for each integer that it has
// It cannot work with functional objects :(
void process_integers(func_t f);
// Functional object
class int_processor: public std::unary_function<int, void> {
const int min_;
const int max_;
bool& triggered_;
public:
int_processor(int min, int max, bool& triggered)
: min_(min)
, max_(max)
, triggered_(triggered)
{}
void operator()(int i) const {
if (i < min_ || i > max_) {
triggered_ = true;
}
}
};
We are continuing with the previous example, and now we want to pass a pointer to a function
to our process_integers() method. Shall we add an overload just for function pointers,
or is there a more elegant way?
We are continuing with the previous example, and now we want to use a lambda function with
our process_integers()
method.
There are cases when we need to store pointers in a container; for example: storing polymorphic data in containers, forcing fast copying of data in containers, and strict exception-safety requirements for operations with data in containers. In such cases, the C++ programmer has the following choices:
* Store pointers in containers and take care of their destruction using operator delete. Such an approach is error prone and requires a lot of writing.
* Store smart pointers in containers. For C++03 you'll have to use std::auto_ptr. However, the std::auto_ptr class is deprecated, and it is not recommended to use it in containers. For C++11 you'll have to use std::unique_ptr. This solution is a good one, but it cannot be used in C++03, and you still need to write a comparator functional object.
* Use Boost.SmartPtr in the container. This solution is portable, but you still need to write comparators, and it adds performance penalties (an atomic counter requires additional memory, and its increments/decrements are not as fast as nonatomic operations).
If you have dealt with languages such as Java, C#, or Delphi, you have obviously used the
try{} finally{} construction, or scope(exit) in the D programming language. Let me
briefly describe what these language constructions do.
When a program leaves the current scope via return or exception, code in the finally or scope(exit) blocks is executed. This mechanism is perfect for implementing the RAII pattern as shown in the following code snippet:
// Some pseudo code (suspiciously similar to Java code)
try {
FileWriter f = new FileWriter("example_file.txt");
// Some code that may throw or return
// ...
} finally {
// Whatever happened in scope, this code will be executed
// and file will be correctly closed
if (f != null) {
f.close()
}
}
Is there a way to do such a thing in C++?
Let's take a look at the following example. We have some base class that has virtual functions
and must be initialized with a reference to a std::ostream object:
#include <boost/noncopyable.hpp>
#include <sstream>
class tasks_processor: boost::noncopyable {
std::ostream& log_;
public:
explicit tasks_processor(std::ostream& log)
: log_(log)
{}
};
We also have a derived class that has a std::ostream object:
class fake_tasks_processor: public tasks_processor {
std::ostringstream logger_;
public:
fake_tasks_processor()
: tasks_processor(logger_) // Oops! logger_ does not exist here
, logger_()
{}
};
This is not a very common case in programming, but when such a mistake happens, it is not
always easy to figure out how to work around it. Some people try to bypass it by changing the
order of logger_ and the base type in the initialization list:
fake_tasks_processor()
: logger_() // Oops! logger_ still will be constructed AFTER tasks_processor
, tasks_processor(logger_)
{}
It won't work as they expect because direct base classes are initialized before nonstatic data members, regardless of the order of the member initializers.
Let's imagine that we are writing some serialization function that stores values in a buffer of a specified size:
#include <cstring>
#include <boost/array.hpp>
template <class T, std::size_t BufSizeV>
void serialize(const T& value, boost::array<unsigned char, BufSizeV>& buffer) {
// TODO: fixme
std::memcpy(&buffer[0], &value, sizeof(value));
}
This code has the following troubles:
* The size of the buffer is not checked, so it may overflow
* This function can be used with non-plain old data (POD) types, which would lead to incorrect behavior
We may partially fix it by adding some asserts, for example:
template <class T, std::size_t BufSizeV>
void serialize(const T& value, boost::array<unsigned char, BufSizeV>& buffer) {
assert(BufSizeV >= sizeof(value));
// TODO: fixme
std::memcpy(&buffer[0], &value, sizeof(value));
}
But this is a bad solution. The BufSizeV and sizeof(value) values are known at compile
time, so we can potentially make this code fail compilation if the buffer is too small, instead
of having a runtime assert (which may not trigger during debugging if the function was not
called, and may even be optimized out in release mode, so very bad things may happen).
It's a common situation when we have a templated class that implements some functionality. Have a look at the following code snippet:
// Generic implementation
template <class T>
class data_processor {
double process(const T& v1, const T& v2, const T& v3);
};
We also have two additional optimized versions of that class: one for integral types, and another for real types:
// Integral types optimized version
template <class T>
class data_processor {
typedef int fast_int_t;
double process(fast_int_t v1, fast_int_t v2, fast_int_t v3);
};
// SSE optimized version for float types
template <class T>
class data_processor {
double process(double v1, double v2, double v3);
};
Now the question arises: how do we make the compiler automatically choose the correct class for a specified type?
We continue working with the Boost metaprogramming libraries. In the previous recipe, we saw
how to use enable_if_c with classes; now it is time to take a look at its usage in template
functions. Consider the following example.
Initially, we had a template function that works with all the available types:
template <class T>
T process_data(const T& v1, const T& v2, const T& v3);
Some time after writing code that uses the process_data function, we created an optimized process_data version for types that do have an operator+=:
template <class T>
T process_data_plus_assign(const T& v1, const T& v2, const T& v3);
But, we do not want to change the already written code; instead whenever it is possible, we want to force the compiler to automatically use optimized function in place of the default one.
We have already seen examples of how to choose between functions without using
boost::enable_if_c. Let's consider the following example, where we have a generic
method for processing POD datatypes:
#include <boost/static_assert.hpp>
#include <boost/type_traits/is_pod.hpp>
// Generic implementation
template <class T>
T process(const T& val) {
BOOST_STATIC_ASSERT((boost::is_pod<T>::value));
// ...
}
And we have the same function optimized for sizes of 1, 4, and 8 bytes. How do we rewrite the process function so that it can dispatch calls to the optimized versions?
We need to implement a type trait that returns true if a std::vector type is passed to it
as a template parameter.
Imagine that we are working with classes from different vendors that implement different sets of arithmetic operations and have constructors from integers. We want to make a function that increments a value by one, whatever class is passed to it. Also, we want this function to be efficient! Take a look at the following code:
template <class T>
void inc(T& value) {
// call ++value
// or call value ++
// or value += T(1);
// or value = value + T(1);
}
In the previous recipes, we saw some examples of boost::bind usage. It is a good and
useful tool with one small drawback: it is hard to store the functor produced by boost::bind
as a variable in C++03.
#include <functional>
#include <boost/bind.hpp>
const ??? var = boost::bind(std::plus<int>(), _1, _1);
In C++11, we can use the auto keyword instead of ???, and that will work. Is there a way to
do it in C++03?
On modern multi-core processors, to achieve maximal performance (or just to provide a good user experience), programs usually must use multiple execution threads. Here is a motivating example in which we need to create and fill a big file from a thread that draws the user interface:
#include <algorithm>
#include <fstream>
#include <iterator>
void set_not_first_run();
bool is_first_run();
// Function, that executes for a long time
void fill_file_with_data(char fill_char, std::size_t size, const char* filename) {
std::ofstream ofs(filename);
std::fill_n(std::ostreambuf_iterator<char>(ofs), size, fill_char);
set_not_first_run();
}
// ...
// Somewhere in thread that draws a user interface
if (is_first_run()) {
// This will be executing for a long time during which
// user's interface will freeze.
fill_file_with_data(0, 8 * 1024 * 1024, "save_file.txt");
}
Now that we know how to start execution threads, we want to have access to some common resources from different threads:
#include <cassert>
#include <cstddef>
#include <iostream>
// In previous recipe we included
// <boost/thread.hpp>, which includes all
// the classes of Boost.Thread
#include <boost/thread/thread.hpp>
int shared_i = 0;
void do_inc() {
for (std::size_t i = 0; i < 30000; ++i) {
// do some work
// ...
const int i_snapshot = ++ shared_i;
// do some work with i_snapshot
// ...
}
}
void do_dec() {
for (std::size_t i = 0; i < 30000; ++i) {
// do some work
// ...
const int i_snapshot = -- shared_i;
// do some work with i_snapshot
// ...
}
}
void run() {
boost::thread t1(&do_inc);
boost::thread t2(&do_dec);
t1.join();
t2.join();
// assert(shared_i == 0); // Oops!
std::cout << "shared_i == " << shared_i;
}
This 'Oops!' is not written there accidentally. For some people it will be a surprise, but there
is a big chance that shared_i won't be equal to 0:
shared_i == 19567
And it will get even worse in cases where a common resource holds some non-trivial classes; segmentation faults and memory leaks may (and will) occur. We need to change the code so that only one thread modifies the shared_i variable at any single moment of time and so that all of the processor and compiler optimizations that break multithreaded code are bypassed.
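Boost.Thread's mutex does exactly that, and its interface was adopted into C++11 nearly unchanged; here is the fix sketched with the std equivalents (swap std:: for boost:: in C++03). The lock_guard makes each increment/decrement a critical section, so the final value is reliably zero:

```cpp
#include <cstddef>
#include <mutex>
#include <thread>

int shared_i2 = 0;
std::mutex i_mutex;

void do_inc2() {
    for (std::size_t i = 0; i < 30000; ++i) {
        std::lock_guard<std::mutex> lock(i_mutex); // critical section
        ++shared_i2;
    } // unlocked automatically when 'lock' is destroyed
}

void do_dec2() {
    for (std::size_t i = 0; i < 30000; ++i) {
        std::lock_guard<std::mutex> lock(i_mutex);
        --shared_i2;
    }
}

inline int run2() {
    std::thread t1(&do_inc2);
    std::thread t2(&do_dec2);
    t1.join();
    t2.join();
    return shared_i2; // now reliably 0
}
```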
In the previous recipe, we saw how to safely access a common resource from different threads. But in that recipe, we were doing two system calls (locking and unlocking the mutex) just to increment an integer:
{ // Critical section begin
boost::lock_guard<boost::mutex> lock(i_mutex);
i_snapshot = ++ shared_i;
} // Critical section end
This looks lame! And slow! Can we make the code from the previous recipe better?
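We can, with atomics. Boost.Atomic's boost::atomic is mirrored by C++11 std::atomic (shown here; swap std:: for boost:: in C++03): the increment itself becomes one indivisible read-modify-write, with no mutex and no system call on the fast path. A sketch:

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

std::atomic<int> shared_atomic(0);

void inc_atomic() {
    for (std::size_t i = 0; i < 30000; ++i) {
        ++shared_atomic; // one indivisible operation, no lock
    }
}

inline int run_atomic() {
    std::thread t1(&inc_atomic);
    std::thread t2(&inc_atomic);
    t1.join();
    t2.join();
    return shared_atomic.load();
}
```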
For shortness, let's call a functional object that takes no arguments a task.
typedef boost::function<void()> task_t;
And now, imagine a situation where we have threads that post tasks and threads that execute posted tasks. We need to design a class that can be safely used by both types of thread. This class must have methods for getting a task (or blocking and waiting for a task until it is posted by another thread), checking and getting a task if we have one (returning an empty task if no tasks remain), and a method to post tasks.
Imagine that we are developing some online services. We have a map of registered users with some properties for each user. This set is accessed by many threads, but it is very rarely modified. All operations with this set are done in a thread-safe manner by acquiring a unique lock on the mutex.
But any operation, even getting/reading resources will result in waiting on a locked mutex; therefore, this class will become a bottleneck very soon.
Can we fix it?
Let's take a glance at the recipe Creating a work_queue class. Each task there can be executed in one of many threads and we do not know which one. Imagine that we want to send the results of an executed task using some connection.
#include <boost/noncopyable.hpp>
class connection: boost::noncopyable {
public:
// Opening a connection is a slow operation
void open();
void send_result(int result);
// Other methods
// ...
};
We have the following solutions:
* Open a new connection when we need to send the data (which is slow)
* Have a single connection for all the threads and wrap them in mutex (which is also slow)
* Have a pool of connections, get a connection from it in a thread-safe manner and use it (a lot of coding is required, but this solution is fast)
* Have a single connection per thread (fast and simple to implement)
So, how can we implement the last solution?
Sometimes, we need to kill a thread that ate too many resources or that is just executing for too long. For example, some parser works in a thread (and actively uses Boost.Thread), but we already have the required amount of data from it, so parsing can be stopped. All we have is:
boost::thread parser_thread(&do_parse);
// Some code goes here
// ...
if (stop_parsing) {
// no more parsing required
// TODO: stop parser
}
How can we do it?
Those readers who were trying to repeat all the examples by themselves or those who were experimenting with threads must already be bored with writing the following code to launch threads:
boost::thread t1(&some_function);
boost::thread t2(&some_function);
boost::thread t3(&some_function);
// ...
t1.join();
t2.join();
t3.join();
Maybe there is a better way to do this?
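There is: Boost.Thread provides boost::thread_group, which collects threads and joins them all with one join_all() call. A sketch using the std equivalents (a std::vector<std::thread> plays the same role; the run_batch driver and counter are illustrative):

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

std::atomic<int> runs_count(0);

void some_function() { ++runs_count; }

inline int run_batch(unsigned n) {
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < n; ++i) {
        // thread_group::create_thread(&some_function) in Boost
        threads.push_back(std::thread(&some_function));
    }
    for (std::size_t i = 0; i < threads.size(); ++i) {
        threads[i].join(); // thread_group::join_all() in Boost
    }
    return runs_count.load();
}
```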
First of all, let's take care of the class that will hold all the tasks and provide methods for their execution. We were already doing something like this in the Creating a work_queue class recipe, but some of the following problems were not addressed:
* A task may throw an exception, which leads to a call to std::terminate
* An interrupted thread may not notice interruption but will finish its task and interrupt only during the next task (which is not what we wanted; we wanted to interrupt the previous task)
* Our work_queue class was only storing and returning tasks, but we need to add methods for executing existing tasks
* We need a way to stop processing the tasks
It is a common task to check something at specified intervals; for example, we need to check some session for activity once every 5 seconds. There are two popular solutions to such a problem. The first is creating a thread that checks and then sleeps for 5 seconds; this is a very lame solution that consumes a lot of system resources and scales badly. The second is using system-specific APIs for manipulating timers asynchronously; this is a better solution, but it requires a lot of work and is not very portable (until you write many wrappers for different platforms). It also makes you work with OS APIs that are not always very nice.
Receiving or sending data over a network is a slow operation. While packets are received by the machine, and while the OS verifies them and copies the data to the user-specified buffer, multiple seconds may pass. And we could be doing a lot of work instead of waiting. Let's modify our tasks_processor class so that it is capable of sending and receiving data in an asynchronous manner. In nontechnical terms, we ask it to "receive at least N bytes from the remote host and after that is done, call our functor; and by the way, do not block on this call". Those readers who know about libev, libevent, or Node.js will find a lot of familiar things in this recipe.
A server side working with a network usually looks like a sequence where we first get data, then process it, and then send the result. Imagine that we are creating some kind of authorization server that will process a huge number of requests per second. In that case, we will need to receive and send data asynchronously and process tasks in multiple threads.
In this recipe, we'll see how to extend our tasks_processor class to accept and process incoming connections, and in the next recipe, we'll see how to make it multithreaded.
Now it is time to make our tasks_queue process tasks in multiple threads. How hard could this be?
Sometimes there is a requirement to process tasks within a specified time interval. Compared to previous recipes, where we were trying to process tasks in the order of their appearance in the queue, this is a big difference.
Consider an example where we are writing a program that connects two subsystems, one of which produces data packets and the other writes modified data to the disk (something like this can be seen in video cameras, sound recorders, and other devices). We need to process data packets one by one, smoothly with the least jitter, and in multiple threads.
Our previous tasks_queue was bad at processing tasks in a specified order, so how can we solve this?
In multithreaded programming, there is an abstraction called a barrier. It stops execution of the threads that reach it until the requested number of threads become blocked on it. After that, all the threads are released and they continue with their execution.
For example, we want to process different parts of the data in different threads and then send the data:
#include <boost/thread/barrier.hpp>

void runner(std::size_t thread_index, boost::barrier& data_barrier, data_t& data) {
    for (std::size_t i = 0; i < 1000; ++i) {
        fill_data(data.at(thread_index));
        data_barrier.wait();

        if (!thread_index) {
            compute_send_data(data);
        }
        data_barrier.wait();
    }
}
The data_barrier.wait() method blocks until all the threads fill the data. After that, all the threads are released; the thread with the index 0 computes the data to be sent using compute_send_data(data), while the others are again waiting at the barrier.
Looks lame, doesn't it?
Processing exceptions is not always trivial and may take a lot of time. Consider the situation where an exception must be serialized and sent over the network. This may take milliseconds and a few thousand lines of code. The moment an exception is caught is not always the best time and place to process it.
So, can we store exceptions and delay their processing?
When writing some server applications (especially for Linux OS), catching and processing signals is required. Usually, all the signal handlers are set up at server start and do not change during the application's execution.
The goal of this recipe is to make our tasks_processor class capable of processing signals.
This is a pretty common task. We have two non-Unicode or ANSI character strings:
#include <string>
std::string str1 = "Thanks for reading me!";
std::string str2 = "Thanks for reading ME!";
We need to compare them in a case-insensitive manner. There are a lot of methods to do that; let's take a look at Boost's.
Let's do something useful! It's common that the user's input must be checked using a regular expression pattern, which provides a flexible means of matching. The problem is that there are a lot of regex syntaxes; expressions written using one syntax are not handled well by engines expecting another syntax. Another problem is that long regexes are not easy to write.
So in this recipe, we'll write a program that may use different types of regular expression syntaxes and checks that the input strings match the specified regexes.
My wife enjoyed the Matching strings using regular expressions recipe very much and told me that I'll get no food until I improve it to be able to replace parts of the input string according to a regex match. Each matched subexpression (part of the regex in parentheses) must get a unique number starting from 1; this number will be used to create a new string.
This is how the updated program will work:
Available regex syntaxes:
[0] Perl
[1] Perl case insensitive
[2] POSIX extended
[3] POSIX extended case insensitive
[4] POSIX basic
[5] POSIX basic case insensitive
Choose regex syntax: 0
Input regex: (\d)(\d)
String to match: 00
MATCH: 0, 0,
Replace pattern: \1#\2
RESULT: 0#0
String to match: 42
MATCH: 4, 2,
Replace pattern: ###\1-\1-\2-\1-\1###
RESULT: ###4-4-2-4-4###
The printf family of functions is a threat to security. It is a very bad design to allow users to put their own strings as the type and format specifiers. So what do we do when a user-defined format is required? How shall we implement the std::string to_string(const std::string& format_specifier) const; member function of the following class?
class i_hold_some_internals {
    int i;
    std::string s;
    char c;
    // ...
};
Situations where we need to erase something in a string, replace a part of the string, or erase the first or last occurrence of some substring are very common. STL allows us to do most of this, but it usually involves writing too much code.
We saw the Boost.StringAlgorithm library in action in the Changing cases and case-insensitive comparison recipe. Let's see how it can be used to simplify our lives when we need to modify some strings:
#include <string>
const std::string str = "Hello, hello, dear Reader.";
There are situations when we need to split some strings into substrings and do something with those substrings, for example, count the whitespaces and characters in each sentence; and, of course, we want to use Boost and be as efficient as possible.
This recipe is the most important recipe in this chapter! Let's take a look at a very common case, where we write a function that accepts a string and returns the part of the string between the character values passed in the starts and ends arguments:
#include <string>
#include <algorithm>
std::string between_str(const std::string& input, char starts, char ends) {
    std::string::const_iterator pos_beg
        = std::find(input.begin(), input.end(), starts);
    if (pos_beg == input.end()) {
        return std::string(); // Empty
    }
    ++pos_beg;
    std::string::const_iterator pos_end
        = std::find(pos_beg, input.end(), ends); // Search after `starts`
    return std::string(pos_beg, pos_end);
}
Do you like this implementation? In my opinion, it looks awful; consider the following call to it:
between_str("Getting expression (between brackets)", '(', ')');
In that call, a temporary std::string variable will be constructed from "Getting expression (between brackets)". The character array is long enough, so there is a big chance that dynamic memory allocation will be called inside the std::string constructor and the character array will be copied into it. Then, somewhere inside the between_str function, a new std::string will be constructed, which may also lead to another dynamic memory allocation and result in copying.
So, this simple function may, and in most cases will:
* Call dynamic memory allocation (twice)
* Copy string (twice)
* Deallocate memory (twice)
Can we do better?
There are situations when it would be great to work with all the template parameters as if they were in a container. Imagine that we are writing something such as Boost.Variant:
#include <boost/mpl/aux_/na.hpp> // boost::mpl::na == n.a. == not available
template <
    class T0 = boost::mpl::na,
    class T1 = boost::mpl::na,
    class T2 = boost::mpl::na,
    class T3 = boost::mpl::na,
    class T4 = boost::mpl::na,
    class T5 = boost::mpl::na,
    class T6 = boost::mpl::na,
    class T7 = boost::mpl::na,
    class T8 = boost::mpl::na,
    class T9 = boost::mpl::na
>
struct variant;
And the preceding code is where all the following interesting tasks start to happen:
* How can we remove constant and volatile qualifiers from all the types?
* How can we remove duplicate types?
* How can we get the sizes of all the types?
* How can we get the maximum size of the input parameters?
All these tasks can be easily solved using Boost.MPL.
The task of this recipe will be to modify the content of one boost::mpl::vector depending on the content of a second boost::mpl::vector. We'll call the second vector the vector of modifiers, and each of those modifiers can have one of the following types:
// Make unsigned
struct unsigne; // No typo: 'unsigned' is a keyword, we cannot use it.
// Make constant
struct constant;
// Otherwise we do not change type
struct no_change;
So where shall we start?
Many features were added to C++11 to simplify metaprogramming. One such feature is the alternative function syntax. It allows deducing the resulting type of a template function. Here is an example:
template <class T1, class T2>
auto my_function_cpp11(const T1& v1, const T2& v2)
    -> decltype(v1 + v2)
{
    return v1 + v2;
}
It allows us to write generic functions more easily and work in difficult situations:
#include <cassert>

struct s1 {};
struct s2 {};
struct s3 {};

inline s3 operator + (const s1& /*v1*/, const s2& /*v2*/) {
    return s3();
}

inline s3 operator + (const s2& /*v1*/, const s1& /*v2*/) {
    return s3();
}

int main() {
    s1 v1;
    s2 v2;
    my_function_cpp11(v1, v2);
    my_function_cpp11(v2, v1); // Both operand orders compile
    assert(my_function_cpp11('\0', 1) == 1);
}
But Boost has a lot of functions like these, and it does not require C++11 to work. How is that possible, and how can we make a C++03 version of the my_function_cpp11 function?
Functions that accept other functions as an input parameter or functions that return other functions are called higher-order functions. For example, the following functions are higher-order:
function_t higher_order_function1();
void higher_order_function2(function_t f);
function_t higher_order_function3(function_t f);
We have already seen higher-order metafunctions in the recipes Using type "vector of types" and Manipulating a vector of types from this chapter, where we used boost::transform.
In this recipe, we'll try to make our own higher-order metafunction named coalesce, which accepts two types and two metafunctions. The coalesce metafunction applies the first type-parameter to the first metafunction and compares the resulting type with the boost::mpl::false_ type. If the resulting type is boost::mpl::false_, it returns the result of applying the second type-parameter to the second metafunction; otherwise, it returns the first result type:
template <class Param1, class Param2, class Func1, class Func2>
struct coalesce;
Lazy evaluation means that the function won't be called until we really need its result. Knowledge of this recipe is highly recommended for writing good metafunctions. The importance of lazy evaluation will be shown in the following example.
Imagine that we are writing a metafunction that accepts a function, a parameter, and a condition. The resulting type of that metafunction must be the fallback type if the condition is false; otherwise, it must be the result of applying the function to the parameter:
struct fallback;
template <
    class Func,
    class Param,
    class Cond,
    class Fallback = fallback>
struct apply_if;
And the preceding code is the place where we cannot live without lazy evaluation.
This recipe and the next one are devoted to a mix of compile time and runtime features. We'll be using the Boost.Fusion library to see what it can do.
Remember that we were talking about tuples and arrays in the first chapter. Now we want to write a single function that can stream elements of tuples and arrays to strings.
This recipe will show a tiny piece of the Boost.Fusion library's abilities. We'll be splitting a single tuple into two tuples, one with arithmetic types and the other with all the other types.
It is a common task to manipulate strings. Here we'll see how the operation of string comparison can be done quickly using some simple tricks. This recipe is a trampoline for the next one, where the techniques described here will be used to achieve constant time-complexity searches.
So, we need to make a class that is capable of quickly comparing strings for equality.
In the previous recipe, we saw how string comparison can be optimized using hashing. After reading it, the following question may arise, "Can we make a container that will cache hashed values to use faster comparison?".
The answer is yes, and we can do much more. We can achieve almost constant time complexities for search, insertion, and removal of elements.
A few times a year, we need something that can store and index a pair of values. Moreover, we need to get the first part of the pair using the second, and get the second part using the first. Confused? Let me show you an example. We are creating a vocabulary class; when the users put values into it, the class must return identifiers, and when the users put identifiers into it, the class must return values.
To be more practical, users will be entering login names into our vocabulary, and wish to get the unique identifier of a person. They will also wish to get all the persons' names using identifiers.
Let's see how it can be implemented using Boost.
In the previous recipe, we made some kind of vocabulary, which is good when we need to work with pairs. But, what if we need much more advanced indexing? Let's make a program that indexes persons:
struct person {
    std::size_t id_;
    std::string name_;
    unsigned int height_;
    unsigned int weight_;

    person(std::size_t id, const std::string& name,
           unsigned int height, unsigned int weight)
        : id_(id)
        , name_(name)
        , height_(height)
        , weight_(weight)
    {}
};

inline bool operator < (const person& p1, const person& p2) {
    return p1.name_ < p2.name_;
}
We will need a lot of indexes; for example, by name, ID, height, and weight.
Nowadays, we usually use std::vector when we need nonassociative and nonordered containers. This is recommended by Andrei Alexandrescu and Herb Sutter in the book C++ Coding Standards, and even those users who did not read the book usually use std::vector. Why? Well, std::list is slower and uses many more resources than std::vector. The std::deque container is very close to std::vector, but stores its values non-contiguously.

std::vector is good! But if we need a container that does not invalidate iterators on erase and insert, we are forced to choose the slower std::list.
But wait, there is a good solution in Boost for such cases! Let's see how it can be done using Boost.
After reading the previous recipe, some of the readers may start using fast pool allocators everywhere, especially for std::set and std::map. Well, I'm not going to stop you from doing that, but let's at least take a look at an alternative: flat associative containers. These containers are implemented on top of the traditional vector container and store the values ordered.
Some compilers have support for extended arithmetic types such as 128-bit floats or integers. Let's take a quick glance at how to use them using Boost. We'll be creating a method that accepts three parameters and returns the multiplied value of those parameters.
Some companies and libraries have specific requirements for their C++ code, such as successful compilation without runtime type information (RTTI). In this small recipe, we'll take a look at how we can detect disabled RTTI, how to store information about types, and how to compare types at runtime, even without typeid.
Remember some situations where you were using some complicated template class declared in a header file? Examples of such classes would be boost::variant, containers from Boost.Container, or Boost.Spirit parsers. When we use such classes or methods, they are usually compiled (instantiated) separately in each source file that uses them, and duplicates are thrown away during linking. On some compilers, that may lead to slow compilation speed.
If only there was some way to tell the compiler in which source file to instantiate it!
Chapter 4, Compile-time Tricks, and Chapter 8, Metaprogramming, were devoted to metaprogramming. If you were trying to use techniques from those chapters, you may have noticed that writing a metafunction can take a lot of time. So it may be a good idea to experiment with metafunctions using more user-friendly methods, such as C++11 constexpr, before writing a portable implementation.
In this recipe, we'll take a look at how to detect constexpr support.
C++11 has very specific logic when user-defined types (UDTs) are used in STL containers. Containers will use move assignment and move construction only if the move constructor does not throw exceptions or there is no copy constructor.
Let's see how we can ensure the move_nothrow assignment operator and move_nothrow constructor of our type do not throw exceptions.
Almost all modern languages have the ability to make libraries, which are collections of classes and methods that have a well-defined interface. C++ is no exception to this rule. We have two types of libraries: runtime (also called shared or dynamically loaded) and static. But writing libraries is not a trivial task in C++. Different platforms have different methods for describing which symbols must be exported from the shared library.
Let's have a look at how to manage symbol visibility in a portable way using Boost.
Boost is being actively developed, so each release contains new features and libraries. Some people wish to have libraries that compile for different versions of Boost and also want to use some of the features of the new versions.
Let's take a look at the boost::lexical_cast change log. According to it, Boost 1.53 has a lexical_cast(const CharType* chars, std::size_t count) function overload.
Our task for this recipe will be to use that function overload for new versions of Boost, and work around the missing function overload for older versions.
There are STL functions and classes to read and write data to files. But there are no functions to list files in a directory, to get the type of a file, or to get access rights for a file.
Let's see how such iniquities can be fixed using Boost. We'll be creating a program that lists names, write accesses, and types of files in the current directory.
Let's consider the following lines of code:
std::ofstream ofs("dir/subdir/file.txt");
ofs << "Boost.Filesystem is fun!";
In these lines, we attempt to write something to file.txt in the dir/subdir directory. This attempt will fail if there is no such directory. The ability to work with filesystems is necessary for writing good working code.
In this recipe, we'll construct a directory and a subdirectory, write some data to a file, and try to create a symlink; if the symbolic link's creation fails, we'll erase the created file. We will also avoid using exceptions as a mechanism of error reporting, preferring some form of return codes.
Let's see how that can be done in an elegant way using Boost.
Sometimes we write programs that will communicate with each other a lot. When programs are run on different machines, using sockets is the most common technique for communication. But if multiple processes run on a single machine, we can do much better!
Let's take a look at how to make a single memory fragment available from different processes using the Boost.Interprocess library.
In the previous recipe, we saw how to create shared memory and how to place some objects in it. Now it's time to do something useful. Let's take an example from the Creating a work_queue class recipe, and make it work for multiple processes. At the end of this example, we'll get a class that can store different tasks and pass them between processes.
It is hard to imagine writing some C++ core classes without pointers. Pointers and references are everywhere in C++, and they do not work in shared memory! So if we have a structure like this in shared memory and assign the address of some integer variable in shared memory to pointer_, we won't get the correct address in the other process that attempts to use pointer_ from that instance of with_pointer:
struct with_pointer {
    int* pointer_;
    // ...
    int value_holder_;
};
How can we fix that?
All around the Internet, people are asking "What is the fastest way to read files?". Let's make our task for this recipe even harder: "What is the fastest and most portable way to read binary files?"
Nowadays, plenty of embedded devices still have only a single core. Developers write for those devices, trying to squeeze maximum performance out of them. Using Boost.Thread or some other thread library for such devices is not effective; the OS will be forced to schedule threads for execution, manage resources, and so on, as the hardware cannot run them in parallel.
So how can we make a program switch to the execution of a subprogram while waiting for some resource in the main part?
Some tasks require a graphical representation of data. Boost.Graph is a library that was designed to provide a flexible way of constructing and representing graphs in memory. It also contains a lot of algorithms to work with graphs, such as topological sort, breadth-first search, depth-first search, and Dijkstra's shortest paths.
Well, let's perform some basic tasks with Boost.Graph!
Making programs that manipulate graphs was never easy because of issues with visualization. When we work with STL containers such as std::map and std::vector, we can always print the container's contents and see what is going on inside. But when we work with complex graphs, it is hard to visualize the content in a clear way: too many vertices and too many edges.
In this recipe, we'll take a look at the visualization of Boost.Graph using the Graphviz tool.
I know of many examples of commercial products that use incorrect methods for getting random numbers. It's a shame that some companies still use rand() in cryptography and banking software.
Let's see how to get a fully random uniform distribution using Boost.Random that is suitable for banking software.
Some projects require specific trigonometric functions, a library for numerically solving ordinary differential equations, and working with distributions and constants. All of those parts of Boost.Math would be hard to fit into even a separate book. A single recipe definitely won't be enough. So let's focus on very basic everyday-use functions to work with float types.
We'll write a portable function that checks an input value for infinity and not-a-number (NaN) values and changes the sign if the value is negative.
This recipe and the next one are devoted to auto-testing with the Boost.Test library, which is used by many Boost libraries. Let's get hands-on with it and write some tests for our own class.
Writing auto tests is good for your project. But managing test cases is hard when the project is large and many developers are working on it. In this recipe, we'll take a look at how to run individual tests and how to combine multiple test cases in a single module.
Let's pretend that two developers are testing the foo structure declared in the foo.hpp header, and we wish to give them separate source files to write their tests in. That way, the developers won't bother each other and can work in parallel. However, the default test run must execute the tests of both developers.
I've left you something really tasty for dessert – Boost's Generic Image Library (GIL), which allows you to manipulate images and not care much about image formats.
Let's do something simple and interesting with it; let's make a program that negates any picture.
In the Book you'll also find: in-depth description, performance notes, comparison with C++11/C++14 and other juicy stuff.
About
Hi, I'm Antony Polukhin and I'm the author of Boost.TypeIndex and Boost.DLL, maintainer of multiple Boost libraries, and a mentor in the Google Summer of Code programs.
All the examples at this site are additionally explained in the Boost Application Development Cookbook, along with C++11 and performance notes. All the source codes are available on GitHub.
I hope that this site will be useful for you. Compile Boost-related examples online, modify the source codes, run tests, and play around with Boost libraries.
Thanks
- All the people who participate in Boost C++ Libraries development. Without them, there would be no Boost, no book, and no hobby.
- The wonderful people from Coliru, who allow users to do online compilations.
- Heather Gopsill and Lakhi Dhatt from Packt Publishing for giving permission to make the sources and recipes' introductions publicly available.