In this section, we are going to implement some functions that simulate computation-intensive tasks that depend on each other, and let them run with as much parallelism as possible:
- Let's first include all the necessary headers:
#include <iostream>
#include <iomanip>
#include <chrono>
#include <thread>
#include <mutex>
#include <string>
#include <sstream>
#include <future>
using namespace std;
using namespace chrono_literals;
- We need to synchronize concurrent access to cout, so let's use the synchronization helper from the other recipe in this chapter:
struct pcout : public stringstream {
    static inline mutex cout_mutex;
    ~pcout() {
        lock_guard<mutex> l {cout_mutex};
        cout << rdbuf();
        cout.flush();
    }
};
- Now let's implement three functions which transform strings. The first function shall create an std::string object from a C-string. We let it sleep for 3 seconds to simulate that string creation is computation-heavy:
static string create(const char *s)
{
    pcout{} << "3s CREATE " << quoted(s) << '\n';
    this_thread::sleep_for(3s);
    return {s};
}
- The next function accepts two string objects as arguments and returns their concatenation. We give it a 5-second wait time to simulate that this is a time-consuming task:
static string concat(const string &a, const string &b)
{
    pcout{} << "5s CONCAT "
            << quoted(a) << " "
            << quoted(b) << '\n';
    this_thread::sleep_for(5s);
    return a + b;
}
- The last computation-heavy function accepts a string and concatenates it with itself. It shall take 3 seconds to do this:
static string twice(const string &s)
{
    pcout{} << "3s TWICE " << quoted(s) << '\n';
    this_thread::sleep_for(3s);
    return s + s;
}
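Used serially, the three functions compose like ordinary function calls. The following is only a minimal sketch of that serial baseline, not part of the recipe: the printing is left out, the delays are scaled down so that one "second" becomes 10 ms (an assumption made here to keep the sketch quick to run), and the names unit and serial_result are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <thread>

using namespace std::chrono_literals;

// Assumption: one "second" of the recipe is scaled down to 10 ms here.
static constexpr auto unit = 10ms;

static std::string create(const char *s)
{
    std::this_thread::sleep_for(3 * unit);
    return {s};
}

static std::string concat(const std::string &a, const std::string &b)
{
    std::this_thread::sleep_for(5 * unit);
    return a + b;
}

static std::string twice(const std::string &s)
{
    std::this_thread::sleep_for(3 * unit);
    return s + s;
}

// Hypothetical helper: every call blocks until it is finished, so the
// delays simply add up: 4 * 3 + 3 * 5 + 1 * 3 = 30 units.
static std::string serial_result()
{
    return concat(
        twice(concat(create("foo "), create("bar "))),
        concat(create("this "), create("that ")));
}
```

Calling serial_result() yields the same string as the parallel version built in the following steps, but it has to wait for the sum of all delays instead of only the longest dependency chain.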
- We could already use these functions in a serial program, but we want some elegant automatic parallelization, so let's implement some helpers for this. Be warned: the following three functions look really complicated. asynchronize accepts a function f and returns a callable object that captures it. We can call this callable object with any number of arguments, and it will capture them together with f in another callable object, which it returns to us. This last callable object can be called without arguments. It then calls f asynchronously with all the arguments it captured:
template <typename F>
static auto asynchronize(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return async(launch::async, f, xs...);
        };
    };
}
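The three stages of asynchronize are easier to follow when each one gets its own variable. The following is a minimal sketch under that assumption; shout and asynchronize_demo are hypothetical names that only serve as an illustration and are not part of the recipe.

```cpp
#include <cassert>
#include <future>
#include <string>

template <typename F>
static auto asynchronize(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return std::async(std::launch::async, f, xs...);
        };
    };
}

// Hypothetical stand-in for a computation-heavy function.
static std::string shout(const char *s)
{
    return std::string{s} + "!";
}

static std::string asynchronize_demo()
{
    auto pshout   = asynchronize(shout);  // stage 1: capture the function
    auto deferred = pshout("hello");      // stage 2: capture the arguments, nothing runs yet
    std::future<std::string> fut = deferred();  // stage 3: launch shout("hello") via std::async
    assert(fut.valid());
    return fut.get();                     // block until the result is available
}
```

Nothing is executed until the callable from stage 2 is invoked; only then does std::async start the task and hand back a future.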
- The next function will be used by the helper we define in the step after this one. It accepts a function f and captures it in a callable object that it returns. That object can be called with any number of future objects. It then calls .get() on all the futures, applies f to the resulting values, and returns the result:
template <typename F>
static auto fut_unwrap(F f)
{
    return [f](auto ... xs) {
        return f(xs.get()...);
    };
}
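Used on its own, fut_unwrap turns a function of plain values into one that takes futures. The sketch below assumes two futures produced directly with std::async; glue and fut_unwrap_demo are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <utility>

template <typename F>
static auto fut_unwrap(F f)
{
    return [f](auto ... xs) {
        return f(xs.get()...);
    };
}

// Hypothetical plain function that knows nothing about futures.
static std::string glue(const std::string &a, const std::string &b)
{
    return a + b;
}

static std::string fut_unwrap_demo()
{
    std::future<std::string> a =
        std::async(std::launch::async, [] { return std::string{"foo "}; });
    std::future<std::string> b =
        std::async(std::launch::async, [] { return std::string{"bar "}; });
    assert(a.valid() && b.valid());

    auto glue_futures = fut_unwrap(glue);

    // std::future is move-only, so the futures are moved into the by-value
    // parameters. The call blocks in .get() until both values are ready.
    return glue_futures(std::move(a), std::move(b));
}
```

Note that the unwrapping itself is synchronous: the call blocks on .get(). It only becomes asynchronous once it is wrapped in another std::async call, which is what the next helper does.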
- The last helper function also accepts a function f and returns a callable object that captures it. That callable object can be called with any number of callable objects as arguments, which it captures together with f in another callable object that it returns. That final callable object can then be called without arguments. It calls all the callable objects captured in the xs... pack. These return futures, which need to be unwrapped with fut_unwrap. The future-unwrapping and the actual application of the real function f to the real values from the futures again happen asynchronously using std::async:
template <typename F>
static auto async_adapter(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return async(launch::async,
                         fut_unwrap(f), xs()...);
        };
    };
}
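To see how the three helpers fit together before applying them to the string functions, here is a small sketch that uses integers instead. The functions value and add and the wrapper async_adapter_demo are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <future>

template <typename F>
static auto asynchronize(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return std::async(std::launch::async, f, xs...);
        };
    };
}

template <typename F>
static auto fut_unwrap(F f)
{
    return [f](auto ... xs) {
        return f(xs.get()...);
    };
}

template <typename F>
static auto async_adapter(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return std::async(std::launch::async,
                              fut_unwrap(f), xs()...);
        };
    };
}

// Hypothetical leaf and inner-node functions of a tiny dependency tree.
static int value(int x) { return x; }
static int add(int a, int b) { return a + b; }

static int async_adapter_demo()
{
    auto pvalue = asynchronize(value);   // leaves take real values
    auto padd   = async_adapter(add);    // inner nodes take deferred tasks

    // Only describes the tree (1 + 2) + 3; no task has been started yet.
    auto tree = padd(padd(pvalue(1), pvalue(2)), pvalue(3));

    std::future<int> fut = tree();       // launches every task in the tree
    assert(fut.valid());
    return fut.get();                    // waits for the root task
}
```

Calling tree() walks the whole tree once and starts one asynchronous task per node. Each inner task then waits only for the futures of its own children.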
- OK, that was kind of a crazy ride, slightly reminiscent of the movie "Inception" because of the lambda expressions that return lambda expressions. We will have a very detailed look at this voodoo code later. Now let's take the functions create, concat, and twice and make them asynchronous. The function async_adapter makes a completely normal function wait for future arguments and return a future result. It is a kind of translating wrapper from the synchronous to the asynchronous world. We apply it to concat and twice. We must use asynchronize on create because it shall return a future, but we will feed it with real values instead of futures. The task dependency chain must begin with create calls:
int main()
{
    auto pcreate (asynchronize(create));
    auto pconcat (async_adapter(concat));
    auto ptwice  (async_adapter(twice));
- Now we have automatically parallelizing functions that have the same names as the original synchronous ones, but with a p prefix. Let's now set up a complex example dependency tree. First, we create the strings "foo " and "bar ", which we immediately concatenate to "foo bar ". This string is then concatenated with itself using twice. Then we create the strings "this " and "that ", which we concatenate to "this that ". Finally, we concatenate the results to "foo bar foo bar this that ". The resulting callable object is saved in the variable result. We then call result().get() in order to start the computation, wait for its return value, and print it. No computation is done before we call result(); after this call, all the magic starts:
    auto result (
        pconcat(
            ptwice(
                pconcat(
                    pcreate("foo "),
                    pcreate("bar "))),
            pconcat(
                pcreate("this "),
                pcreate("that "))));
    cout << "Setup done. Nothing executed yet.\n";
    cout << result().get() << '\n';
}
- Compiling and running the program shows that all the create calls are performed at the same time, and then the other calls are performed. It looks as if they were scheduled intelligently. The whole program runs for 16 seconds, which is the length of the longest dependency chain (3 + 5 + 3 + 5). If the steps were not performed in parallel, it would take 30 seconds to complete. Note that our functions only sleep, so they hardly occupy the CPU. With real computation-heavy work, we would need a system with at least four CPU cores to be able to perform all create calls at the same time. If the system had fewer CPU cores, then some calls would have to share CPUs, which would of course consume more time. The order of the lines printed by tasks that run at the same time may differ from run to run:
$ ./chains
Setup done. Nothing executed yet.
3s CREATE "foo "
3s CREATE "bar "
3s CREATE "this "
3s CREATE "that "
5s CONCAT "this " "that "
5s CONCAT "foo " "bar "
3s TWICE "foo bar "
5s CONCAT "foo bar foo bar " "this that "
foo bar foo bar this that
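The 16-second figure can be checked by timing the call that starts the computation. The following is a minimal sketch of such a measurement, not the recipe's program: the printing is left out, the delays are scaled down so that one "second" becomes 20 ms (an assumption made here to keep the run short), and unit, timed_result, and run_timed are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>

using namespace std::chrono_literals;

// Assumption: one "second" of the recipe is scaled down to 20 ms here.
static constexpr auto unit = 20ms;

static std::string create(const char *s)
{ std::this_thread::sleep_for(3 * unit); return {s}; }

static std::string concat(const std::string &a, const std::string &b)
{ std::this_thread::sleep_for(5 * unit); return a + b; }

static std::string twice(const std::string &s)
{ std::this_thread::sleep_for(3 * unit); return s + s; }

template <typename F>
static auto asynchronize(F f)
{
    return [f](auto ... xs) {
        return [=] () { return std::async(std::launch::async, f, xs...); };
    };
}

template <typename F>
static auto fut_unwrap(F f)
{
    return [f](auto ... xs) { return f(xs.get()...); };
}

template <typename F>
static auto async_adapter(F f)
{
    return [f](auto ... xs) {
        return [=] () {
            return std::async(std::launch::async, fut_unwrap(f), xs()...);
        };
    };
}

// Hypothetical result type: the final string plus the measured wall time.
struct timed_result {
    std::string text;
    std::chrono::milliseconds elapsed;
};

static timed_result run_timed()
{
    auto pcreate = asynchronize(create);
    auto pconcat = async_adapter(concat);
    auto ptwice  = async_adapter(twice);

    auto result = pconcat(
        ptwice(pconcat(pcreate("foo "), pcreate("bar "))),
        pconcat(pcreate("this "), pcreate("that ")));

    // Only the call that starts the tasks and waits for them is timed.
    const auto start = std::chrono::steady_clock::now();
    std::string text = result().get();
    const auto stop  = std::chrono::steady_clock::now();

    return {std::move(text),
            std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)};
}
```

The elapsed time cannot be shorter than the longest dependency chain, which is 3 + 5 + 3 + 5 = 16 units. How close a run gets to that bound depends on the machine and its load, but it should stay well below the 30 units that a serial run needs.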