© Ivor Horton and Peter Van Weert 2018
Ivor Horton and Peter Van WeertBeginning C++17https://doi.org/10.1007/978-1-4842-3366-5_19

19. Containers and Algorithms

Ivor Horton1  and Peter Van Weert2
(1)
Stratford-upon-Avon, Warwickshire, UK
(2)
Kessel-Lo, Belgium
 

The Standard Library offers an immense number of types and functions, and this number only increases with the release of every new version of the C++ standard. We could not possibly introduce the full range and scope of the Standard Library here in this book. For a contemporary overview of all the possibilities, details, and intricacies of this vast and growing library, we highly recommend C++ Standard Library Quick Reference by Peter Van Weert and Marc Gregoire. Good online references exist as well, but these make it harder to quickly get a feeling of every single feature that the Standard Library has to offer.

Nevertheless, no introduction of C++ would be complete without at least a brief discussion of containers and algorithms, as well as the glue that binds them: iterators. Rarely is a program written in C++ without either of these concepts. However, even for these aspects of the Standard Library alone, a gradual yet in-depth coverage would take up an entire book. Using the C++ Standard Template Libraries by Ivor Horton is precisely such a book. It provides a much wider and deeper coverage of containers and algorithms than we could possibly offer you in this one chapter. As a companion to this book, it does this in the same style as you’ve grown accustomed to, meaning a gentle, informal tutorial with many hands-on code samples.

The goal of this chapter is thus to give you a quick, high-level overview of what the containers and algorithms libraries of the Standard Library have to offer. We’ll focus on the underlying principles and ideas, as well as standard usage idioms, rather than listing and showcasing every individual function and capability. Our goal is not to provide you with an exhaustive reference; for that we refer you to the aforementioned reference works. Instead, we aim to sufficiently arm you to be able to read, understand, and effectively browse such references. For that, you need to have at least a global idea of what functionality is out there, how to choose between the various options, and what the common pitfalls are in their use.

In this chapter, you will learn:
  • What other containers the Standard Library has to offer (beyond std::vector<> and std::array<>, that is)

  • What the differences are between all container types, their advantages and disadvantages, and how to choose between them

  • How to traverse the elements of any container using iterators

  • What Standard Library algorithms are and how to use them effectively

Containers

We already introduced two of the most commonly used containers in Chapter 5: std::array<T,N> and std::vector<T>. Of course, not all containers store their elements in an array. There are countless other ways to arrange your data, each tailored to make one or the other operation more efficient. Arrays are great for linearly traversing elements, but they do not necessarily lend themselves to quickly finding a particular element. If you’re looking for the proverbial needle in a haystack, linearly traversing all stalks may not be the fastest way to go about it. If you organize your data such that all needles automatically group themselves in a common region, retrieving needles becomes much easier.

Sequence Containers

A sequence container is a container that stores its elements sequentially, in some linear arrangement, one element after the other. The order in which the elements are stored is determined entirely by the user of the container. Both std::array<> and std::vector<> are sequence containers, but the Standard Library defines three more such container types. The Standard Library thus offers five sequence containers, each backed by a different data structure. In this section, we’ll briefly introduce each of them in turn.

Arrays

Both std::array<T,N> and std::vector<T> are backed by a single built-in array of T elements, precisely the kind you learned to use in Chapter 5. The advantage of using the containers is that they make it easier to use these arrays and near impossible to misuse.

Surely you’ll recall that std::array<> is backed by a statically sized array, and std::vector<> is backed by a dynamic array that it allocates for you in the free store. For an array<>, you therefore always need to specify the number of elements at compile time, which somewhat limits the possible use cases of this container. A vector<>, on the other hand, is capable of dynamically growing its array as you add more and more elements. It does this in much the same way as you did inside the push_back() function of our Array<> template in Chapter 17—only using significantly more efficient techniques.

You know both these sequence containers already quite well from Chapter 5, though; there’s little more we want to add to this at this point. We will have more to say about std::vector<> later in this chapter. We will show how you can add and delete elements in the middle of the sequence, rather than only at the end. We can only do so, though, after we have first introduced iterators.

Lists

Out of all data structures that are used instead of an array, the simplest undoubtedly is the linked list. You already encountered linked lists at least twice in this book, first in Chapter 11 while working on the Truckload class and again while implementing a Stack<> template in Chapter 16. If you recall, a Truckload was essentially a container of Box objects. Internally, it stored each individual Box inside an object of a nested class called Package. Next to a Box, each Package object contained a pointer to the next Package in the list, thus linking together a long chain of tiny Package objects. We refer to Chapter 11 for more details. The Stack<> class of Chapter 16 was essentially analogous, except that it used the more generic term Node instead of Package as the name for its nested class.

The Standard Library forward_list and list headers offer two container types that are implemented in a similar manner:
  • std::forward_list<T> stores T values in a so-called singly linked list. The term singly-linked refers to the fact that each node in the linked list has only a single link to another node—the next one in the list. This data structure is completely analogous to that of your Truckload and Stack<> types. Looking back, the Truckload class could simply have used a std::forward_list<std::shared_ptr<Box>>, and creating the Stack<T> template would’ve been much easier with a plain std::forward_list<T>.

  • std::list<T> stores T values in a doubly linked list, where each node not only has a pointer to the next node in the list but has one to the previous node as well. For Exercise 11-7 you created a Truckload class backed by an analogous data structure.

In theory, the key advantage of these linked list containers compared to the other sequence containers is that they facilitate insertions and removals of elements in the middle of the sequence. If you insert an element in the middle of a vector<>—as said, we’ll show you later how to do this—the vector<> clearly first has to move all elements beyond the point of insertion one position to the right. Moreover, if the allocated array is not large enough to hold any more elements, a new larger array will have to be allocated and all elements moved into that one. With a linked list, on the other hand, inserting an element in the middle involves virtually no overhead at all. All you need to do is create a new node and rewire some next and/or previous pointers .

The key disadvantage of linked lists, however, is that they lack the so-called random access capability. Both array<> and vector<> are called random-access containers because you can instantly jump to any element with a given index. This capability is exposed through the array subscript operator, operator[], of both array<> and vector<> (and their at() function as well, of course). With a linked list, however, you cannot access any element in the list without first traversing an entire chain of nodes containing other elements (for a forward_list<>, you always have to start at the first node of the list—the so-called head; for a list<>, you can start at either end). Being able to efficiently insert in the middle of a list means little if you first have to linearly traverse half of the list to get there!

Another disadvantage is that linked lists typically exhibit particularly poor memory locality. The nodes of a list tend to become scattered around in free store memory, making it much harder for a computer to quickly fetch all elements one by one. Linearly traversing a linked list is therefore much, much slower than linearly traversing an array<> or vector<>.
The combination of these disadvantages makes for this observation:

Tip

While they make for great practice in programming with pointers and dynamic memory, the need to use linked lists in production code occurs only very rarely. A vector<> is nearly always the better choice. Even when elements need to be inserted in or removed from the middle at times, a vector<> normally remains the more efficient choice (today’s computers are good at moving big blocks of memory around!).

In all our years of C++ programming , we have never used linked lists in practice. We will therefore not discuss them any further here.

Deque
The fifth and final sequence container is called std::deque<T>. The term deque is short for double-ended queue and is pronounced /dɛk/, precisely like the word deck (as in a deck of cards). It is somewhat of a hybrid data structure with the following advantages:
  • Just like array<> and vector<>, deque<> is a random access container, meaning it has constant-time operator[] and at() operations.

  • Just like a list<>, a deque<> allows you to add elements in constant time both at the front and the back of the sequence. A vector<> only supports constant-time additions to the back of the sequence (inserting in the front at least requires all other elements to be moved one position to the right).

  • Unlike a vector<>, the elements of a deque<> are never moved to another bigger array when adding to or removing from either the front or the back of the sequence. This means that T* pointers to elements stored inside the container remain valid (provided you do not insert into or remove from the middle of the sequence using the functions explained later in this chapter, that is).

In our experience, the latter advantage is mainly what makes a deque<> useful at times in more complex scenarios where other data structures store pointers to data that is stored inside a deque<> (the need to insert at both ends of a sequence occurs surprisingly seldom). For basic use, and this accounts for well over 95 percent of the cases, your go-to sequence container should thus remain std::vector<>.

The following example shows basic use of a deque<>:

// Ex19_01.cpp - Working with std::deque<>
#include <iostream>
#include <deque>
int main()
{
  std::deque<int> my_deque;        // A deque<> allows efficient insertions to
  my_deque.push_back(2);           // both ends of the sequence
  my_deque.push_back(4);
  my_deque.push_front(1);
  my_deque[2] = 3;                 // A deque<> is a random-access sequence container
  std::cout << "There are " << my_deque.size() << " elements in my_deque: ";
  for (int element : my_deque)     // A deque<>, like all containers, is a range
    std::cout << element << ' ';
  std::cout << std::endl;
}

This code should need no further explanation . The output of this example, of course, is this:

There are 3 elements in my_deque: 1 2 3

Key Operations

All standard containers —not just sequence containers—provide a similar set of functions, with analogous names and behaviors. All containers have empty(), clear(), and swap() functions, and nearly all have a size() function (the only exception is std::forward_list<>!). All containers can be compared using == and !=, and most—including all sequence containers—can be compared by means of <, <=, >, and >= as well. Like we said in the introduction of this chapter, however, it is not our intention to provide you with a detailed and complete reference. For that we already referred you to other sources. What we do want to give you here is a brief overview of the key distinguishing operations of the various sequence containers.

With the exception of the fixed-size std::array<>, you can freely add or remove as many elements as you want to or from the front, the back, or even somewhere in the middle of the sequence. The following table shows some of the most important operations that the five sequence containers—vector<> (V), array<> (A), forward_list<> (F), list<> (L), and deque<> (D)—offer to insert, remove, or access their elements. If a square is filled, the corresponding container supports the operation.

Operation

V

A

L

F

D

Description

push_front()

pop_front()

Adds an element to, or removes one from, the front of the sequence.

push_back()

pop_back()

Adds an element to, or removes one from, the back of the sequence.

insert()

erase()

Inserts or removes one or multiple elements at arbitrary positions. As explained later in this chapter, you indicate the positions at which to insert or remove elements using iterators. (Note: The corresponding members for forward_list<> are called insert_after() and erase_after().)

front()

Returns a reference to the first element in the sequence.

back()

Returns a reference to the last element in the sequence.

operator[]

at()

Returns a reference to the element at a given index .

data()

Returns a pointer to the start of the underlying array. This is useful to pass to legacy functions or C libraries. Note: In older code, you often see the equivalent &myContainer[0].

Stacks and Queues

This section covers three related class templates: std::stack<>, std::queue<>, and std::priority_queue<>. They are called container adapters because they are technically not containers themselves. Instead, they encapsulate one of the five sequential containers (by default either a vector<> or deque<>) and then use that container to implement a specific, very limited set of member functions. For instance, while a stack<> is typically backed by a deque<>, this deque<> is kept strictly private. It will never allow you to add or remove elements from the front of the encapsulated deque<>, nor will it allow you to access any of its elements by index. Container adapters, in other words, employ the data hiding principle of Chapter 11 to force you to use the encapsulated containers only in a very specific way. For these specific yet common use cases of sequential data, using one of the adapters is therefore much safer and less error-prone than directly using the containers themselves.

LIFO vs. FIFO Semantics

A std::stack<T> represents a container with so-called last-in first-out (LIFO) semantics —the last T element that goes in will be the first one to come out. You can compare it to a stack of plates in a self-service restaurant. Plates are added at the top, pushing down other plates. A customer takes a plate from the top, which is the last added plate on the stack . You already created your own Stack<> template earlier in Chapter 16.

A std::queue<> is similar to a stack<> but instead has first-in first-out (FIFO) semantics . You can compare it to a queue at a nightclub. A person who arrived before you will be allowed to enter before you (provided you do not grease the bouncer, that is—though there’s no greasing a queue<>!).

The following example clearly shows the difference between both container adapters:

// Ex19_02.cpp - Working with stacks and queues
#include <iostream>
#include <stack>
#include <queue>
int main()
{
  std::stack<int> stack;
  for (int i {}; i < 10; ++i)
    stack.push(i);
  std::cout << "The elements coming of the top of the stack:     ";
  while (!stack.empty())
  {
    std::cout << stack.top() << ' ';
    stack.pop();                         // pop() is a void function!
  }
  std::cout << std::endl;
  std::queue<int> queue;
  for (int i {}; i < 10; ++i)
    queue.push(i);
  std::cout << "The elements coming from the front of the queue: ";
  while (!queue.empty())
  {
    std::cout << queue.front() << ' ';
    queue.pop();                         // pop() is a void function!
  }
  std::cout << std::endl;
}

The program shows canonical use of both adapters, first of a stack and then of a queue. Even though the same ten elements are added to both in the same order, the output confirms that they will be taken out in opposite orders:

The elements coming of the top of the stack:     9 8 7 6 5 4 3 2 1 0
The elements coming from the front of the queue: 0 1 2 3 4 5 6 7 8 9
There are two things worth noticing about Ex19_02:
  • The pop() function does not return any element. You must therefore typically first access these elements using top() or front(), depending on which adapter you use. (Unsurprisingly, the same holds by the way for all pop_front() and pop_back() members of the various sequence containers.)

  • Beyond those used in the example, there are actually only a few other member functions that std::stack<> and queue<> provide. Both have the conventional size(), empty(), and swap() functions, but that is about it. Like we said, the interface of these adapters is specifically tailored for one specific use and that use alone.

Tip

Container adapters are typically used to manage elements that represent tasks that need to be executed in a setting where executing all tasks at once is unfeasible. If independent or consecutive tasks are scheduled, then a queue<> is often the most natural data structure to use. Tasks are then simply executed in the order in which they are requested. If the tasks represent subtasks of other scheduled or suspended tasks, then you normally want all subtasks to finish first before initiating or resuming their parent tasks. A stack<> is then the easiest approach (note that this is also how C++ executes functions—using a call stack!).

FIFO and LIFO are thus useful for most simple task scheduling applications; for more complex scenarios, priority-based scheduling may be required. This is what std::priority_queue<> provides, which is the container adapter that we’ll briefly introduce next.

Priority Queues

The final container adapter is std::priority_queue<>, defined by the same queue header as queue<>. You can compare a priority queue to how a queue works at a real night club; that is, certain groups of guests will get in before others. Guests that have a higher priority—VIPs, good-looking ladies, even daft nephews of the nightclub’s owner—take precedence over those with lower priority. Another analogy is the queue at your local super market or bus stop, where disabled people and pregnant women can cut the line.

Similar to the other adapters, elements are added to a priority_queue<> through push() and taken out via pop(). To access the next element in the queue, you use top(). The order in which elements exit a priority_queue<T> is determined by a comparison functor. By default, the std::less<T> functor is used (see Chapter 18). You can override this comparison functor with your own. We refer to a Standard Library reference for more details on how to use a priority queue .

Caution

In common speech, it is normally the element with the highest priority that takes precedence. In a priority_queue<>, however, the front (or better yet, the top()) of the queue will by default be the element that compares lowest using <. If you instead want the element with the highest priority to rise to the top() first, you can override the default comparison functor by std::greater<>.

Sets

In C++ Standard Library speak, a set is a container in which each element can appear at most once. Adding a second element equal to any of the elements that is already stored in a set has no effect. It’s easiest to show this with a quick example:

// Ex19_03.cpp - Working with sets
#include <iostream>
#include <set>            // For the std::set<> container template
void printSet(const std::set<int>& my_set);  // Print the contents of a set to std::cout
int main()
{
  std::set<int> my_set;
  // Insert elements 1 through 4 in arbitrary order:
  my_set.insert(1);
  my_set.insert(4);
  my_set.insert(3);
  my_set.insert(3);          // The elements 3 and 1 are added twice
  my_set.insert(1);
  my_set.insert(2);
  printSet(my_set);
  std::cout << "The element 1 occurs " << my_set.count(1) << " time(s)" << std::endl;
  my_set.erase(1);           // Remove the element 1 once
  printSet(my_set);
  my_set.clear();            // Remove all elements
  printSet(my_set);
}
void printSet(const std::set<int>& my_set)
{
  std::cout << "There are " << my_set.size() << " elements in my_set: ";
  for (int element : my_set)           // A set, like all containers, is a range
    std::cout << element << ' ';
  std::cout << std::endl;
}

Executing this code produces the following output:

There are 4 elements in my_set: 1 2 3 4
The element 1 occurs 1 time(s)
There are 3 elements in my_set: 2 3 4
There are 0 elements in my_set:

There are no “push” or “pop” members for set containers. Instead, you always add elements through insert() and remove them again through erase(). You can add any number of elements you like, and in any order you like. As Ex19_03 clearly demonstrates, however, adding the same element a second time indeed has no effect. Even though in the example you add the values 1 and 3 to my_set twice, the output clearly shows that both elements are stored only once in the container.

You’ll often use these set containers to manage or gather a collection of duplicate-free elements. Beyond that, their main advantage is that sets are very good at quickly checking whether they contain a given element or not. That’s much better than, say, a plain unordered vector<>. This is not surprising, given that checking whether an element exists already is precisely what they need to do to weed out all duplicates. To check for yourself whether a given element is contained in a set or not, you either use its count() member or use its find() member. The former is used in Ex19_03 and returns either 0 or 1; the latter returns an iterator. We’ll discuss the find() function some more later, after you’ve had a thorough introduction on iterators.

The Standard Library offers two generic set containers: std::set<> and unordered_set<>. In essence, both set containers provide the same functionality. You could replace the std::set<> in Ex19_03 with a std::unordered_set<>, for instance, and the example will work (nearly) the same. Both containers differ drastically in the way they are implemented. They use very different data structures to organize their data, albeit it both toward the same goal: quickly determining where to insert a new element or, similarly, whether a given element is already present in the container.

Ordered Sets

As the name unordered_set<> already gives away, a regular set<> organizes its elements such that they are always sorted. This is also apparent from Ex19_03. Even though you added the elements 1 through 4 in some arbitrary order, the output clearly corroborates that the container somehow stores them in a sorted order. Searching through a sorted set of data can be done much faster —just imagine how long it would take you to find the definition of capricious if all words in your English dictionary were scrambled in some arbitrary order!

Note

For those who know their data structures, a std::set<> is normally backed by some balanced tree data structure (typically a red-black tree). Beyond the fact that this gives you logarithmic insert(), erase(), find(), and count() operations, there is normally little need for you to know these implementation details.

By default, a set<> orders all elements using the < operator. For fundamental types, this translates to the built-in < operator; by default, elements of a class type T, though, should thus overload the < operator to be stored in a std::set<T>. We say “by default” since beyond overloading the < operator you could also override the way a set<> orders its elements by specifying which functor it should use. If you, for instance, replace the definition of my_set in Ex19_03 with the following, then its elements become sorted from highest to lowest instead. (To print the set you’ll have to update the signature of printSet() accordingly; also, to use std::greater<>, you may have to first include the functional header:)

std::set<int, std::greater<>> my_set;

The second type argument of std::set<T> is optional (by default it equals std::less<T>). You can specify any functor type whose binary function call operator compares two T elements and returns true if the element passed to its first parameter should precede that passed to the second. Typically, a set<> default constructs a functor of the corresponding type, but if need be, you could pass one yourself to one of the set<> constructors as well. Functors, and std::less<> and greater<> in particular, were discussed in detail in the previous chapter.

Caution

To decide whether two elements are duplicates, set<> does not use the == operator. For the default < comparison, two elements x and y are considered equal if !(x < y) && !(y < x).

Unordered Sets

An unordered_set<>, naturally, does not order its elements—at least not in any predefined order that is of any particular use to you, the user of the container . Instead, it is backed by a so-called hash table or hash map . All operations of an unordered_set<> consequently usually run in near-constant time, making them potentially even faster than a regular set<>. For the most commonly used variable types—including all fundamental types, pointers, strings, and smart pointers—an unordered_set<> may be used as a drop-in replacement of std::set<>. A further discussion of this more advanced data structure, however, or how to specify so-called hash functions for your own custom types is outside the scope of this brief introduction.

Caution

The only way to know for sure which is faster for your application, though—a set<> or an unordered_set<>—is to measure its performance on some real, representative input data. For nearly all applications, it does not really matter because both will be more than fast enough.

Tip

Besides set<> and unordered_set<>, the Standard Library also offers std::multiset<> and std::unordered_multiset<>. Unlike set containers, multiset containers (also known as bags or msets) may contain the same element more than once (that is, for these containers, count() might return numbers higher than 1). Their key feature remains that multisets are very fast at determining whether and where a given element is stored in the container.

Maps

A map or associative array container is best thought of as a generalization of a dictionary or phone book. Given a specific key (a word or name of a person), you want to store or quickly retrieve a certain value (a definition or phone number). Keys in a map need to be unique; values do not (a dictionary allows for synonyms, and a phone book in principle allows for family members to share the same phone number).

Analogous to set<T> and unordered_set<T>, the Standard Library offers two different map containers: std::map<Key,Value> and std::unordered_map<Key,Value>. Unlike most containers, a map container needs at least two template type arguments, one to determine the type of the keys and one to determine the type of the values. Let’s clarify this with a quick example:

// Ex19_04.cpp - Basic use of std::map<>
#include <map>
#include <iostream>
#include <string>
int main()
{
  std::map<std::string, unsigned long long> phone_book;
  phone_book["Donald Trump"] = 202'456'1111;
  phone_book["Melania Trump"] = 202'456'1111;
  phone_book["Francis"] = 39'06'6982;
  phone_book["Elizabeth"] = 44'020'7930'4832;
  std::cout << "The president's number is " << phone_book["Donald Trump"] << std::endl;
  for (const auto& [name, number] : phone_book)
    std::cout << name << " can be reached at " << number << std::endl;
}

This produces the following result:

The president's number is 2024561111
Donald Trump can be reached at 2024561111
Elizabeth can be reached at 4402079304832
Francis can be reached at 39066982
Melania Trump can be reached at 2024561111

In Ex19_04, phone_book is defined as a map<> with keys of type std::string and values of type unsigned long long (people rarely have negative phone numbers). It therefore uniquely associates strings with numbers. No two keys can be the same. The example does confirm, though, that no such restriction exists for the values. The same number can be inserted multiple times.

The map<> and unordered_map<> containers again offer functionality that is completely analogous. You typically use them in much the same way as an array (or its object-oriented counterpart, a random-access sequential container), that is, through its array subscript operator. Only with maps you do not (necessarily) address the values by contiguous integers (oft-called indices); instead, you can in principle use keys of any type you like.

The difference between the two map types again lies in the way they are implemented. Depending on which map implementation you use, there are some requirements on the types you can use for your keys. These requirements are analogous to those for the values of set<> and unordered_set<>. A map<> again works by ordering its elements; an unordered_map<> is essentially a textbook hash map. The former is witnessed as well by the output of Ex19_04. The four names appear in alphabetical order, even though they were added to the map in a different order.

Elements of a Map

To traverse the elements of its phone_book container, Ex19_04 uses a syntax that you have not yet seen before:

  for (const auto& [name, number] : phone_book)
    std::cout << name << " can be reached at " << number << std::endl;

This syntax has been possible only since C++17. It is instructive to see how you’d express this loop using more familiar C++11 syntax:

  for (const auto& element : phone_book)
    std::cout << element.first << " can be reached at " << element.second << std::endl;

In fact, even more instructive here would be to spell out the type of element:

  for (const std::pair<std::string, unsigned long long>& element : phone_book)
    std::cout << element.first << " can be reached at " << element.second << std::endl;

That is, the elements that are contained in either a std::map<K,V> or unordered_map<K,V> have type std::pair<K,V>. You’ve briefly encountered the std::pair<> type before during the exercises of Chapter 16. It is a basic class template (defined in the utility header) whose objects each represent a pair of values, possibly of different types. You can access these two values through their public member variables first and second. As of C++17, however, there is a more compact way.

Consider the following snippet of C++17 code, for example:

std::pair my_pair{ false, 77.50 };
auto [my_bool, my_number] = my_pair;

Prior to C++17, you always had to write this in a much more verbose way (recall from Chapter 16 that C++17 introduced class template argument deduction for constructor invocations as well!):

std::pair<bool, double> my_pair{ false, 77.50 };
bool my_bool = my_pair.first;
double my_number = my_pair.second;

The loop in Ex19_04 nicely shows that this same C++17 syntax is convenient when traversing the elements of a map container as well. (And also that the auto [] syntax can be combined with const and &, for that matter!)

We’ll use this syntax again in the somewhat larger example we explore in the next subsection.

Counting Words

Enough with the tiny toy examples. Let’s look at an example use of std::map<> with some more body to it. A possible use case for a map is to count unique words in a string. Let’s see how this works with an example based on Ex8_18. We’ll use the following type aliases and function prototypes :

// Type aliases
using Words = std::vector<std::string_view>;
using WordCounts = std::map<std::string, size_t>;
// Function prototypes
Words extract_words(std::string_view text, std::string_view separators = " ,.!?\"\n");
WordCounts count_words(const Words& words);
void show_word_counts(const WordCounts& wordCounts);
size_t max_word_length(const WordCounts& wordCounts);

The extract_words() function is a slight variation of the function with the same name in Ex8_18; you should therefore have no trouble at all defining it on your own. The function extracts all individual words from a given text. A word is defined as any sequence of characters different from the given separators.

Our main point of interest here, though, is the count_words() function. This, as you may have guessed, is to count the amount of times each individual word occurs in the input vector<>. To count all unique words, the function uses a std::map<std::string, size_t>. In the map that the function returns, the words are the keys, and the value associated with each key is the number of times that the corresponding word occurs in the words vector<>. The function thus has to insert a new key/value pair into the map each time a new word is encountered. The same counter needs to be incremented whenever the same word is seen multiple times. Both of these operations can be implemented using a single line of code:

WordCounts count_words(const Words& words)
{
  WordCounts result;
  for (auto& word : words)
    ++result[std::string(word)];
  return result;
}

The following line does all the work:

    ++result[std::string(word)];
To understand this, we need to explain the workings of the array index operator, operator[], of a map container bit better:
  • If a value is already associated with the given key, the operator simply returns a reference to that value. Applying the ++ operator to this reference then simply increments the value that was already stored within the map to 2 or higher.

  • If no value is associated with the given key yet, though, the operator first inserts a new key/value pair into the map. The value of this new element is zero-initialized (or default constructed, if it concerns an object of a class type). Once the new element is inserted, the operator then returns a reference to this zero-initialized value. In count_words(), we then instantly increment the resulting size_t value to 1.

The max_word_length() function from Ex8_18 needs to be changed slightly, because we want it to use the words stored in the map. For brevity, we’ll only output words that appear more than once in the output, so we best ignore these here as well:

size_t max_word_length(const WordCounts& wordCounts)
{
  size_t max{};
  for (const auto& [word, count] : wordCounts)
    if (count >= 2 && max < word.length()) max = word.length();
  return max;
}

Finally, all the words, including their count, can be output with a show_word_counts() function. As said, we only stream out words that appear more than once:

void show_word_counts(const WordCounts& wordCounts)
{
  const size_t field_width{max_word_length(wordCounts) + 1};
  const size_t words_per_line{5};
  size_t words_in_line{};      // Number of words in the current line
  char previous_initial{};
  for (auto& [word, count] : wordCounts)
  {
    if (count < 2) continue;      // Skip words that appear only once
    // Output newline when initial letter changes or after 5 per line
    if (previous_initial &&
        (word[0] != previous_initial || ++words_in_line == words_per_line))
    {
      words_in_line = 0;
      std::cout << std::endl;
    }
    std::cout << std::setw(field_width) << word;          // Output a word
    std::cout << " (" << std::setw(2) << count << ')';    // Output count
    previous_initial = word[0];
  }
  std::cout << std::endl;
}

The fact that we used a map<> automatically ensures that all words are sorted in alphabetical order, making it easy for us to print them out alphabetically as well. In particular, show_word_counts() groups words that begin with the same letter on the same line. Beyond this, show_word_counts() does not really contain much you haven’t seen several times before already in other, similar output functions . So, we believe we can skip any further explanations and fast-forward to seeing it all working in a complete example:

// Ex19_05.cpp - Working with maps
#include <iostream>
#include <iomanip>
#include <map>
#include <string>
#include <string_view>
#include <vector>
// Type aliases
using Words = std::vector<std::string_view>;
using WordCounts = std::map<std::string, size_t>;
// Function prototypes
Words extract_words(std::string_view text, std::string_view separators = " ,.!?\"\n");
WordCounts count_words(const Words& words);
void show_word_counts(const WordCounts& wordCounts);
size_t max_word_length(const WordCounts& wordCounts);
int main()
{
  std::string text;                                         // The string to count words in
  // Read a string from the keyboard
  std::cout << "Enter a string terminated by *:" << std::endl;
  getline(std::cin, text, '*');
  Words words = extract_words(text);
  if (words.empty())
  {
    std::cout << "No words in text." << std::endl;
    return 0;
  }
  WordCounts wordCounts = count_words(words);
  show_word_counts(wordCounts);
}
// The implementations of the extract_words(), count_words(), show_word_counts(),
// and max_word_length() function as discussed earlier.

If you compile and run this program , a possible session could go as follows:

Enter a string terminated by *:
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.*
    age ( 2)     all ( 2)
 before ( 2)
 direct ( 2)
  epoch ( 2)
    for ( 2)
  going ( 2)
    had ( 2)
     it ( 9)     its ( 2)
     of (12)
 period ( 2)
 season ( 2)
    the (14)   times ( 2)
    was (11)      we ( 4)    were ( 2)

Iterators

You first encountered the concept of iterators in Chapter 11, where we employed an iterator to traverse all Boxes of a given Truckload container in a nice and elegant manner. To do so, you simply asked the Truckload object for an Iterator object, after which you could use this iterator’s getFirstBox() and getNextBox() members to retrieve all Boxes in a straightforward loop:

auto iterator{ my_truckload.getIterator() };
for (auto box { iterator.getFirstBox() }; box != nullptr; box = iterator.getNextBox())
{
  std::cout << *box << std::endl;
}

This iterator concept is actually a classical and widely applied object-oriented design pattern. One used extensively by the Standard Library as well, as you’ll discover throughout this chapter. Before we go there, however, let’s first reflect some more on why an iterator is such an attractive pattern.

The Iterator Design Pattern

Iterators allow you to traverse a set of elements contained within any container-like object in an effective, uniform manner. This approach has several advantages , as discussed next.

The Truckload example from earlier is actually an excellent starting point. If you recall, internally a Truckload object used a so-called (singly) linked list to store its Boxes. Concretely, a Truckload stored each individual Box inside its own dedicated instance of a nested class called Package. Next to a Box, each Package object contained a pointer to the next Package in the list. We refer to Chapter 11 for more details, in case you forgot. Or, better yet, do not reach back to Chapter 11 just yet! Our main point here is precisely that you do not need to know anything of a Truckload’s internal wiring to iterate over all its Boxes. All you need to learn as a user of the Truckload class is the straightforward public interface of its Iterator class.

Library writers typically define Iterators with analogous interfaces for all their container types. This is the case, for instance, for all containers of the Standard Library, as we’ll discuss shortly. With this approach, it thus becomes possible to traverse different containers in precisely the same way—be it an array, linked list, or even some more complex data structure. You then no longer need to know at all how a particular container works internally to inspect its elements! Among other thing, this leads to code that is as follows:
  • Easy to write and understand.

  • Bug-free and robust. Compared to traversing pointers within potentially complex data structures, there is considerably less room for errors when using an iterator.

  • Efficient. For instance, as discussed earlier in this chapter already, one important limitation of a linked list data structure is that you cannot jump to an arbitrary element with a given index without first traversing all other elements prior to this element. In a Truckload, for example, you can only get to the Box you need by following a whole lot of next pointers starting from the head of the list. This means that a loop of the following form would be particularly inefficient:

    for (size_t i {}; i < my_truckload.getNumBoxes(); ++i)
    {
      std::cout << my_truckload[i] << std::endl;
    }

    In such a loop, each invocation of the array subscript operator [] would involve traversing the linked list of the iterator starting from the first Package (the head of the list) all the way until the ith Package. That is, with each iteration of the for loop, obtaining a reference to the ith Box would take longer and longer. An Iterator’s getNextBox() function does not suffer from this problem, as it always contains a pointer to the current Package, from which moving to the next can be done in constant time.

  • Flexible and maintainable. You can readily change the internal representation of a container without having to worry about breaking any external code traversing its elements. For instance, after this chapter, it should be straightforward to reimplement the Truckload class in terms of a vector<> instead of a custom linked list, while still preserving the same public functions, both for the class itself and for its Iterator class .

  • Easy to debug. You could add extra debugging statements and assertions to the iterator’s member functions. Typical examples are out-of-bounds checks. Mostly, library writers add such checks conditionally, provided some specific macros are defined (see Chapter 10). None of this would be possible if external code was manipulating the internal pointers or arrays directly.

Note

It should come as no surprise that iterators share many of these advantages with the concept of data hiding we explained in Chapter 11. Data hiding is precisely what iterators and other object-oriented design patterns do. They hide implementation details from the users of an object behind an easy-to-use public interface.

Another clear advantage of uniform iterators is that they facilitates the creation of function templates that work for iterators of any container type—functions that can operate on any range of elements, irrespective of whether these elements are contained within for instance a vector, a list, or even a set. As all iterators have analogous interfaces, these function templates thus do not need to know about the inner workings of the containers anymore. It’s precisely this idea, combined with first-class functions (see Chapter 18), that powers the higher-order functions of the Standard Library’s algorithms library that we will discuss in the final part of this chapter.

Iterators for Standard Library Containers

All container types of the Standard Library—and with them those of virtually any third-party C++ library—offer iterators that are completely analogous. No matter which containers you work with, you can always traverse the elements they store in the same manner . You create new iterator objects through member functions with the same name, you access the element an iterator currently refers to in the same manner, and you advance to the next element in the same manner. The public interface of these iterators is slightly different from that of the Truckload Iterators we discussed earlier, but the general idea remains the same.

Creating and Working with Standard Iterators

The most common way to create an iterator for a given container is by invoking its begin() member function. Every single standard container provides this function. Here’s an example:

std::vector<char> letters{ 'a', 'b', 'c', 'd', 'e' };
auto my_iter{ letters.begin() };

The type of an iterator for a container of type ContainerType is always ContainerType::iterator, which is either a concrete type or a type alias. Our my_iter variable definition in full would thus be as follows:

std::vector<char>::iterator my_iter{ letters.begin() };

It’s safe to say that container iterator types are a prime example where C++11’s auto type deduction truly has made all our lives a lot easier!

Through the magic of operator overloading (see Chapter 12), each and every iterator provided by the Standard Library containers mimics a pointer. For example, to access the element that our my_iter iterator currently refers to, you apply its dereference operator:

std::cout << *my_iter << std::endl;       // a

As begin() always returns an iterator that points to the first element in the container, this statement will simply print out the letter 'a'.

Just like with a pointer, a dereferenced iterator results in a reference to the actual element stored inside the container. In our example, *my_iter therefore results in a reference of type char&. As this expression is clearly an lvalue reference, you can of course also use it, for instance, on the left side of an assignment :

*my_iter = 'x';
std::cout << letters[0] << std::endl;     // x

Naturally, you can do more with an iterator than access the first element of the container. As you’ll recall from Chapter 6, pointers support the arithmetic operators ++, --, +, -, +=, and -=, which you could use to move from one element in an array to the next (or previous). You work with vector<> iterators in precisely the same manner. Here’s an example:

++my_iter;                                // Move my_iter to the next element
std::cout << *my_iter << std::endl;       // b
my_iter += 2;
std::cout << *my_iter-- << std::endl;     // d
std::cout << *my_iter << std::endl;       // c  (my_iter was altered using the post-decrement
                                          //     operator in the previous statement)
auto copy{ my_iter };
my_iter += 2;
std::cout << *copy << std::endl;          // c
std::cout << *my_iter << std::endl;       // e
std::cout << my_iter - copy << std::endl; // 2

This code, which you can find in its totality in Ex19_06, should really explain itself at this point, as all this is completely analogous to working with pointers. This even applies for the last line in our example, which is perhaps a bit less obvious than the others; that is, subtracting two vector<> iterators results in a value of a signed integer type that reflects the distance between the two iterators.

Tip

Iterators also provide a member access operator (informally, arrow operator), ->, to access the member variables or functions of the element they refer to. That is, suppose string_iter is an iterator that refers to an element of a class std::string; then string_iter->length() is short for (*string_iter).length()—again just like it is with a pointer. We will see more concrete examples later in the chapter.

Different Flavors of Iterators

The iterator that we used for the example in the previous subsection is a so-called random-access iterator. Out of all iterator categories, random-access iterators offer the richest set of operations. All iterators returned by standard containers support operators ++, *, and ->, as well as == and !=. But beyond that, there are some differences. Any and all limitations are easily explained from the nature of the data structures behind the various containers:
  • The iterators for a std::forward_list<> do not support --, -=, or -. The reason, of course, is that there is no (efficient) way for an iterator to go back to the previous element. Each node in a singly linked list only has a pointer to the next element in the list. Such iterators are referred to as forward iterators . Other containers that may only support forward iterators are unordered_set<>, unordered_map<>, and unordered_multimap<>.

  • The iterators for a std::list<>, on the other hand, do support the -- decrement operators (both pre- and post-decrement). Going back to the previous node in a double-linked is trivial. Jumping multiple elements at once still cannot be done efficiently, though. To discourage such use, std::list<> iterators do not feature, for instance, the +=, -=, +, or - operators. The iterators for the std::set<>, map<>, and multimap<> containers have the same limitations, as traversing the nodes of the underlying tree data structure is similar to traversing those of a doubly linked list. This category of iterators is termed bidirectional iterators —for obvious reasons.

  • The only iterators to offer +=, -=, +, and -, as well as the comparison operators <, <=, >, and >=, are random-access iterators. The only containers that are required to provide random-access iterators are the random-access sequence containers (or std::vector<>, array<>, and deque<>).

Tip

Knowing and understanding these terms becomes important only when consulting your Standard Library reference, most particularly its section on the generic algorithms library that we discuss later in this chapter. The reference of each algorithm template will specify which type of iterators it expects as input, either a forward iterator, bidirectional iterator, or random-access iterator.

Notice that these three iterator categories form a hierarchy. That is, every random-access iterator is also a valid bidirectional iterator, and every bidirectional iterator is also a forward iterator. So to an algorithm that requires, for instance, a forward iterator, you could most certainly pass a random-access iterator as well. For completeness, your Standard Library reference may also employ the terms input iterator and output iterator in this context. These are more theoretical concepts that refer to iterators with even less requirements than a forward iterator. In practice, every iterator created by a Standard container is thus always a valid input or output iterator.

Note

The three container adapters—std::stack<>, queue<>, and priority_queue<>—do not offer any iterators at all—not even forward iterators. Their elements can be accessed only through the top(), front(), or back() functions (whichever is applicable).

Traversing Elements of a Container

From Chapter 6, you know how to traverse an array using pointers and pointer arithmetic. To print out all elements in an array, for instance, you may use the following loop:

int numbers[] { 1, 2, 3, 4, 5 };
for (int* pnumber {numbers}; pnumber < numbers + std::size(numbers); ++pnumber)
{
  std::cout << *pnumber << ' ';
}
std::cout << std::endl;

You could traverse all elements of a vector in precisely the same manner:

std::vector<int> numbers{ 1, 2, 3, 4, 5 };
for (auto iter {numbers.begin()}; iter < numbers.begin() + numbers.size(); ++iter)
{
  std::cout << *iter << ' ';
}
std::cout << std::endl;

The problem with this loop is that it uses two operations that are exclusive to random-access iterators: < and +. This loop would thus not have compiled had numbers been, for instance, of type std::list<int> or std::set<int>. A more conventional way of expressing this same loop, therefore, is this:

for (auto iter {numbers.begin()}; iter != numbers.end(); ++iter)
{
  std::cout << *iter << ' ';
}

Tip

In C++ you normally always use expressions of the form iter != numbers.end() rather than iter < numbers.end() precisely because forward and bidirectional iterators are not required to support comparisons by means of <.

This new loop is equivalent to the one we used before, only this time it works for any standard container. Conceptually, iterators returned by a container’s end() member point to “one past the last element.” Once an iterator is incremented up to the point that it equals the container’s end() iterator. That is, once an iterator is incremented past the container’s last element, you should therefore clearly abort the loop. While it is undefined what would happen, no good can come from dereferencing an iterator that points beyond the bounds of the actual container.

The following example uses a loop exactly like this to traverse all elements contained in a list<>:

// Ex19_07.cpp
// Iterating over the elements of a list<>
#include <iostream>
#include <list>
int main()
{
  std::cout << "Enter a sequence of positive numbers, "
            << "terminated by a negative number: ";
  std::list<unsigned> numbers;
  while (true)
  {
    signed number{-1};
    std::cin >> number;
    if (number < 0) break;
    numbers.push_back(static_cast<unsigned>(number));
  }
  std::cout << "You entered the following numbers: ";
  for (auto iter {numbers.begin()}; iter != numbers.end(); ++iter)
  {
    std::cout << *iter << ' ';
  }
  std::cout << std::endl;
}

A possible session then might go like this:

Enter a sequence of positive numbers, terminated by a negative number: 4 8 15 16 23 42 -1
You entered the following numbers: 4 8 15 16 23 42

Of course, each container is a range as well, and ranges can be used in range-based for loops (see Chapter 5). The for loop of Ex19_07, for instance, could therefore be replaced by the following much simpler range-based for loop:

  for (auto number : numbers)
  {
    std::cout << number << ' ';
  }

Tip

To iterate over all elements in a container, always use a range-based for loop. You should only use a more verbose and complex iterator-based loop either if you explicitly need access to the iterator for more advanced processing in the loop’s body or if you only want to iterator over a subrange of the container’s elements.

We will see examples where the loop’s body needs access to the iterator later.

Const Iterators

All iterators that we have used thus far have been so-called mutable (or nonconst ) iterators. You can alter the element that a mutable iterator refers to simply by dereferencing it or, for elements of a class type, through its member access operator -> operator. Here’s an example:

// Ex19_08.cpp
// Altering elements through a mutable iterator
#include <iostream>
#include <vector>
#include "Box.h" // From Ex11_04
int main()
{
  std::vector<Box> boxes{ Box{ 1.0, 2.0, 3.0 } };  // A vector containing 1 Box
  auto iter{ boxes.begin() };
  std::cout << iter->volume() << std::endl;        // 6 == 1.0 * 2.0 * 3.0
  *iter = Box{ 2.0, 3.0, 4.0 };
  std::cout << iter->volume() << std::endl;        // 24 == 2.0 * 3.0 * 4.0
  iter->setHeight(7.0);
  std::cout << iter->volume() << std::endl;        // 42 == 2.0 * 3.0 * 7.0
}

There is nothing new or surprising about this example yet. The point we wanted to make is that besides mutable iterators of type ContainerType::iterator (see earlier), each container also offers const iterators of type ContainerType::const_iterator. Dereferencing a const iterator results in a reference to a const element (const Box& in our example), and its -> operator only allows you to either access member variables as const or invoke const member functions.

There are two ways you typically obtain a const iterator:
  • By calling cbegin() or cend() instead of begin() or end(). The c in the names of these member functions of course refers to const. You can try this by changing begin() to cbegin() in Ex19_08.

  • By invoking begin() or end() on a const container. If the container is const, these functions return a const iterator; only if the container itself is mutable will the result be a mutable iterator as well. (You saw in Chapter 11 how to accomplish this effect through function overloading, where one overload of the same function is const and the other is not.) You can give this a try as well by adding the keyword const in front of the declaration of the boxes vector<> in Ex19_08.

If you turn iter in Ex19_08 into a const iterator in either of these two ways, the lines containing the statements *iter = Box{ 2.0, 3.0, 4.0 }; and iter->setHeight(7.0) will no longer compile: altering an element through a const iterator is not possible.

Tip

Just like it is good practice to add const to your variable declarations whenever possible, you should also use const iterators whenever applicable. This prevents you or anyone else from accidentally altering elements in contexts where that is not desired or expected.

The for loop in Ex19_07, for instance , simply prints out all elements in the container. This certainly is not supposed to alter these elements. You could therefore write the loop like this:

  for (auto iter{ numbers.cbegin() }; iter != numbers.cend(); ++iter)
  {
    std::cout << *iter << std::endl;
  }

Caution

All standard set and map container types only provide const iterators. For these types, begin() and end() always return const iterators, even when invoked on a non-const container. As always, though, these restrictions are easily explained by the nature of these containers. For example, as you know, a std::set<> always stores all its elements in a sorted order. If you allow a user to alter the value of these elements through an iterator, mid-traversal, maintaining this invariant obviously becomes infeasible.

Inserting in and Erasing from Sequence Containers

In Chapter 5, we already introduced push_back() and pop_back()—functions you can use to add elements to, respectively, remove elements from most sequence containers. There was only one restriction: these functions only allow you to manipulate the very last element of the sequence. Earlier this chapter you also saw push_front() in action, a similar function that some sequence containers provide to add elements to the front of the sequence. This is all well and good, but what if you need to insert or remove elements in or from the middle of the sequence? Shouldn’t this be possible as well?

The answer is that of course you can insert in the middle of a sequence or remove whichever element you want! And now that you know about iterators, we are ready to show you how. That is, to indicate where to insert or which elements to remove, you need to provide iterators. All sequence containers except std::array<> offer various insert() and erase() functions that accept either iterators or iterator ranges to this end.

Let’s start off with something simple and add a single element to the beginning of a vector:

std::vector<int> numbers{ 2, 4, 5 };
numbers.insert(numbers.begin(), 1);   // Add single element to the beginning of the sequence
printVector(numbers);                 // 1 2 4 5

The element you provide as insert()’s second argument is inserted right before the position referred to by the iterator you provide as its first argument. So, provided the printVector() function indeed lives up to its name, then the output of this snippet of code should be something of the form “1 2 4 5.”

Naturally, you can insert() new elements at any position you want. The following, for instance, adds the number 3 right in the middle of our numbers sequence :

numbers.insert(numbers.begin() + numbers.size() / 2, 3); // Add in the middle
printVector(numbers);                  // 1 2 3 4 5

The insert() function moreover has a couple of overloads that allow you to add multiple elements at once. One common use of these is to append one vector<> to another:

std::vector<int> more_numbers{ 6, 7, 8 };
numbers.insert(numbers.end(), more_numbers.begin(), more_numbers.end());
printVector(numbers);                  // 1 2 3 4 5 6 7 8

Like with all overloads of insert(), the first argument again indicates the position right after where the new elements are to be added. In this case, we selected the end() iterator, which means we’re inserting right before “one past the end” of the current sequence—or, in other words, right after the last element. The two iterators passed to the function’s second and third parameters indicate the range of elements to insert. In our example, this range corresponds to the entire more_numbers sequence.

Note

Ranges of elements are always indicated using half-open intervals in standard C++. In Chapter 7, you already saw that many member functions of std::string accepted half-open character intervals specified through size_t indexes. Container members such as insert() and erase() (discussed next) similarly work with half-open intervals indicated by means of iterators. If you provide two iterators, from and to, then that range encompasses all elements in the interval [from, to). That is, the range includes the element of the from iterator but not that of the to iterator. Iterator ranges are used by virtually every template function of the Standard Library algorithms library discussed later in this chapter as well.

The opposite of insert() is called erase(). The following sequence of statements removes, one by one, the same elements we added earlier using insert():

numbers.erase(numbers.end() - 3, numbers.end());      // Erase last 3 elements
numbers.erase(numbers.begin() + numbers.size() / 2);  // Erase the middle element
numbers.erase(numbers.begin());                       // Erase the first element
printVector(numbers);                  // 2 4 5

The overload of erase() with two parameters deletes a range of elements; the one with a single parameter deletes only a single element (remember this distinction; it’ll be important again later in this chapter!).

The complete source for the example we used throughout this section can be found in Ex19_09.

Note

Most containers offer similar insert() and erase() functions (the only exception is std::array<>). Naturally, set and map containers will not allow you to indicate where to insert() an element (only sequence containers do), but they do allow you to erase() the element or elements that correspond to an iterator or iterator range. Consult your Standard Library reference for more details.

Altering Containers During Iteration

In past sections we showed you how to iterate over elements in a container as well as how to insert() and erase() elements. The logical next question is then, what if you insert() or erase() elements while you are iterating over a container?

Unless otherwise specified (consult a Standard Library reference for details), any modification to a container is said to invalidate all iterators that were ever created for that container. Any further use of an invalidated iterator results in undefined behavior, which, of course, translates to anything from unpredictable results to crashes. Consider the following function template:

template <typename NumberContainer>
void removeEvenNumbers(NumberContainer& numbers)    /* Wrong!! */
{
  auto from{ numbers.begin() }, to{ numbers.end() };
  for (auto iter {from}; iter != to; ++iter)
  {
    if (*iter % 2 == 0)
      numbers.erase(iter);
  }
}

The intent is to write a template that removes all even numbers from containers of a variety of types—be it vector<int>, deque<unsigned>, list<long>, set<short>, or unordered_set<unsigned>.

The problem is that this template contains two serious yet fairly realistic bugs, both triggered by the erase() in the loop’s body. Once you modify a sequence, for instance, through erase(), you should generally stop using any existing iterators. Yet the loop in removeEvenNumbers() ignores that. Instead, it simply soldiers on using both the to and iter iterators, even after invoking erase() on the container to which both these iterators refer. There’s therefore no telling what will happen should you execute this code, but it most certainly will not be what you might’ve hoped for.

More specifically, once you call erase() a first time, the to iterator no longer points “one past the last element,” but (at least in principle) “two past the last element.” This means that your loop probably dereferences the actual end() iterator, with all its catastrophic consequences. You can solve this rather easily by requesting a new end() iterator after each iteration of the for loop as follows:

  for (auto iter {numbers.begin()}; iter != numbers.end(); ++iter)     /* Still wrong! */
  {
    if (*iter % 2 == 0)
      numbers.erase(iter);
  }

This new loop is still very wrong, though, as it still continues to use the iter iterator after invoking erase(). This, in general, will end in disaster just as well. For a linked list, for instance, erase() will likely deallocate the node that iter refers to, meaning it becomes highly unpredictable what the upcoming ++iter will do. For a std::set(), an erase() might reshuffle the entire tree to put its elements back nicely in order. Any further iteration then becomes risky business as well.

Does this mean that you cannot remove multiple individual elements from a container without restarting each time from begin()? Fortunately not, as that would be particularly inefficient. It just means that you have to follow this pattern:

template <typename NumberContainer>
void removeEvenNumbers(NumberContainer& numbers)    /* Correct!! */
{
  for (auto iter {numbers.begin()}; iter != numbers.end(); )
  {
    if (*iter % 2 == 0)
    {
      iter = numbers.erase(iter);
    }
    else
    {
      ++iter;
    }
  }
}

Most erase() and insert() functions return an iterator that you can use to continue the iteration with. This iterator will then refer to one element past the one that was just inserted or erased (or be equal to end(), if the latter concerned the last element in the container ).

Caution

Do not deviate from the standard pattern we just laid out. For instance, the iterator returned by either erase() or insert() should itself not be incremented anymore, which is why we moved the for loop’s classic ++iter statement into the else branch in the loop’s body!

Tip

This pattern is relatively easy to get wrong, and for sequence containers it is even quite inefficient. Later in this chapter we will introduce the remove-erase idiom, which you should therefore use instead of such loops whenever possible. To insert() elements while iterating over a sequence, however, or to erase() select elements from either set or map containers, you still need to write the loop explicitly yourself.

The following example takes the removeEvenNumbers() template we just developed for a spin. Even though it does so using a vector<int> container, we could’ve used any of the aforementioned types as well:

// Ex19_10.cpp
// Removing all elements that satisfy a certain condition
// while iterating over a container
#include <vector>
#include <string_view>
#include <iostream>
std::vector<int> fillVector_1_to_N(size_t N);       // Fill a vector with 1, 2, ..., N
void printVector(std::string_view message, const std::vector<int>& numbers);
// Make sure to include the removeEvenNumbers() template from before as well...
int main()
{
  const size_t num_numbers{20};
  auto numbers{ fillVector_1_to_N(num_numbers) };
  printVector("The original set of numbers", numbers);
  removeEvenNumbers(numbers);
  printVector("The numbers that were kept", numbers);
}
std::vector<int> fillVector_1_to_N(size_t N)   // Fill a vector with 1, 2, ..., N
{
  std::vector<int> numbers;
  for (int i {1}; i <= N; ++i)
    numbers.push_back(i);
  return numbers;
}
void printVector(std::string_view message, const std::vector<int>& numbers)
{
  std::cout << message << ": ";
  for (int number : numbers) std::cout << number << ' ';
  std::cout << std::endl;
}

The outcome of this program , of course, is this:

The original set of numbers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
The numbers that were kept: 1 3 5 7 9 11 13 15 17 19

Iterators for Arrays

Iterators behave precisely like pointers—so much so that any pointer is a valid iterator as well. To be more precise, any raw pointer may serve as a random-access iterator. This observation will allow the generic algorithm templates we discuss in the next section to iterate over arrays and containers alike. In fact, array pointers can be used in any context where one would otherwise use iterators. Recall the following statements from Ex19_09:

std::vector<int> more_numbers{ 6, 7, 8 };
numbers.insert(numbers.end(), more_numbers.begin(), more_numbers.end());

Now suppose the more_numbers variable was defined as a built-in array instead. Then one way to append these numbers is by exploiting the array-pointer duality, combined with pointer arithmetic and the std::size() function, as introduced first in Chapter 5:

int more_numbers[] { 6, 7, 8 };
numbers.insert(numbers.end(), more_numbers, more_numbers + std::size(more_numbers));

While perfectly sound, there is a nicer and more uniform way as well. For that, you use the std::begin() and std::end() function templates that are defined in the Standard Library iterator header :

int more_numbers[] { 6, 7, 8 };
numbers.insert(numbers.end(), std::begin(more_numbers), std::end(more_numbers));

In fact, these function templates do not only work for arrays; they work for any container as well:

std::vector<int> more_numbers{ 6, 7, 8 };
numbers.insert(std::end(numbers), std::begin(more_numbers), std::end(more_numbers));

When used with containers, thanks to the way name resolution works in C++ for nonmember functions, you even do not have to explicitly specify the std:: namespace:

std::vector<int> more_numbers{ 6, 7, 8 };
numbers.insert(end(numbers), begin(more_numbers), end(more_numbers));

Not surprisingly, cbegin() and cend() nonmember functions exist as well, which create the corresponding const iterators for either arrays or containers:

std::vector<int> more_numbers{ 6, 7, 8 };
int even_more_numbers[]{ 9, 10 };
numbers.insert(end(numbers), cbegin(more_numbers), cend(more_numbers));
numbers.insert(end(numbers), std::cbegin(even_more_numbers), std::cend(even_more_numbers));

The compactness and uniformity of this syntax makes it our preferred way of specifying ranges in the remainder of this chapter (not in the least because compactness rules when fitting examples onto the pages of a book). In practice, many will continue using the begin() and end() member functions for containers, and there is nothing wrong with that!

Algorithms

The generic algorithms of the Standard Library combine the strengths of various concepts explored earlier in this book, such as function templates (Chapter 8), first-class and higher-order functions (Chapter 18), and, of course, iterators (earlier this chapter). You’ll find that these algorithms are particularly powerful and expressive when combined with C++11’s lambda expressions (Chapter 18).

A First Example

Some of the higher-order functions defined in Chapter 18 actually come close to some of the standard algorithms already. Recall the find_optimum() template? It looked as follows:

template <typename T, typename Comparison>
const T* find_optimum(const std::vector<T>& values, Comparison compare)
{
  if (values.empty()) return nullptr;
  const T* optimum = &values[0];
  for (size_t i = 1; i < values.size(); ++i)
  {
    if (compare(values[i], *optimum))
    {
      optimum = &values[i];
    }
  }
  return optimum;
}
While already quite generic, this template still has two unfortunate limitations:
  • It works only for elements that are stored inside a container of type vector<>.

  • It works only if you want to consider all elements of the given collection. Considering only a subset of these elements is not possible yet without copying them all in a new container first.

You can resolve both shortcomings rather easily by generalizing the template even further using iterators:

template <typename Iterator, typename Comparison>
Iterator find_optimum(Iterator begin, Iterator end, Comparison compare)
{
  if (begin == end) return end;
  Iterator optimum = begin;
  for (Iterator iter = ++begin; iter != end; ++iter)
  {
    if (compare(*iter, *optimum))
    {
      optimum = iter;
    }
  }
  return optimum;
}
This new version does not suffer the two aforementioned issues. After all:
  • Iterators can be used to traverse the elements of all container and array types alike.

  • The new template readily works with subranges as well by simply passing only part of a complete [begin(), end()) iterator range.

The algorithms header actually offers an algorithm that is implemented in precisely this manner. Only it is not called find_optimum(), but std::max_element(). And where there’s a max_element(), there’s of course a min_element(), which searches for whichever element in the range is minimal rather than maximal when compared using the relevant callback function . Let’s see them both in action by adjusting some of the examples of the previous chapter:

// Ex19_11.cpp
// Your first algorithms: std::min_element() and max_element()
#include <iostream>
#include <algorithm>
#include <vector>
int main()
{
  std::vector<int> numbers{ 91, 18, 92, 22, 13, 43 };
  std::cout << "Minimum element: "
            << *std::min_element(begin(numbers), end(numbers)) << std::endl;
  std::cout << "Maximum element: "
            << *std::max_element(begin(numbers), end(numbers)) << std::endl;
  int number_to_search_for {};
  std::cout << "Please enter a number: ";
  std::cin >> number_to_search_for;
  auto nearer { [=](int x, int y) {
    return std::abs(x - number_to_search_for) < std::abs(y - number_to_search_for);
  }};
  std::cout << "The number nearest to " << number_to_search_for << " is "
            << *std::min_element(begin(numbers), end(numbers), nearer) << std::endl;
  std::cout << "The number furthest from " << number_to_search_for << " is "
            << *std::max_element(begin(numbers), end(numbers), nearer) << std::endl;
}

A first thing that jumps from this example is that for min_element() and max_element(), the comparison callback function is optional. Both offer an overload without this third parameter—both of which use the less-than operator, <, to compare elements . Besides that, these standard algorithms of course do precisely what you’d expect:

Minimum element: 13
Maximum element: 92
Please enter a number: 42
The number nearest to 42 is 43
The number furthest from 42 is 92

Tip

The algorithm header also provides std::minmax_element(), which you can use to obtain both minimum and maximum elements within a given range at once. This algorithm returns a pair<> of iterators, with the expected semantics. You could therefore replace the last two statements in Ex19_11 with this:

  const auto [nearest, furthest] =
    std::minmax_element(begin(numbers), end(numbers), nearer);
  std::cout << "The number nearest to " << number_to_search_for << " is "
            << *nearest << std::endl;
  std::cout << "The number furthest from " << number_to_search_for << " is "
            << *furthest << std::endl;

We refer to earlier in this chapter for an introduction on the pair<> template and auto [] syntax.

Finding Elements

The Standard Library provides various algorithms to search for elements within a range of elements. In the previous subsection we already introduced min_element(), max_element(), and minmax_element(). The two related algorithms that you’ll probably use most often are std::find() and find_if(). The first, std::find(), is used to search a range for an element that equals a given value (it compares values with the == operator). The second, find_if(), instead expects a first-class callback function as an argument. It uses this callback function to determine whether any given element satisfies the desired characteristics.

To try them, let’s reach back to an old favorite again: the Box class. Because std::find() needs to compare two Boxes, you’ll need a variant of Box with an overloaded operator==(). The Box.h header from Ex12_09, for one, will do just fine:

// Ex19_12.cpp – Finding boxes.
#include <iostream>
#include <vector>
#include <algorithm>
#include "Box.h"      // From Ex12_09
int main()
{
  std::vector<Box> boxes{ Box{1,2,3}, Box{5,2,3}, Box{9,2,1}, Box{3,2,1} };
  // Define a lambda functor to print the result of find() or find_if():
  auto print_result = [&boxes] (auto result)
  {
    if (result == end(boxes))
      std::cout << "No box found." << std::endl;
    else
      std::cout << "Found matching box at position "
                << (result - begin(boxes)) << std::endl;
  };
  // Find an exact box
  Box box_to_find{ 3,2,1 };
  auto result{ std::find(begin(boxes), end(boxes), box_to_find) };
  print_result(result);
  // Find a box with a volume larger than that of box_to_find
  const auto required_volume = box_to_find.volume();
  result = std::find_if(begin(boxes), end(boxes),
              [required_volume](const Box& box) {return box.volume() > required_volume; });
  print_result(result);
}

The output is as follows:

Found matching box at position 3
Found matching box at position 1

Both find() and find_if() return either an iterator to the found element or the end iterator of the range if no element is found that matches the search criteria .

Caution

If no element is found, the end iterator of the range to search is returned, not the end iterator of the container. Even though in many cases these are the same (in Ex19_12 they are), this is not always true.

A number of variants are provided by the standard. The following list shows a few of them. Consult a Standard Library reference for more details.
  • find_if_not() is similar to find_if(), only it searches for the first element for which the given callback function returns false (rather than true).

  • find_first_of() searches a range of elements for the first element that matches any element from another given range.

  • adjacent_find() searches for two consecutive elements that are equal or that satisfy a given predicate.

  • search() / find_end() search for a range of elements in another range of elements. The former returns the first match, while the latter returns the last match.

  • binary_search() checks whether a given element is present in a sorted range. By exploiting the fact that the elements in the input range are sorted, it can find the desired element faster than find(). Elements are compared using either the < operator or a user-provided comparison callback.

Caution

Earlier in this chapter, we explained that set and map containers are really good at finding elements themselves already. Because they know the internal structure of their data, they can find elements much faster than any generic algorithm ever could. These containers therefore offer a find() member function that you should always use instead of the generic std::find() algorithm. In general, whenever a container offers member functions that are functionally equivalent to an algorithm, you should always use the former. Consult your Standard Library reference for further details.

Outputting Multiple Values

find(), find_if(), find_if_not()—these three algorithms all search for the first element that meets a particular requirement . But what if you’re interested in finding all of them instead? If you glance through all algorithms that the Standard Library has to offer, you’ll find that there is no algorithm with a name like find_all(). Luckily, there are at least three algorithms that would allow you to obtain all elements in a given range that satisfy a given condition:
  • std::remove_if() can be used to remove all elements that do not satisfy the condition. We discuss this algorithm later in the next subsection.

  • std::partition() rearranges the elements in a range such that those elements that satisfy a callback condition are moved to the front of the range and those that do not are moved to the back. We’ll leave this option for you to try later in the exercises.

  • std::copy_if() may be used to copy all elements you need to a second output range.

// Ex19_13.cpp - Extracting all odd numbers.
#include <iostream>
#include <set>
#include <vector>
#include <algorithm>
std::set<int> fillSet_1_to_N(size_t N);       // Fill a set with 1, 2, ..., N
void printVector(const std::vector<int>& v);  // Print the contents of a vector to std::cout
int main()
{
  const size_t num_numbers{20};
  const auto numbers = fillSet_1_to_N(num_numbers);
  std::vector<int> odd_numbers( numbers.size() );
  auto end_odd_numbers = std::copy_if(begin(numbers), end(numbers), begin(odd_numbers),
                                      [](int n) { return n % 2 == 1; });
  odd_numbers.erase(end_odd_numbers, end(odd_numbers));
  printVector(odd_numbers);
}

The printVector() function you can recycle from Ex19_09, and fillSet_1_to_N() is trivially analogous to the fillVector_1_to_N() function of Ex19_10.

Let’s focus our attention on the three lines of the program that matter: those that copy all odd numbers in the numbers set<> into the odd_numbers vector<>. Clearly, the first two parameters of std::copy_if() determine the range of elements that potentially need to be copied; the fourth determines which elements are effectively copied. The third parameter of copy_if() is more interesting, though. The first element that copy_if() copies is assigned to this position. The next position that copy_if() assigns to is then obtained by incrementing the target iterator using ++.

It is therefore critical that the target range, the vector<> odd_numbers in Ex19_13, is at least large enough to hold all elements that copy_if() will copy. Because in general you do not know in advance how many elements there will be, you may be forced to allocate a buffer of memory that is larger than required. This is what you see in Ex19_13 as well: odd_numbers is initialized with a dynamic array of num_numbers elements (all zeroes), which is most definitely sufficient to hold all odd numbers. Of course, it will turn out to be twice larger than actually required. For that purpose, the std::copy_if() algorithm returns an iterator that points one past the last element that it copied. You then typically use this iterator to erase all superfluous elements from the target container, precisely like this is done in Ex19_13.

Caution

Take care not to forget the second argument to the call to erase()! It must be the end iterator of the container. If you forget this second argument, erase() will just erase the single element pointed to by the iterator passed as the first argument. (In Ex19_13, in other words, only one of the original zeroes would then be erased from odd_numbers!)

Many algorithms copy or move output into a target range analogously to std::copy_if() (std::copy(), move(), replace_copy(), replace_copy_if(), remove_copy()—the list goes on…). All could therefore in principle be using the same idiom. That is, you conservatively allocate an overly large target range, which you then shrink again using the iterator returned by the algorithm.

This pattern, however, while still useful at times, is clearly cumbersome, verbose, and quite error prone. Luckily, there is a better way. For Ex19_13, you could use the following two statements instead (see Ex19_13A):

std::vector<int> odd_numbers;
std::copy_if(begin(numbers), end(numbers), std::back_inserter(odd_numbers),
             [](int n) { return n % 2 == 1; });

With this technique, there is no need to over-allocate and thus no need to erase() any redundant elements afterward either. Instead, you create a very special “fake” iterator through the std::back_inserter() function, which is defined by the iterator header of the Standard Library. In our case, every time copy_if() dereferences and assigns a value to this iterator, what effectively happens is that the element that is assigned is forwarded to the push_back() function of the container object that you passed to back_inserter() earlier—that is, odd_numbers.

The end result is therefore the same (all odd numbers are added to odd_numbers), but this time you did so with less code and, more importantly, code that is much clearer, and that leaves considerably less room for error. This is the conclusion:

Tip

The iterator header defines the back_inserter(), front_inserter(), and inserter() functions that create “fake” iterator objects that trigger, respectively, push_back(), push_front(), and insert() whenever a value is assigned to these iterators after dereferencing. Use these functions whenever an algorithm needs to output multiple values into some container.

The Remove-Erase Idiom

Often you need to delete all elements from a container that satisfy specific conditions. With the removeEvenNumbers() function of Ex19_10 earlier in this chapter, we showed you how you can do this using a rather complex for loop that iterates over all the elements of the container, checks whether an element satisfies the conditions, and calls erase() on the container to remove the element. Implementing it this way is inefficient and error prone, however. Take a vector<> as an example. If you remove an element from the middle of the vector<>, all subsequent elements need to be shifted down to fill the gap of the removed element. Also, when removing elements from the container while you are iterating over that same container, you need to take extra care that the iterator is correctly handled, as discussed earlier. This is prone to errors. Therefore:

Tip

Instead of manually iterating over the elements of a sequence container, you should always use the remove-erase idiom. With this idiom, discussed next, you use the std::remove() or std::remove_if() algorithm to remove elements that satisfy a certain condition from a container.

The std::remove() and remove_if() algorithms don’t really remove the elements from the container; they can’t because they only get a range identified by a pair of iterators. They don’t have access to the container, so they can’t erase elements, even if they wanted to. Instead, these algorithms work by moving all elements that should not be removed to the front of the input range. That way, all elements to be kept are moved toward the beginning of the container, and all elements to be erased are left at the end of the container. Similar to copy_if() and friends (see the previous subsection), remove() and remove_if() then return an iterator to the first element to erase. This iterator can then be used as a first argument to the erase() method of the container to truly erase the elements.

Let’s see how the remove-erase idiom works with some actual code . You can use the same program as Ex19_10, only this time you replace the removeEvenNumbers() with this version:

void removeEvenNumbers(std::vector<int>& numbers)
{
  // Use the remove_if() algorithm to remove all even numbers
  auto first_to_erase{ std::remove_if(begin(numbers),
    end(numbers), [](int number) { return number % 2 == 0; }) };
  // Erase all elements including and beyond first_to_erase  
  numbers.erase(first_to_erase, end(numbers));
}

The output of the resulting program (available in Ex19_14) should then remain the same:

The original set of numbers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
The numbers that were kept: 1 3 5 7 9 11 13 15 17 19

Caution

Take care—and we cannot repeat this enough—not to forget the second argument to the call to erase()! It must be the end iterator of the container. If you forget this second argument, erase() will just erase the single element pointed to by the iterator passed as the first argument. In Ex19_14, in other words, only the number 11 would then be removed.

Sorting

One key operation for arrays and sequence containers is sorting their elements. The std::sort() algorithm is defined in <algorithm> and can be used to sort a range of elements. The first two arguments passed to the algorithm are the begin and end iterators of the range to sort. The third argument passed is an optional comparator. If no comparator is given, the elements are sorted in ascending sequence. That is, if applied to a range of strings, the strings will be sorted lexicographically. The following example sorts a range of strings twice, first lexicographically and then according to the length of each string:

// Ex19_15.cpp – Sorting strings
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int main()
{
  std::vector<std::string> names{"Frodo Baggins", "Gandalf the Gray", "Aragon",
    "Samwise Gamgee", "Peregrin Took", "Meriadoc Brandybuck", "Gimli",
     "Legolas Greenleaf", "Boromir"};
  // Sort the names lexicographically
  std::sort(begin(names), end(names));
  std::cout << "Names sorted lexicographically:" << std::endl;
  for (const auto& name : names) std::cout << name << ", ";
  std::cout << std::endl << std::endl;  
  // Sort the names by length
  std::sort(begin(names), end(names),
    [](const auto& left, const auto& right) {return left.length() < right.length(); });
  std::cout << "Names sorted by length:" << std::endl;
  for (const auto& name : names) std::cout << name << ", ";
  std::cout << std::endl;
}

The output is as follows:

Names sorted lexicographically:
Aragon, Boromir, Frodo Baggins, Gandalf the Gray, Gimli, Legolas Greenleaf, Meriadoc Brandybuck, Peregrin Took, Samwise Gamgee,
Names sorted by length:
Gimli, Aragon, Boromir, Frodo Baggins, Peregrin Took, Samwise Gamgee, Gandalf the Gray, Legolas Greenleaf, Meriadoc Brandybuck,

Parallel Algorithms

Some of the most notable additions to the C++17 Standard Library are parallel versions of most generic algorithms. Virtually every computer today has multiple processing cores. Even the most modest of phones has multiple processing cores these days. By default, invoking any of the standard algorithms, however, will use only one of these cores. All other cores risk sitting by idly, waiting until someone throws some work in their direction. That is a real shame. When processing large arrays or containers of data, these algorithms could run so much faster if they’d just divide the work among all available cores. With C++17, doing this becomes easy. All you have to do, for instance, to sort the fellowship in Ex19_15 in parallel is tell the algorithm to use the so-called parallel execution policy , like so:

std::sort(std::execution::par, begin(names), end(names));

The std::execution::par constant is defined by the execution header, so you’ll have to include that first. The header defines other execution policies as well, but we won’t discuss them here.

Naturally, with only nine elements, you are unlikely to notice any difference. Of course, if Saruman or Sauron were to sort the names of their troops, then parallel execution would make a lot more sense.

Nearly every algorithm can be parallelized this way. It costs you almost nothing, and the gains could be significant. So, always keep this option in mind when processing larger data sets.

Tip

The algorithm header also defines the for_each() algorithm, which you now could use to parallelize many regular range-based for loops. Do take care, though, that each iteration of the loop can execute independently of the other, or you’ll run into data races. Data races and other aspects of concurrent programming, however, are outside the scope of this book.

Summary

This chapter provided you with a solid first introduction on three of the most important, most frequently used features of the Standard Library: containers, iterators, and algorithms. Containers organize your data using various data structures, each with their strengths and limitations. A typical container, and sequential containers in particular, do not offer much functionality beyond adding, removing, and traversing elements. More advanced operations to manipulate the data that is stored inside these containers are provided instead in the form of an impressive collection of generic, higher-order function templates, called algorithms.

Our goal here was never to make you an expert user of the various container and algorithm templates yet. For that, considerably more pages are required than we had left for this book. To actively start using the features that you learned about in this chapter, you’ll therefore need to regularly consult a Standard Library reference—one that lists all member functions of the various containers, as well as the many algorithm templates that exist (there are more than 100 of them in total!), and that specifies the precise semantics of all this powerful functionality. Even the most seasoned C++ developer regularly needs guidance from a good reference book or website.

In this chapter, we therefore aimed to convey a broad bird’s-eye overview, focused on general principles, best practices, and common caveats to watch out for, with guidelines on choosing between the rich set of features that the Standard Library has to offer, typical use cases, and standard idioms. In short, it contained everything that you cannot readily extract from a typical reference work. The most important points covered in this chapter are the following:
  • Sequence containers store data in a straightforward user-determined linear order, one element after the other.

  • Your go-to sequential container should be std::vector<>. The practical use for the other sequential containers in real-life applications, and list<>, forward_list<>, and deque<> in particular, is typically limited.

  • The three container adapters—std::stack<>, queue<>, and priority_queue<>—all encapsulate a sequential container, which they use to implement a limited set of operations that allow you to inject and later take out elements. Their difference mostly lies in the order in which these elements come out again.

  • Sets are duplicate-free containers and are good at determining whether they contain a given element.

  • Maps uniquely associate keys with values and allow you to quickly retrieve a value given their keys.

  • Both sets and maps come in two flavors: ordered and unordered. The former are particularly interesting if you need a sorted view on your data as well; the latter have the potential to be more efficient but may come with the complexity of having to define an effective hash function first (we did not cover hash functions here, but you can read all about it in your favorite Standard Library reference).

  • You—and of course the Standard algorithms as well—can use iterators to enumerate the elements of any given container, without having to know how this data is actually physically organized.

  • Iterators in C++ typically make heavy use of operator overloading in order to look and feel like pointers.

  • The Standard Library offers more than a 100 different algorithms, most in the algorithms header. We made sure that the ones you’ll likely use most often are covered either in the main text or in the following exercises.

  • All algorithms operate on half-open ranges of iterators, and many accept a first-class callback function. Mostly you’ll call an algorithm with a lambda expression if its default behavior does not suit you.

  • Algorithms that retrieve a single element from a range (find(), find_if(), min_element(), max_element(), and so on) do so by returning an iterator. The end iterator of the range is then always used to denote “not found.”

  • Algorithms that produce multiple output elements (copy(), copy_if(), and so on) should normally always be used in conjunction with the std::back_inserter(), front_inserter(), and inserter() utilities provided by the iterator header.

  • To remove multiple elements from sequence containers, you should use the remove-erase idiom.

  • You can take advantage of the extensive multiprocessing capabilities of current hardware by passing the std::execution::par execution policy as the first argument to most algorithms.

Exercises

The following exercises enable you to try what you’ve learned in this chapter. If you get stuck, look back over the chapter for help. If you’re still stuck after that, you can download the solutions from the Apress website ( www.apress.com/book/download.html ), but that really should be a last resort.
  • Exercise 19-1. In practice, we would never recommend you to implement your own linked list data structure to store Boxes in Truckload. At the time it made perfect sense to practice nested classes, as well as working with pointers; but normally you should follow our advice from earlier in this chapter and simply use a vector<> instead (a polymorphic vector<>, to be precise—see Chapter 14). If you need a sequence container, a vector<> is almost always the way to go! Eliminate the linked list from the Truckload class of Exercise 17-1 according to this guideline. Notice how you can now adhere to the rule of zero as well (see Chapter 18)?

  • Exercise 19-2. Replace both instances of your self-defined Stack<> in Ex16_04A with an instance of std::stack<>.

  • Exercise 19-3. Rework your solution to Exercise 16-6 by replacing all instances of your SparseArray<> and linked list template types with standard containers. Carefully think about which container types would be most appropriate!

    Note If you want extra practice, you can do the same for the solutions of Exercises 16-4 and 16-5 as well.

  • Exercise 19-4. Research the std::partition() algorithm and use it to reimplement the removeEvenNumbers() function of either Ex19_10 or Ex19_14.

  • Exercise 19-5. Not all Standard Library algorithms are defined by the algorithms header. Some are defined by the numeric header as well. One such example is accumulate(). Research this algorithm and use it to implement an algorithm-like function template that computes the average of a given iterator range. Exercise your newly implemented template with a little test program.

  • Exercise 19-6. Another algorithm that is defined by the numeric header is the oddly named iota() algorithm, which you can use to fill a given range with values M, M+1, M+2, and so on. Use it to rework the fillVector_1_to_N() function of Ex19_10.

    Note The name of the iota() algorithm refers to the Greek letter iota, written ⍳. It is an homage to the classical programming language APL developed by Turing Award winner Kenneth E. Iverson in the 1960s in his influential book A Programming Language (this title is where the acronym APL itself stems from). The APL programming language used mathematical symbols to name numeric functions, one of which was ⍳. In APL, ⍳3, for instance, would produce the array {1 2 3}.

  • Exercise 19-7. erase() and erase_if() are not the only algorithms for which the remove-erase idiom is applicable. Another example is std::unique(), which is used to remove duplicates from a presorted range of elements. Write a little program that fills a vector<> with a considerably large amount of random integers between 0 and RAND_MAX, sorts this sequence, removes the duplicates, and then prints out the amount of remaining elements.

  • Exercise 19-8. Parallelize your solution to the previous exercise.