© Ivor Horton and Peter Van Weert 2018
Ivor Horton and Peter Van Weert, Beginning C++17, https://doi.org/10.1007/978-1-4842-3366-5_10

10. Program Files and Preprocessing Directives

Ivor Horton (1) and Peter Van Weert (2)
(1) Stratford-upon-Avon, Warwickshire, UK
(2) Kessel-Lo, Belgium
 

This chapter is more about managing code than writing code. We’ll discuss how multiple program files and header files interact and how you manage and control their contents. The material in this chapter has implications for how you define your data types, which you’ll learn about starting in the next chapter.

In this chapter, you will learn:
  • How header files and source files interrelate

  • What a translation unit is

  • What linkage is and why it is important

  • More about how you use namespaces

  • What preprocessing is and how to use the preprocessing directives to manage code

  • The basic ideas in debugging and the debugging help you can get from preprocessing and the Standard Library

  • How you use the static_assert keyword

Understanding Translation Units

You know that header files primarily contain declarations that are used by the source files that contain the executable code. Sure, headers can contain definitions with executable code, and source files will often declare new functionality that appears in no header as well. But for the most part, the basic idea is that header files contain function declarations and type definitions, which are used by source files to create additional function definitions. The contents of a header file are made available in a source file by using an #include preprocessing directive.

So far you have used only preexisting header files that provide the information necessary for using Standard Library capabilities. The program examples have been short and simple; consequently, they have not warranted the use of separate header files containing your own definitions. In the next chapter, when you learn how to define your own data types, the need for header files will become even more apparent. A typical practical C++ program involves many header files that are included in many source files.

Each source file, along with the contents of all the header files that you include in it, is called a translation unit. The term is somewhat abstract because a translation unit isn’t necessarily a file, although it will be with the majority of C++ implementations. The compiler processes each translation unit in a program independently to generate an object file. The object file contains machine code and information about references to entities such as functions that were not defined in the translation unit (external entities, in other words). The set of object files for a complete program is processed by the linker, which establishes all necessary connections between the object files to produce the executable program module. If an object file contains references to an external entity that is not found in any of the other object files, no executable module will result, and there will be one or more error messages from the linker. The combined process of compiling and linking translation units is referred to as translation.

The One Definition Rule

The one definition rule (ODR) is an important concept in C++. Despite its name, ODR is not really one single rule; it’s more like a set of rules—one rule per type of entity you might define in a program. Our exposition of these rules won’t be exhaustive, nor will it use the formal jargon you’d need for it to be 100 percent accurate. Our main intent is to familiarize you with the general ideas behind the ODR restrictions. This will help you later to better understand how to organize the code of your program in multiple files and to decipher and resolve the compiler and linker errors you’ll encounter when you violate an ODR rule.

In a given translation unit, no variable, function, class type, enumeration type, or template may be defined more than once. You can have more than one declaration for a variable or function, for example, but there must never be more than one definition that determines what it is and causes it to be created. If there’s more than one definition within the same translation unit, the code will not compile.

Note

A declaration introduces a name into a scope. A definition not only introduces the name but also defines what it is. In other words, all definitions are declarations, but not all declarations are definitions.
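As a minimal sketch of this distinction, the following translation unit declares each name more than once but defines it exactly once. The names square, call_count, and use_square are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

int square(int x);        // Declaration: introduces the name square
int square(int x);        // Repeating a declaration is allowed
extern int call_count;    // Declaration only: no variable is created here

int call_count {};        // Definition: creates the variable, initialized to 0

int square(int x)         // Definition: supplies the one and only function body
{
  ++call_count;
  return x * x;
}

// Uncommenting the next line would add a second definition of square() to
// this translation unit, and the code would no longer compile:
// int square(int x) { return x * x; }

// Hypothetical usage of the names declared and defined above
void use_square()
{
  const int calls_before {call_count};
  assert(square(4) == 16);
  assert(call_count == calls_before + 1);
}
```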

You have seen that you can define variables in different blocks to have the same name, but this does not violate the one definition rule; the variables may have the same name, but they are distinct.

Most functions and variables must be defined once and only once within an entire program. No two definitions are allowed, even if they’re identical and appear in different translation units. Exceptions to this rule are inline functions and variables (the latter have existed only since C++17). For inline functions and variables, a definition must appear once in every translation unit that uses them. All these definitions of a given inline function or variable have to be identical, though. For this reason, you should always define inline functions and variables in a header file that you include in a source file whenever one is required.
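The following sketch shows what such a header might contain. The header name Retry.h and the names max_attempts and may_retry are assumptions made for this illustration only. Because both entities are inline, every source file that includes the header may contain its own identical copy of these definitions.

```cpp
#include <cassert>

// --- Contents of a hypothetical header, Retry.h ---
inline const int max_attempts {3};           // Inline variable (C++17)

inline bool may_retry(int attempts_so_far)   // Inline function
{
  assert(attempts_so_far >= 0);              // A negative count would be a caller error
  return attempts_so_far < max_attempts;
}
// --- End of Retry.h ---
```

A source file that needs either name would simply include the header rather than repeat the definitions by hand, which is the easiest way to keep all copies identical.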

If you define a class or enumeration type, you usually want to use it in more than one translation unit. Several translation units in a program can thus each include a definition for the type, again provided all these definitions are identical. In practice, you achieve this by placing the definition for a type in a header file and using an #include directive to add the header file to any source file that requires the type definition. However, duplicate definitions for a given type remain illegal within a single translation unit, so you need to be careful how you define the contents of header files. You must make sure that duplicate type definitions within a translation unit cannot occur. You’ll see how you do this later in this chapter.

Note

In the next chapter, we’ll disclose how a class definition declares the various member functions and variables for that class. For the class type itself, one definition is required within every translation unit that uses it. Its members, however, follow ODR rules analogous to those of regular functions and variables. In other words, noninline class members must have only one single definition within the entire program. That’s why, unlike class type definitions, class member definitions will mostly appear in source files rather than in header files.

The one definition rule applies differently to function templates (or class member function templates, covered in Chapter 16). Because the compiler needs to know how to instantiate the template, each translation unit in which you instantiate a template using a previously unseen set of template arguments needs to contain a definition of that template. In a way, the compiler does preserve an ODR-like behavior by instantiating each template only once for a given combination of arguments.

Program Files and Linkage

Entities defined in one translation unit often need to be accessed from code in another translation unit. Functions are obvious examples of where this is the case, but you can have others—variables defined at global scope that are shared across several translation units, for instance, or the definitions of nonfundamental types. Because the compiler processes one translation unit at a time, such references can’t be resolved by the compiler. Only the linker can do this when all the object files from the translation units in the program are available.

The way that names in a translation unit are handled in the compile/link process is determined by a property that a name can have called linkage. Linkage expresses where in the program code the entity that is represented by a name can be. Every name that you use in a program either has linkage or doesn’t. A name has linkage when you can use it to access something in your program that is outside the scope in which the name is declared. If this isn’t the case, it has no linkage. If a name has linkage, then it can have internal linkage or external linkage. Therefore, every name in a translation unit has internal linkage, external linkage, or no linkage.

Determining Linkage for a Name

The linkage that applies to a name is not affected by whether its declaration appears in a header file or a source file. The linkage for each name in a translation unit is determined after the contents of any header files have been inserted into the .cpp file that is the basis for the translation unit. The linkage possibilities have the following meanings:
  • Internal linkage: The entity that the name represents can be accessed from anywhere within the same translation unit. For example, the names of non-inline variables defined at global scope that are specified as const have internal linkage by default.

  • External linkage: A name with external linkage can be accessed from another translation unit in addition to the one in which it is defined. In other words, the entity that the name represents can be shared and accessed throughout the entire program. All the functions that we have written so far have external linkage and so do both non-const and inline variables defined at global scope.

  • No linkage: When a name has no linkage, the entity it refers to can be accessed only from within the scope that applies to the name. All names that are defined within a block (local names, in other words) have no linkage.
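The three possibilities can be seen side by side in one small sketch. All names here (buffer_size, error_count, max_users, fits_in_buffer) are hypothetical; the comments state the default linkage each declaration receives under the rules listed above.

```cpp
#include <cassert>

const int buffer_size {256};        // const, non-inline, global scope: internal linkage
int error_count {};                 // Non-const, global scope: external linkage
inline const int max_users {8};     // Inline variable (C++17): external linkage

bool fits_in_buffer(int length)     // Function: external linkage
{
  assert(length >= 0);              // A negative length would be a caller error
  const bool fits {length <= buffer_size};   // fits is a local name: no linkage
  if (!fits) ++error_count;
  return fits;
}
```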

External Functions

In a program made up of several files, the linker establishes (or resolves) the connection between a function call in one source file and the function definition in another. When the compiler compiles a call to the function, it only needs the information contained in a function prototype to create the call. This prototype is contained within each declaration of the function. The compiler doesn’t mind whether the function’s definition occurs in the same file or elsewhere. This is because function names have external linkage by default. If a function is not defined within the translation unit in which it is called, the compiler flags the call as external and leaves it for the linker to sort out.

It’s high time to clarify this with a first example. For this, we’ll adapt Ex8_17.cpp and move the definition of its power() function to a different translation unit:

// Ex10_01.cpp
// Calling external functions
#include <iostream>
#include <iomanip>
double power(double x, int n);      // Declaration of an external power() function
int main()
{
  for (int i {-3}; i <= 3; ++i)     // Calculate powers of 8 from -3 to +3
    std::cout << std::setw(10) << power(8.0, i);
  std::cout << std::endl;
}

All the files for examples with more than one file will be in a separate folder in the code download, so the files for this example will be in the Ex10_01 folder.

The Ex10_01 translation unit consists of the code in Ex10_01.cpp, combined with all declarations brought in by including the iostream and iomanip headers. Note that, indirectly, these headers in turn surely #include many more Standard Library headers. That way, a lot of code may get pulled into a translation unit, even by the #include of only a single header.

Even though power() is called by main(), no definition of this function is present in the Ex10_01 translation unit. But that’s OK. All the compiler needs to carry out a call to power() is its prototype. The compiler then simply makes note of a call to an externally defined power() function inside the object file for the Ex10_01 translation unit, making it the linker’s job to hook up—or link—the call with its definition. If the linker doesn’t find the appropriate definition in one of the other translation units of the program, it will signal this as a translation failure.

To make the Ex10_01 program translate correctly, you’ll therefore need a second translation unit with the definition of power():

// Power.cpp
// The power function called from Ex10_01.cpp is defined in a different translation unit
double power(double x, int n)
{
  if (n == 0)      return 1.0;
  else if (n > 0)  return x * power(x, n - 1);
  else /* n < 0 */ return 1.0 / power(x, -n);
}

By linking the object files of the Ex10_01 and Power translation units, you obtain a program that is completely equivalent to that of Ex8_17.

Note that in order to use the power() function in Ex10_01.cpp, we still had to supply the compiler with a prototype in the beginning of the source file. It would not be very practical if you had to do this explicitly for every externally defined function. This is why function prototypes are typically gathered in header files, which you can then conveniently #include into your translation units. Later in this chapter we explain how you can create your own header files.

External Variables

Suppose that in Ex10_01.cpp, you wanted to replace the magic constants -3 and 3 using an externally defined variable power_range, like so:

  for (int i {-power_range}; i <= power_range; ++i)     // Calculate powers of 8
    std::cout << std::setw(10) << power(8.0, i);

The first step is to create an extra source file, Range.cpp, containing the variable’s definition:

// Range.cpp
int power_range{ 3 };            // A global variable with external linkage

Non-const variables have external linkage by default, just like functions do. So other translation units will have no problem accessing this variable. The interesting question, though, is this: how do you declare a variable in the Ex10_02 translation unit without it becoming a second definition? A reasonable first attempt would be this:

// Ex10_02.cpp
// Using an externally defined variable
#include <iostream>
#include <iomanip>
double power(double x, int n);      // Declaration of an external power() function
int power_range;                    // Not an unreasonable first attempt, right?
int main()
{
  for (int i {-power_range}; i <= power_range; ++i)     // Calculate powers of 8
    std::cout << std::setw(10) << power(8.0, i);
  std::cout << std::endl;
}

The compiler will have no problem with this declaration of power_range. The linker, however, will signal an error! We recommend you give this a try as well to familiarize yourself with this error message. In principle (linker error messages do not always excel in clarity), you should then be able to deduce that we have supplied two distinct definitions for power_range: one in Range.cpp and one in Ex10_02.cpp. This, of course, violates the one definition rule!

The underlying problem is that our declaration of the power_range variable in Ex10_02.cpp is not just any old variable declaration; it’s a variable definition:

int power_range;

In fact, you might’ve already known this would happen. Surely, you’ll remember that variables generally contain garbage if you neglect to initialize them. Near the end of Chapter 3, however, we also covered global variables. And global variables, as we told you, will be initialized with zero, even if you omit the braced initializer from their definition. In other words, our declaration of the global power_range variable in Ex10_02.cpp is equivalent to the following definition:

int power_range {};

ODR does not allow for two definitions of the same variable. The compiler therefore needs to be told that the definition for the global variable power_range will be external to the current translation unit, Ex10_02. If you want to access a variable that is defined outside the current translation unit, then you must declare the variable name using the extern keyword:

extern int power_range;            // Declaration of an externally defined variable

This statement is a declaration that power_range is a name that is defined elsewhere. The type must correspond exactly to the type that appears in the definition. You must not specify an initial value in such a declaration: an extern declaration with an initializer becomes a definition of the variable rather than a mere declaration of its name. Declaring a variable as extern without an initial value tells the compiler that its definition is elsewhere, typically in another translation unit. This causes the compiler to mark the use of the externally defined variable. It is the linker that makes the connection between the name and the variable to which it refers.
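The sketch below condenses the idea into a single file so that it links on its own. In the actual example, the definition of power_range belongs in Range.cpp; the helper count_exponents() is hypothetical and exists only to show the declared name in use.

```cpp
#include <cassert>

extern int power_range;        // Declaration: no initial value, no variable is created
extern int power_range;        // Repeating the declaration is harmless

int count_exponents()          // Number of exponents from -power_range to +power_range
{
  assert(power_range >= 0);    // The range is assumed to be nonnegative
  return 2 * power_range + 1;
}

// In the book's example this definition lives in Range.cpp. It is placed
// here only so that the sketch links as a single file.
int power_range {3};           // The one and only definition
```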

Note

You’re allowed to add extern specifiers in front of function declarations as well. For example, in Ex10_02.cpp you could’ve declared the power() function with an explicit extern specifier to call attention to the fact that the function’s definition will be part of a different translation unit:

extern double power(double x, int n);

While arguably nicer for consistency and code clarity, adding extern here is optional.

const Variables with External Linkage

Given its nature, you’d of course want to define the power_range variable from Range.cpp of Ex10_02 as a global constant, rather than a modifiable global variable:

// Range.cpp
const int power_range {3};

A const variable, however, has internal linkage by default, which makes it unavailable in other translation units. You can override this by using the extern keyword in the definition:

// Range.cpp
extern const int power_range {3};     // Definition of a global constant with external linkage

The extern keyword tells the compiler that the name should have external linkage, even though it is const. When you want to access power_range in another source file, you must declare it as const and external:

extern const int power_range;         // Declaration of an external global constant

You can find this in a fully functioning example in Ex10_02A. Within any block in which this declaration appears, the name power_range refers to the constant defined in another file. The declaration can appear in any translation unit that needs access to power_range. You can place the declaration either at global scope in a translation unit so that it’s available throughout the code in the source file or within a block in which case it is available only within that local scope.
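As an illustration of the block-scope option, the following sketch declares power_range inside the only function that needs it. The function is_valid_exponent() is hypothetical, and the definition at the end stands in for Range.cpp so that the sketch links as one file.

```cpp
#include <cassert>

bool is_valid_exponent(int n)
{
  extern const int power_range;     // Block-scope declaration: the name is visible only here
  assert(power_range >= 0);         // The range is assumed to be nonnegative
  return -power_range <= n && n <= power_range;
}

// At this point the name power_range is not visible at global scope.

// Stands in for Range.cpp of Ex10_02A
extern const int power_range {3};   // Definition of the constant with external linkage
```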

Global variables can be useful for constant values that you want to share because they are accessible in any translation unit. By sharing constant values across all of the program files that need access to them, you can ensure that the same values are being used for the constants throughout your program. However, although up to now we have shown constants defined in source files, the best place for them is in a header file. You’ll see an example of this later in this chapter.

Internal Names

If there’s a way to specify that names should have external linkage, surely there must be one as well to specify that they should have internal linkage, right? There is, but it’s not what you’d expect.

Let’s first illustrate when and why you’d need this possibility. Perhaps you noticed that upon every recursive call of power() of Ex10_01 the function checks whether its argument n is positive or negative. This is somewhat wasteful because the sign of n, of course, never changes. One option is to rewrite power() in the following manner:

// Power.cpp
double compute(double x, unsigned n)
{
  return n == 0 ? 1.0 : x * compute(x, n - 1);
}
double power(double x, int n)
{
  return n >= 0 ? compute(x, static_cast<unsigned>(n))
                : 1.0 / compute(x, static_cast<unsigned>(-n));
}

The power() function itself is now no longer recursive. Instead, it calls the recursive helper function compute(), which is defined to work only for nonnegative (unsigned) arguments n. Using this helper, it’s easy to rewrite power() in such a way that it checks the sign of the given n only once.

In this case, compute() could in principle be a useful function in its own right—best renamed then to become a second overload of power(), that is, one specific for unsigned exponents. For argument’s sake, however, suppose we want compute() to be nothing more than a local helper function, one that is only to be called by power(). You’ll find that the need for this occurs quite often; you need a function to make your local code clearer or to reuse within one specific translation unit, but that function is too specific for it to be exported to the rest of the program for reuse.

Our compute() function currently has external linkage as well, just like power(), and can therefore be called from within any translation unit. Worse, the one definition rule implies that no other translation unit may define a compute() function with the same signature anymore either! If all local helper functions always had external linkage, they’d all need unique names as well, which would soon make it even harder to avoid name conflicts in larger programs.

What we need is a way to tell the compiler that a function such as compute() should have internal linkage rather than external linkage. An obvious attempt would be to add an intern specifier. That might’ve worked, if not for the little detail that there’s no such keyword in C++. Instead, in the old days, the way to mark a name (function or variable name) for internal linkage was by adding the static keyword. Here’s an example:

static double compute(double x, unsigned n)       // compute() now has internal linkage
{
  return n == 0 ? 1.0 : x * compute(x, n - 1);
}

While this notation will still work—you can try it for yourself in Ex10_03—this use of the keyword static is no longer recommended. The only reason that this syntax is not deprecated or removed from the C++ Standard yet (or anymore, for those who know their history) is that you’ll still find it a lot in legacy code. Nevertheless, the recommended way to define names with internal linkage today is through unnamed namespaces, as we’ll explain later in this chapter.

Caution

Never use static anymore to mark names that should have internal linkage; always use unnamed namespaces instead.
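For reference, here is a minimal sketch of what the recommended alternative could look like for Power.cpp. It reuses compute() and power() unchanged; only the unnamed namespace around compute() is new, and the function check_power() is a hypothetical addition for illustration.

```cpp
#include <cassert>

namespace    // Unnamed namespace: the names declared inside have internal linkage
{
  double compute(double x, unsigned n)
  {
    return n == 0 ? 1.0 : x * compute(x, n - 1);
  }
}

double power(double x, int n)    // Still has external linkage
{
  return n >= 0 ? compute(x, static_cast<unsigned>(n))
                : 1.0 / compute(x, static_cast<unsigned>(-n));
}

// Hypothetical usage: compute() remains callable within this translation unit
void check_power()
{
  assert(power(2.0, 3) == 8.0);
  assert(power(2.0, -2) == 0.25);
}
```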

Preprocessing Your Source Code

Preprocessing is a process executed by the compiler before a source file is compiled into machine instructions. Preprocessing prepares and modifies the source code for the compile phase according to instructions that you specify by preprocessing directives. All preprocessing directives begin with the symbol #, so they are easy to distinguish from C++ language statements. Table 10-1 shows the complete set.
Table 10-1. Preprocessing Directives

Directive                    Description
#include                     Supports header file inclusion.
#if                          Enables conditional compilation.
#else                        else for #if.
#elif                        Equivalent to #else #if.
#endif                       Marks the end of an #if directive.
#define                      Defines an identifier.
#undef                       Deletes an identifier previously defined using #define.
#ifdef (or #if defined)      Does something if an identifier is defined.
#ifndef (or #if !defined)    Does something if an identifier is not defined.
#line                        Redefines the current line number. Optionally changes the filename as well.
#error                       Outputs a compile-time error message and stops the compilation. This is typically part of a conditional preprocessing directive sequence.
#pragma                      Offers vendor-specific features while retaining overall C++ compatibility.

The preprocessing phase analyzes, executes, and then removes all preprocessing directives from a source file. The result is a translation unit that consists purely of C++ code, which is then compiled. The linker must then process the resulting object file along with any other object files that are part of the program to produce the executable module.
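To give a first impression of how the conditional directives in Table 10-1 combine, here is a small sketch. The macro LOG_LEVEL and the names log_mode, is_verbose, and check_log_mode are assumptions made for this illustration. Only the branch whose condition holds survives preprocessing, so the compiler sees exactly one definition of log_mode.

```cpp
#include <cassert>
#include <string>

#define LOG_LEVEL 2       // Hypothetical setting; such values often come from the compiler command line

#if LOG_LEVEL == 2
const std::string log_mode {"verbose"};
#elif LOG_LEVEL == 1
const std::string log_mode {"errors only"};
#elif LOG_LEVEL == 0
const std::string log_mode {"silent"};
#else
#error LOG_LEVEL must be 0, 1, or 2
#endif

bool is_verbose() { return log_mode == "verbose"; }

// Hypothetical usage
void check_log_mode()
{
  assert(is_verbose());
}
```

Changing LOG_LEVEL to any value other than 0, 1, or 2 makes the #error directive the surviving branch, which stops the compilation with the given message.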

You may wonder why you would want to use the #line directive to change the line number. The need for this is rare, but one example is a program that maps some other language into C or C++. An original language statement may generate several C++ statements, and by using the #line directive, you can ensure that C++ compiler error messages identify the line number in the original code, rather than the C++ that results. This makes it easier to identify the statement in the original code that is the source of the error.
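A minimal sketch of the effect follows, assuming a hypothetical generator that translated line 100 of a file called calc.y into the two functions shown. The standard __LINE__ and __FILE__ macros make the adjusted line number and filename visible.

```cpp
#include <cassert>
#include <string>

// Pretend that the code below was generated from line 100 of calc.y
#line 100 "calc.y"
int generated_line() { return __LINE__; }            // This source line now counts as line 100
std::string generated_file() { return __FILE__; }    // The presumed filename is now calc.y

// Hypothetical usage
void check_line_directive()
{
  assert(generated_line() == 100);
  assert(generated_file() == "calc.y");
}
```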

Several of these directives are primarily applicable in C and are not so relevant with current C++. The language capabilities of C++ provide much more effective and safer ways of achieving the same result as some of the preprocessing directives. We’ll mostly focus on the preprocessing directives that are important in C++. You are already familiar with the #include directive. There are other directives that can provide considerable flexibility in the way in which you specify your programs. Keep in mind that preprocessing operations occur before your program is compiled. Preprocessing modifies the statements that constitute your program, and the preprocessing directives no longer exist in the source file that is compiled.

Defining Preprocessor Macros

A #define directive specifies a so-called macro. A macro is a rewrite rule that instructs the preprocessor which text replacements to apply to the source code prior to handing it over to the compiler. The simplest form of the #define preprocessing directive is the following:

#define IDENTIFIER sequence of characters

This macro effectively defines IDENTIFIER as an alias for sequence of characters. IDENTIFIER must conform to the usual definition of an identifier in C++, that is, any sequence of letters and digits, the first of which is a letter, and where the underscore character counts as a letter. A macro identifier does not have to be in all caps, though this is certainly a widely accepted convention. sequence of characters can be any sequence of characters, including an empty sequence or a sequence that contains whitespace.

One use for #define is to define an identifier that is to be replaced in the source code by a substitute string during preprocessing. Here’s how you could define PI to be an alias for a sequence of characters that represents a numerical value:

#define PI 3.14159265

PI looks like a variable, but this has nothing to do with variables. PI is a symbol, or token, that is exchanged for the specified sequence of characters by the preprocessor before the code is compiled. 3.14159265 is not a numerical value in the sense that no validation is taking place; it is merely a string of characters. The string PI will be replaced during preprocessing by its definition, the sequence of characters 3.14159265, wherever the preprocessing operation deems that the substitution makes sense. If you wrote 3,!4!5 as the replacement character sequence, the substitution would still occur.

The #define directive is often used to define symbolic constants in C, but don’t do this in C++. It is much better to define a constant variable, like this:

inline const double pi {3.14159265358979323846};

pi is a constant value of a particular type. The compiler ensures that the value for pi is consistent with its type. You could place this definition in a header file for inclusion in any source file where the value is required or define it with external linkage:

extern const double pi {3.14159265358979323846};

Now you may access pi from any translation unit just by adding an extern declaration for it wherever it is required.

Note

As of C++17, the C++ Standard Library does not define any mathematical constants, not even one as fundamental as π. Nevertheless, most compiler libraries offer nonstandard definitions of π, so it may be worth checking your documentation first. Otherwise, the easiest portable solution is to define a simple macro for it or otherwise one constant per floating-point type.1 When doing so, however, it is critical that you use sufficient digits after the decimal point. Never define π, for instance, as simply 3.1415; using such an approximation would give particularly inaccurate results! The pi constant we defined earlier in this section has sufficient digits to be safe for use in float, double, or long double computations on most platforms.

Caution

Using a #define directive to define an identifier that you use to specify a value in C++ code has three major disadvantages: there’s no type checking support, it doesn’t respect scope, and the identifier name cannot be bound within a namespace. In C++, you should always use const variables instead.
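The following sketch illustrates what a const variable offers that a macro cannot. The namespaces geometry and approximations and the function circle_area() are hypothetical names used only for this example. Each constant has a type that the compiler can check, and the two constants named pi coexist because each is bound within its own namespace.

```cpp
#include <cassert>
#include <type_traits>

namespace geometry
{
  inline const double pi {3.14159265358979323846};   // Typed constant, scoped to geometry
}

namespace approximations
{
  inline const float pi {3.14159265f};                // A different pi; no clash with geometry::pi
}

// The compiler knows the type of each constant; a macro has no type at all
static_assert(std::is_same_v<decltype(geometry::pi), const double>);
static_assert(std::is_same_v<decltype(approximations::pi), const float>);

double circle_area(double radius)
{
  assert(radius >= 0.0);                              // A negative radius would be a caller error
  return geometry::pi * radius * radius;
}
```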

Here’s another example:

#define BLACK WHITE

Any occurrence of BLACK in the file will be replaced by WHITE. The identifier will be replaced only when it is a token. It will not be replaced if it forms part of an identifier or appears in a string literal or a comment. There’s no restriction on the sequence of characters that is to replace the identifier. It can even be absent in which case the identifier exists but with no predefined substitution string—the substitution string is empty. If you don’t specify a substitution string for an identifier, then occurrences of the identifier in the code will be replaced by an empty string; in other words, the identifier will be removed. Here’s an example:

#define VALUE

The effect is that all occurrences of VALUE that follow the directive will be removed. The directive also defines VALUE as an identifier, and its existence can be tested by other directives, as you’ll see.
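Both effects can be observed in a short sketch. The identifiers EMPTY and TRACE_ENABLED and the variable names are assumptions made for illustration. The first part shows an identifier being removed from the code; the second shows its existence being tested with #ifdef before and after an #undef.

```cpp
#include <cassert>

#define EMPTY
const int EMPTY answer EMPTY {42};   // After preprocessing: const int answer {42};

#define TRACE_ENABLED                // Defined, but with an empty substitution

#ifdef TRACE_ENABLED
const bool tracing_before_undef {true};
#else
const bool tracing_before_undef {false};
#endif

#undef TRACE_ENABLED                 // From here on, the identifier is no longer defined

#ifdef TRACE_ENABLED
const bool tracing_after_undef {true};
#else
const bool tracing_after_undef {false};
#endif

// Hypothetical usage
void check_tracing_flags()
{
  assert(tracing_before_undef && !tracing_after_undef);
}
```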

Note that the preprocessor is completely agnostic about C or C++. It will blindly perform any replacement you ask it to do, even if the result is no longer valid C or C++ code. You could even use it to replace C++ keywords as follows (we’ll leave it up to you to decide whether you should…):

#define true false
#define break

The major use for the #define directive with C++ is in the management of header files, as you’ll see later in this chapter.

Defining Function-Like Macros

The #define directives you’ve seen so far have been similar to variable definitions in C++. You can define function-like text replacement macros as well. Here’s an example:

#define MAX(A, B) A >= B ? A : B

While this looks an awful lot like one, this most certainly isn’t a function. There are no argument types, nor is there a return value. A macro is not something that is called, nor does its right side necessarily specify statements to be executed at runtime. Our sample macro simply instructs the preprocessor to replace all occurrences of MAX(anything1, anything2) in the source code with the character sequence that appears in the second half of the #define directive. During this replacement process, all occurrences of A in A >= B ? A : B are of course replaced by anything1, and all occurrences of B are replaced with anything2. The preprocessor makes no attempt at interpreting the anything1 and anything2 character sequences; all it does is blind text replacement. Suppose, for instance, your code contains this statement:

std::cout << MAX(expensive_computation(), 0) << std::endl;

Then the preprocessor expands it to the following source code before handing it over to the compiler:

std::cout << expensive_computation() >= 0 ? expensive_computation() : 0 << std::endl;

This example exposes two problems:
  • The resulting code will not compile. If you use the ternary operator together with the streaming operator <<, the operator precedence rules tell us that the expression with the ternary operator should be between parentheses. A better definition of our MAX() macro would therefore be the following:

    #define MAX(A, B) (A >= B ? A : B)

    Even better would be to add parentheses around all occurrences of A and B to avoid similar operator precedence problems there as well:

    #define MAX(A, B) ((A) >= (B) ? (A) : (B))

  • The expensive_computation() function is called up to two times. If a macro parameter such as A appears more than once in the replacement, the preprocessor will blindly copy the macro arguments more than once. This undesired behavior with macros is harder to avoid.

These are just two of the common pitfalls with macro definitions. We therefore recommend you never create function-like macros, unless you have a good reason for doing so. While some advanced scenarios do call for such macros, C++ mostly offers alternatives that are far superior. Macros are popular among C programmers because they allow the creation of function-like constructs that work for any parameter type. But of course you already know that C++ offers a much better solution for this: function templates. After Chapter 9, it should be a breeze for you to define a C++ function template that replaces the C-style MAX() macro. And this template inherently avoids both the shortcomings of macro definitions we listed earlier.
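One possible shape for such a template is sketched below, next to the fully parenthesized macro for comparison. The names larger, next_value, and evaluations are hypothetical; next_value() stands in for expensive_computation() and counts how often it is called.

```cpp
#include <cassert>

#define MAX(A, B) ((A) >= (B) ? (A) : (B))     // The fully parenthesized macro

template <typename T>
const T& larger(const T& a, const T& b)        // Hypothetical template replacement
{
  return a >= b ? a : b;
}

int evaluations {};                            // Counts the calls of next_value()

int next_value()                               // Stands in for expensive_computation()
{
  ++evaluations;
  return 42;
}

int evaluations_with_macro()
{
  evaluations = 0;
  const int result {MAX(next_value(), 0)};     // The expansion contains two calls of next_value()
  assert(result == 42);
  return evaluations;
}

int evaluations_with_template()
{
  evaluations = 0;
  const int result {larger(next_value(), 0)};  // The argument is evaluated exactly once
  assert(result == 42);
  return evaluations;
}
```

With these definitions, evaluations_with_macro() reports two calls and evaluations_with_template() reports one, because function arguments are evaluated before the call rather than pasted into the function body.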

Caution

Never use preprocessor macros to define operations such as min(), max(), or abs(). Instead, you should always use either regular C++ functions or function templates. Function templates are far superior to preprocessor macros for defining blueprints of functions that work for any argument type. In fact, the Standard Library already offers such functions: std::min() and std::max() are function templates defined in the algorithm header, and std::abs() is a set of overloaded functions provided by the cmath and cstdlib headers. There is therefore often no need for you to define them yourself.
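For illustration, here is a short sketch of how the Standard Library functions might be used instead of macros. As far as the Standard Library goes, std::min() and std::max() are function templates declared in the algorithm header, whereas std::abs() is a set of overloaded functions declared in cmath (floating-point types) and cstdlib (integer types). The helper names spread() and gap() are our own and purely hypothetical.

```cpp
#include <algorithm>   // std::min(), std::max()
#include <cmath>       // std::abs() for floating-point types

// Hypothetical helper: difference between the largest and smallest of three values
constexpr int spread(int a, int b, int c)
{
  return std::max({a, b, c}) - std::min({a, b, c});
}

// Hypothetical helper: distance between two values, regardless of their order
inline double gap(double a, double b)
{
  return std::abs(a - b);
}
```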

Preprocessor Operators

For completeness, Table 10-2 lists the two operators you can apply to the parameters of a function-like text replacement macro.
Table 10-2. Preprocessor Operators

Operator    Description

#           The so-called stringification operator. Turns the argument into a string literal containing its value (by surrounding it with double quotes and adding the necessary character escape sequences).

##          The concatenation operator. Concatenates (pastes together, similar to what the + operator does for the values of two std::strings) the values of two identifiers.

The following toy program illustrates how you might use these operators:

// Ex10_04.cpp
// Working with preprocessor operators
#include <iostream>
#define DEFINE_PRINT_FUNCTION(NAME, COUNT, VALUE) \
  void NAME##COUNT() { std::cout << #VALUE << std::endl; }
DEFINE_PRINT_FUNCTION(fun, 123, Test 1 "2" 3)
int main()
{
  fun123();
}

Before we get to the use of both preprocessor operators, Ex10_04.cpp shows one additional thing: macro definitions are not really allowed to span multiple lines. By default, the preprocessor simply replaces any occurrences of the macro’s identifier (possibly taking a number of arguments) with all the characters it finds on the same line to the right of the identifier. However, it is not always practical to fit the entire definition on one single line. The preprocessor therefore allows you to add line breaks, as long as they are immediately preceded with a backslash character. All such escaped line breaks are discarded from the substitution. That is, the preprocessor first concatenates the entire macro definition back into one single line (in fact, it does so even outside of a macro definition).

Note

In Ex10_04.cpp we added the line break before the right side of the macro definition, which is probably the most natural thing to do. But since the preprocessor always just stitches any sliced lines back together, without interpreting the characters, such escaped line breaks can really appear anywhere you want. Not that this is in any way recommended, but this means you could in extremis even write the following:

#define DEFINE_PRINT_FUNCT\
ION(NAME, COUNT, VALUE) vo\
id NAME##COUNT() { std::co\
ut << #VALUE << std::endl; }

Mind you, if you do splice identifiers like this, for whatever crazy reason, take care not to add whitespace characters at the beginning of the next line, as these are not discarded by the preprocessor when it puts the pieces back together.

Enough about line breaks; let’s get back to the topic at hand: preprocessor operators. The macro definition in Ex10_04 uses both ## and #:

#define DEFINE_PRINT_FUNCTION(NAME, COUNT, VALUE) \
  void NAME##COUNT() { std::cout << #VALUE << std::endl; }

With this definition, the line DEFINE_PRINT_FUNCTION(fun, 123, Test 1 "2" 3) in Ex10_04.cpp expands to the following:

  void fun123() { std::cout << "Test 1 \"2\" 3" << std::endl; }

Without the ## operator, you’d have the choice between either NAMECOUNT or NAME COUNT. In the former, the preprocessor would not recognize NAME or COUNT, whereas the latter in our example would expand to fun 123, which is not a valid function name (C++ identifiers must not contain spaces). And, clearly, without the # operator you’d have a hard time turning a given character sequence into a valid C++ string literal.

Because the preprocessor runs first, the fun123() function definition will thus be present by the time the C++ compiler gets to see the source code. This is why you can call fun123() in the program’s main() function, where it produces the following result:

Test 1 "2" 3
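If you want to see each operator in isolation, the following sketch may help. The macros STRINGIFY and PASTE and the names limit42 and text are hypothetical; they are not part of Ex10_04.cpp. The sketch assumes a C++17 compiler because it uses std::string_view.

```cpp
#include <string_view>

// Hypothetical helper macros, not part of Ex10_04.cpp
#define STRINGIFY(X) #X        // turns X into a string literal
#define PASTE(A, B)  A##B      // pastes A and B together into one token

// Expands to: constexpr int limit42 { 42 };
constexpr int PASTE(limit, 42) { 42 };

// Expands to: constexpr std::string_view text { "Test 1 \"2\" 3" };
constexpr std::string_view text { STRINGIFY(Test 1 "2" 3) };
```

Note how the double quotes inside the macro argument are escaped automatically by the # operator, just as in the expansion of DEFINE_PRINT_FUNCTION.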

Undefining Macros

You may want to have the identifier resulting from a #define directive exist in only part of a program file. You can nullify a definition for an identifier using the #undef directive. For example, you can cancel a previously defined VALUE macro with this directive:

#undef VALUE

VALUE is no longer defined following this directive, so no substitutions for VALUE can occur. The following code fragment illustrates this:

#define PI 3.141592653589793238462643383279502884
// All occurrences of PI in code from this point will be replaced
// by 3.141592653589793238462643383279502884
// ...
#undef PI
// PI is no longer defined from here on so no substitutions occur.
// Any references to PI will be left in the code.

Between the #define and #undef directives, preprocessing replaces appropriate occurrences of PI in the code with 3.141592653589793238462643383279502884. Elsewhere, occurrences of PI are left as they are. The #undef directive also works for function-like macros. Here’s an example:

#undef MAX
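The following sketch shows the effect of #undef on a compilable fragment. BUFFER_SIZE, small_buffer(), and large_buffer() are hypothetical names used only for this illustration. Reusing an all-uppercase name for a variable is not something we would recommend in real code; it is done here only to show that the identifier is no longer substituted.

```cpp
// BUFFER_SIZE is a hypothetical macro used only for this illustration
#define BUFFER_SIZE 64
constexpr int small_buffer() { return BUFFER_SIZE; }   // becomes: return 64;
#undef BUFFER_SIZE

// From here on, BUFFER_SIZE is an ordinary identifier again,
// so it can even be reused as the name of a variable.
constexpr int BUFFER_SIZE { 256 };
constexpr int large_buffer() { return BUFFER_SIZE; }   // refers to the variable
```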

Including Header Files

A header file is an external file whose contents are included in a source file using the #include preprocessing directive. Header files contain primarily type definitions, template definitions, function prototypes, and constants. You are already completely familiar with statements such as this:

#include <iostream>

The contents of the iostream Standard Library header replace the #include directive. These are the definitions required to support input and output with the standard streams. Any Standard Library header name can appear between the angled brackets. If you #include a header that you don’t need, the primary effect is to extend the compilation time, and the executable may occupy more memory than necessary. It may also be confusing for anyone who reads the program.

You include your own header files into a source file with a slightly different syntax where you enclose the header file name between double quotes. Here’s an example:

#include "myheader.h"

The contents of the file named myheader.h are introduced into the program in place of the #include directive. The contents of any file can be included into your program in this way. You simply specify the file name of the file between quotes as in the example. With the majority of compilers, the file name can use uppercase and lowercase characters. In theory, you can assign any name and extension you like to your header files; you don’t have to use the extension .h. However, it is a convention adhered to by most C++ programmers, and we recommend that you follow it too.

Note

Some libraries use the .hpp extension for C++ header files and reserve the use of the .h extension for header files that contain either pure C functions or functions that are compatible with both C and C++. Mixing C and C++ code is an advanced topic, which we do not cover in this book.

The process used to find a header file depends on whether you specify the file name between double quotes or between angled brackets. The precise operation is implementation-dependent and should be described in your compiler documentation. Usually, the compiler only searches the default directories that contain the Standard Library headers for the file when the name is between angled brackets. This implies that your header files will not be found if you put the name between angled brackets. If the header name is between double quotes, the compiler searches the current directory (typically the directory containing the source file that is being compiled) followed by the directories containing the standard headers. If the header file is in some other directory, you may need to put the complete path for the header file or the path relative to the directory containing the source file between the double quotes.

Preventing Duplication of Header File Contents

A header file that you include into a source file can contain #include directives of its own, and this process can go on many levels deep. This feature is used extensively in large programs and in the Standard Library headers. With a complex program involving many header files, there’s a good chance that a header file will be #included more than once in a source file. In some situations this may even be unavoidable. The one definition rule, however, prohibits the same definition from appearing more than once in the same translation unit. We therefore need a way to prevent this from occurring.

Note

Occasionally, a header that is included into some header A.h may even, directly or indirectly, include header A.h again. Without a mechanism to prevent the same header from being included into itself, this would introduce an infinite recursion of #includes, causing the compiler to implode into itself.

You have already seen that you don’t have to specify a value when you define an identifier:

#define MY_IDENTIFIER

This creates MY_IDENTIFIER, so it exists from here on and represents an empty character sequence. You can use the #if defined directive to test whether a given identifier has been defined and include code or not in the file depending on the result:

#if defined MY_IDENTIFIER
  // The code here will be placed in the source file if MY_IDENTIFIER has been defined.
  // Otherwise it will be omitted.
#endif

All the lines following #if defined up to the #endif directive will be kept in the file if the identifier, MY_IDENTIFIER, has been defined previously and omitted if it has not. The #endif directive marks the end of the text that is controlled by the #if defined directive. You can use the abbreviated form, #ifdef, if you prefer:

#ifdef MY_IDENTIFIER
  // The code here will be placed in the source file if MY_IDENTIFIER has been defined.
  // Otherwise it will be omitted.
#endif
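As a small sketch of these two forms in action, consider the following fragment. The identifiers USE_KILOMETERS and USE_MILES and the constant km_per_unit are hypothetical. Both blocks define the same constant, so the fragment only compiles because the preprocessor keeps exactly one of them.

```cpp
#define USE_KILOMETERS            // hypothetical identifier, defined without a value

#if defined USE_KILOMETERS        // kept: USE_KILOMETERS has been defined
constexpr double km_per_unit { 1.0 };
#endif

#ifdef USE_MILES                  // omitted: USE_MILES has not been defined
constexpr double km_per_unit { 1.609344 };
#endif
```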

You can use the #if !defined or its equivalent, #ifndef, to test for an identifier not having been defined:

#if !defined MY_IDENTIFIER
  // The code down to #endif will be placed in the source file
  // if MY_IDENTIFIER has NOT been defined. Otherwise, the code will be omitted.
#endif

Here, the lines following #if !defined down to the #endif are included in the file to be compiled provided the identifier has not been defined previously. This pattern is the basis for the mechanism that is used to ensure that the contents of a header file are not duplicated in a translation unit:

// Header file myheader.h
#ifndef MYHEADER_H
#define MYHEADER_H
  // The entire code for myheader.h is placed here.
  // This code will be placed in the source file,
  // but only if MYHEADER_H has NOT been defined previously.
#endif

If a header file, myheader.h, that has contents like this is included into a source file more than once, the first #include directive will include the code because MYHEADER_H has not been defined. In the process it will define MYHEADER_H. Any subsequent #include directives for myheader.h in the source file or in other header files that are included into the source file will not include the code because MYHEADER_H will have been defined previously.

Naturally, you should choose a unique identifier to use instead of MYHEADER_H for each header. Different naming conventions are used, although most base these names on that of the header file itself. In this book we’ll use identifiers of the form HEADERNAME_H.

The previous #ifndef - #define - #endif pattern is common enough to have its own name; this particular combination of preprocessor directives is called an #include guard. All header files should be surrounded with an #include guard to eliminate the potential for violations of the one definition rule.
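To see why the guard works, it can help to look at what the compiler effectively receives after a guarded header has been included twice. The sketch below simulates this within a single file; the header name Settings.h, the guard SETTINGS_H, and the constant max_items are hypothetical.

```cpp
// What the compiler effectively sees after a hypothetical header Settings.h
// has been #included twice into the same translation unit.

// --- first inclusion of Settings.h ---
#ifndef SETTINGS_H
#define SETTINGS_H
constexpr int max_items { 100 };
#endif

// --- second inclusion of Settings.h ---
#ifndef SETTINGS_H                // SETTINGS_H is defined by now, so this is skipped
#define SETTINGS_H
constexpr int max_items { 100 };
#endif
```

Without the three guard directives, max_items would be defined twice in the same translation unit, and the compiler would report a redefinition error.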

Tip

Most compilers offer a #pragma directive to achieve the same effect as the pattern we have described. With nearly all compilers, placing a line containing #pragma once at the beginning of a header file is all that is necessary to prevent duplication of the contents. While nearly all compilers support this #pragma, it is not standard C++, so for this book we’ll continue to use #include guards.

Your First Header File

For your first header file, we start again from Ex10_01 and this time put the prototype declaration of the power function into its own header file. Power.cpp remains the same as before. The only difference is the way the function prototype of power() makes it into the main translation unit:

// Ex10_05.cpp
// Creating and including your own header file
#include <iostream>
#include <iomanip>
#include "Power.h"        // Contains the prototype for the power() function
int main()
{
  for (int i {-3}; i <= 3; ++i)     // Calculate powers of 8 from -3 to +3
    std::cout << std::setw(10) << power(8.0, i);
  std::cout << std::endl;
}

This source file is completely identical to Ex10_01.cpp, except that this time the declaration of power() is pulled in by including this Power.h header:

// Power.h
#ifndef POWER_H
#define POWER_H
// Function to calculate x to the power n
double power(double x, int n);
#endif

Usually, you will not create a header for each individual function. The Power.h header could be the start of a larger Math.h header that groups any number of useful, reusable mathematical functions.

Tip

Naturally the Standard Library already offers a header with mathematical functions. It’s called cmath, and it defines more than 75 different functions and function templates for common (and some far less common) mathematical functions. One of these functions, of course, is std::pow().

Namespaces

We introduced namespaces in Chapter 1, but there’s a bit more to it than we explained then. With large programs, choosing unique names for all the entities that have external linkage can become difficult. When an application is developed by several programmers working in parallel and/or when it incorporates headers and source code from various third-party C++ libraries, using namespaces to prevent name clashes becomes essential. Name clashes are perhaps most likely in the context of user-defined types, or classes, which you will meet in the next few chapters.

A namespace is a block that attaches an extra name—the namespace name—to every entity name that is declared or defined within it. The full name of each entity is the namespace name followed by the scope resolution operator, ::, followed by the basic entity name. Different namespaces can contain entities with the same name, but the entities are differentiated because they are qualified by different namespace names.

You typically use a separate namespace within a single program for each collection of code that encompasses a common purpose. Each namespace would represent some logical grouping of functions, together with any related global variables and declarations. A namespace would also be used to contain a unit of release, such as a library.

You are already aware that Standard Library names are declared within the std namespace. You also know that you can reference any name from a namespace without qualifying it with the namespace name by using a blanket using directive:

using namespace std;

However, this risks defeating the purpose of using namespaces in the first place and increases the likelihood of errors because of the accidental use of a name in the std namespace. It is thus often better to use qualified names or add using declarations for the names from another namespace that you are referencing.

Tip

Especially in header files, both using directives and using declarations are considered “not done” because they force anyone who wants to use the header’s types and functionality to #include these using directives and declarations as well. Inside source files, however, opinions differ. Personally we believe that using should be used only sporadically, for longer or nested namespaces, and as locally as possible. So, you should definitely not use it for the three-letter std namespace. But, like we said, opinions are divided on the subject, as with most matters of coding style. It is therefore always best to check what the conventions are within your team or company.

The Global Namespace

All the programs that you’ve written so far have used names that you defined in the global namespace. The global namespace applies by default if a namespace hasn’t been defined. All names within the global namespace are just as you declare them, without a namespace name being attached. In a program with multiple source files, all the names with linkage are within the global namespace.

To explicitly access names defined in the global namespace, you use the scope resolution operator without a left operand, for example, ::power(2.0, 3). This is only really required, though, if there is a more local declaration with the same name that hides that global name.
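A minimal sketch of such a situation follows. The names limit and sum_of_limits() are hypothetical and serve only to show a local declaration hiding a global one.

```cpp
constexpr int limit { 100 };          // defined in the global namespace

// Hypothetical function in which a local variable hides the global limit
constexpr int sum_of_limits()
{
  const int limit { 5 };              // hides the global name limit
  return limit + ::limit;             // 5 (local) + 100 (global)
}
```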

With small programs, you can define your names within the global namespace without running into any problems. With larger applications, the potential for name clashes increases, so you should use namespaces to partition your code into logical groupings. That way, each code segment is self-contained from a naming perspective, and name clashes are prevented.

Defining a Namespace

You can define a namespace with these statements:

namespace myRegion
{
  // Code you want to have in the namespace,
  // including function definitions and declarations,
  // global variables, enum types, templates, etc.
}

Note that no semicolon is required after the closing brace in a namespace definition. The namespace name here is myRegion. The braces enclose the scope for the namespace myRegion, and every name declared within the namespace scope has the name myRegion attached to it.

Caution

You must not include the main() function within a namespace. The runtime environment expects main() to be defined in the global namespace.

You can extend a namespace scope by adding a second namespace block with the same name. For example, a program file might contain the following:

namespace calc
{
  // This defines namespace calc
  // The initial code in the namespace goes here
}
namespace sort
{
  // Code in a new namespace, sort
}
namespace calc
{
  /* This extends the calc namespace
     Code in here can refer to names in the previous
     calc namespace block without qualification */
}

There are two blocks defined as namespace calc, separated by a namespace sort. The second calc block is treated as a continuation of the first, so functions declared within each of the calc blocks belong to the same namespace. The second block is called an extension namespace definition because it extends the original namespace definition. You can have several extension namespace definitions in a translation unit.
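Filled in with some actual code, the same layout might look like the following sketch. The functions square(), in_order(), and sum_of_squares() are hypothetical and exist only to make the fragment compilable.

```cpp
namespace calc
{
  constexpr int square(int x) { return x * x; }
}
namespace sort
{
  constexpr bool in_order(int a, int b) { return a <= b; }
}
namespace calc   // extension namespace definition
{
  // square() is found without qualification because it is in the same namespace
  constexpr int sum_of_squares(int a, int b) { return square(a) + square(b); }
}
```

From outside these namespaces, you would refer to the functions as calc::square(), calc::sum_of_squares(), and sort::in_order().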

Of course, you wouldn’t choose to organize a source file so that it contains multiple namespace blocks in this way, but it can occur anyway. If you include several header files into a source file, then you may effectively end up with the sort of situation we just described. An example of this is when you include several Standard Library headers (each of which contributes to the namespace std), interspersed with your own header files:

#include <iostream>                    // In namespace std
#include "mystuff.h"                   // In namespace calc
#include <string>                      // In namespace std – extension namespace
#include "morestuff.h"                 // In namespace calc – extension namespace

Note that references to names from inside the same namespace do not need to be qualified. For example, names that are defined in the namespace calc can be referenced from within calc without qualifying them with the namespace name.

Let’s look at an example that illustrates the mechanics of declaring and using a namespace. Of course, now that you can create your own header files, you should organize the code nicely in multiple files from now on. The program will therefore consist of two files: one header file and one source file. The header file defines a few common mathematical constants in a namespace named constants:

// Constants.h
// Using a namespace
#ifndef CONSTANTS_H
#define CONSTANTS_H
namespace constants
{
  inline const double pi { 3.14159265358979323846 };      // the famous pi constant
  inline const double e  { 2.71828182845904523536 };      // base of the natural logarithm
  inline const double sqrt_2 { 1.41421356237309504880 };  // square root of 2
}
#endif

The #include guard makes sure that these definitions will never appear more than once in the same translation unit. That does not prevent them from being #included into two or more distinct translation units, though. All three constants are therefore defined to be inline variables as well. This allows their definitions to appear in multiple translation units without violating ODR. If your compiler does not support inline variables yet (this language feature was introduced with C++17), you have to move the variable definitions to a source file instead. The header might then look as follows:

// Constants.h
// Declares three constants that are defined externally
#ifndef CONSTANTS_H
#define CONSTANTS_H
namespace constants
{
  extern const double pi;         // the famous pi constant
  extern const double e;          // base of the natural logarithm
  extern const double sqrt_2;     // square root of 2
}
#endif

You can find the corresponding source file in the Ex10_06A directory of the online downloads. It is similar to our original Constants.h, except that there’s no #include guard, and all occurrences of the inline keyword have been replaced with extern.

Either way, the main source file can use these constants as follows:

// Ex10_06.cpp
// Using a namespace
#include <iostream>
#include "Constants.h"
int main()
{
  std::cout << "pi has the value " << constants::pi << std::endl;
  std::cout << "This should be 2: " << constants::sqrt_2 * constants::sqrt_2 << std::endl;
}

This example produces the following output:

pi has the value 3.14159
This should be 2: 2

Applying using Declarations

Just to formalize what we have been doing in previous examples, we’ll remind you of the using declaration for a single name from a namespace:

using namespace_name::identifier;

using is a keyword, namespace_name is the name of the namespace, and identifier is the name that you want to use unqualified. This declaration introduces a single name from the namespace, which could represent anything that has a name. For instance, a set of overloaded functions defined within a namespace can be introduced with a single using declaration.

Although we’ve placed using declarations and directives at global scope in the examples, you can also place them within a namespace, within a function, or even within a statement block. In each case, the declaration or directive applies until the end of the block that contains it.
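Here is a small sketch that combines both points: a single using declaration that introduces a set of overloaded functions, placed at function scope. The geometry namespace, its area() overloads, and total_area() are hypothetical names.

```cpp
namespace geometry   // hypothetical namespace
{
  constexpr int area(int side) { return side * side; }                   // square
  constexpr int area(int width, int height) { return width * height; }   // rectangle
}

constexpr int total_area()
{
  using geometry::area;          // one declaration introduces both overloads
  return area(4) + area(3, 5);   // 16 + 15
}                                // the using declaration stops applying here
```

Outside total_area(), the functions still have to be referred to as geometry::area().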

Note

When you use an unqualified name, the compiler first tries to find the definition in the current scope, prior to the point at which it is used. If the definition is not found, the compiler looks in the immediately enclosing scope. This continues until the global scope is reached. If a definition for the name is not found at global scope (which could be an extern declaration), the compiler concludes that the name is not defined.

Functions and Namespaces

For a function to exist within a namespace, it is sufficient for the function prototype to appear in the namespace. You can define the function elsewhere using the qualified name for the function; in other words, the function definition doesn’t have to be enclosed in a namespace block (it can, but it doesn’t have to be). Let’s explore an example. Suppose you write two functions, max() and min(), that return the maximum and minimum of a vector of values. You can put the declarations for the functions in a namespace as follows:

// compare.h
// For Ex10_07.cpp
#ifndef COMPARE_H
#define COMPARE_H
#include <vector>
namespace compare
{
  double max(const std::vector<double>& data);
  double min(const std::vector<double>& data);
}
#endif

This code would be in a header file, compare.h, which can be included by any source file that uses the functions. The definitions for the functions can now appear in a .cpp file. You can write the definitions without enclosing them in a namespace block, as long as the name of each function is qualified with the namespace name. The contents of the file would be as follows:

// compare.cpp
// For Ex10_07.cpp
#include "compare.h"
#include <vector>             // For std::vector<> (also included by compare.h)
#include <limits>             // For std::numeric_limits<>::infinity()
// Function to find the maximum
double compare::max(const std::vector<double>& data)
{
  double result { -std::numeric_limits<double>::infinity() };
  for (const auto value : data)
    if (value > result) result = value;
  return result;
}
// Function to find the minimum
double compare::min(const std::vector<double>& data)
{
  double result { std::numeric_limits<double>::infinity() };
  for (const auto value : data)
    if (value < result) result = value;
  return result;
}

You need the compare.h header file to be included so that the compare namespace is identified. This allows the compiler to deduce that the min() and max() functions are declared within the namespace. Next, there’s an #include directive for limits. This Standard Library header provides facilities to query for properties and special values of numeric data types. In this particular case, we use it to obtain the special value that the fundamental double type normally has for infinity. Positive infinity (mathematical notation +∞) is a special double value greater than any other double value, and negative infinity (-∞) is one that is less than any other double. The standard syntax to obtain positive infinity in C++ is std::numeric_limits<double>::infinity(); negating it gives negative infinity. These values readily allow us to write min() and max() functions that do something sensible for empty vectors as well.

Note that there’s an #include directive for the vector header that is also included in compare.h. The contents of the vector header will appear only once in this file because all the Standard Library headers have preprocessing directives to prevent duplication. In general, it can sometimes be a good idea to have #include directives for every header that a file uses, even when one header may include another header that you use. This makes the file independent of potential changes to the header files.

You could place the code for the function definitions within the compare namespace directly as well. In that case, the contents of compare.cpp would be as follows:

#include <vector>
#include <limits>             // For std::numeric_limits<>::infinity()
namespace compare
{
  double max(const std::vector<double>& data)
  {
    // Code for max() as above...
  }
  double min(const std::vector<double>& data)
  {
    // Code for min() as above...
  }
}

If you write the function definitions in this way, then you don’t need to #include compare.h into this file. This is because the definitions are within the namespace. Doing it this way, however, is really unconventional. You’d normally #include a header MyFunctionality.h in a source file with the same base name, MyFunctionality.cpp, and define all functions there by explicitly qualifying them with their namespace using ::.

Using the min() and max() functions is the same, however you have defined them. To confirm how easy it is, let’s try it with the functions you’ve just defined. Create the compare.h header file with the contents we discussed earlier. Create the first version of compare.cpp, where the functions are defined outside a namespace block using their qualified names. All you need now is a .cpp file containing the definition of main() to try the functions:

// Ex10_07.cpp
// Using functions in a namespace
#include <iostream>
#include <vector>
#include "compare.h"
using compare::max;                    // Using declaration for max
int main()
{
  using compare::min;                  // Using declaration for min
  std::vector<double> data {1.5, 4.6, 3.1, 1.1, 3.8, 2.1};
  std::cout << "Minimum double is " << min(data) << std::endl;
  std::cout << "Maximum double is " << max(data) << std::endl;
}

If you compile the two .cpp files and link them, executing the program produces the following output:

Minimum double is 1.1
Maximum double is 4.6

There is a using declaration for each of the two functions declared in compare.h, so you can use their names without having to add the namespace name. Just to show you that it’s possible, we’ve added the declaration for compare::max() at global scope and that for compare::min() at function scope. The result is that max() can be used unqualified in the entire source file, but min() only within the main() function. Of course, this distinction would’ve made much more sense had there been multiple functions in your source file.

You could equally well have used a using directive for the compare namespace in this case:

using namespace compare;

The namespace only contains the functions max() and min(), so this would have been just as good and one less line of code. You can again insert this using namespace directive either at the global or function scope. Either way, the semantics are as you’d expect.

Without the using declarations for the function names (or a using directive for the compare namespace), you would have to qualify the functions like this:

  std::cout << "Minimum double is " << compare::min(data) << std::endl;
  std::cout << "Maximum double is " << compare::max(data) << std::endl;

Unnamed Namespaces

You don’t have to assign a name to a namespace, but this doesn’t mean it doesn’t have a name. You can declare an unnamed namespace with the following code:

namespace
{
   // Code in the namespace, functions, etc.
}

This creates a namespace that has a unique internal name that is generated by the compiler. Only one “unnamed” namespace exists within each translation unit, so additional namespace declarations without a name will be extensions of the first. However, unnamed namespaces within distinct translation units always are distinct unnamed namespaces.

Note that an unnamed namespace is not within the global namespace. This fact, combined with the fact that an unnamed namespace is unique to a translation unit, has significant consequences. It means that functions, variables, and anything else declared within an unnamed namespace are local to the translation unit in which they are defined. They can’t be accessed from another translation unit. It should come therefore as no surprise that the compiler assigns internal linkage to all names declared in an unnamed namespace.

In other words, placing function definitions within an unnamed namespace has the same effect as declaring the functions as static in the global namespace. Declaring functions and variables as static at global scope used to be how one ensured they weren’t accessible outside their translation unit. However, as noted before, this practice is de facto deprecated. An unnamed namespace is a much better way of restricting accessibility where necessary.

Tip

All names declared inside an unnamed namespace have internal linkage (even names defined with an extern specifier). If a function is not supposed to be accessible from outside a particular translation unit, you should always define it in an unnamed namespace. Using a static specifier for this purpose is no longer recommended.
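A brief sketch of how this might look in a source file follows. The helper doubled() and the function doubled_plus_one() are hypothetical names. Within the same translation unit, names from the unnamed namespace are used without any qualification.

```cpp
namespace   // unnamed namespace: names declared in here have internal linkage
{
  // Hypothetical helper that only this translation unit can use
  constexpr int doubled(int x) { return 2 * x; }
}

// Ordinary function with external linkage; it uses the helper unqualified
int doubled_plus_one(int x)
{
  return doubled(x) + 1;
}
```

Another translation unit could declare and call doubled_plus_one(), but it has no way to refer to doubled().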

Nested Namespaces

You can define one namespace inside another. The mechanics of this are easiest to understand if we look at a specific context. For instance, suppose you have the following nested namespaces:

//outin.h
#ifndef OUTIN_H
#define OUTIN_H
#include <vector>
namespace outer
{
  double max(const std::vector<double>& data)
  {
    // body code...
  }
  double min(const std::vector<double>& data)
  {
   // body code...
  }
  namespace inner
  {
    void normalize(std::vector<double>& data)
    {
       // ...
       double minValue{ min(data) };   // Calls min() in outer namespace
       // ...
    }
  }
}
#endif // OUTIN_H

From within the inner namespace, the normalize() function can call the function min() in the namespace outer without qualifying the name. This is because the declaration of normalize() in the inner namespace is also within the outer namespace.

To call min() from the global namespace, you qualify the function name in the usual way:

double result{ outer::min(data) };

Of course, you could use a using declaration for the function name or specify a using directive for the namespace. To call normalize() from the global namespace, you must qualify the function name with both namespace names:

outer::inner::normalize(data);

The same applies if you include the function prototype within the namespace and supply the definition separately. You could write just the prototype of normalize() within the inner namespace and place the definition of normalize() in the file outin.cpp:

// outin.cpp
#include "outin.h"
void outer::inner::normalize(std::vector<double>& data)
{
  // ...
  double minValue{ min(data) };          // Calls min() in outer
  // ...
}

To compile this successfully, the compiler needs to know about the namespaces. Therefore, outin.h, which we #include here prior to the function definition, needs to contain the namespace declarations.
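To see these lookup rules working together, here is a self-contained sketch with the function bodies filled in. The bodies, and the function smallestAfterNormalizing(), are assumptions made purely for illustration; the listings above leave the bodies open.

```cpp
#include <vector>

namespace outer
{
  double min(const std::vector<double>& data)
  {
    double smallest {data.at(0)};                 // at() throws if data is empty
    for (double value : data)
      if (value < smallest) smallest = value;
    return smallest;
  }

  namespace inner
  {
    void normalize(std::vector<double>& data)     // Assumed behavior: shift so the minimum is 0
    {
      const double minValue {min(data)};          // Finds outer::min() without qualification
      for (double& value : data)
        value -= minValue;
    }
  }
}

double smallestAfterNormalizing(std::vector<double> data)
{
  using outer::inner::normalize;                  // using declaration for a single name
  using namespace outer;                          // using directive for the whole namespace
  normalize(data);
  return min(data);                               // Resolves to outer::min()
}
```

Without the using declaration and the using directive, the two calls in smallestAfterNormalizing() would have to be written as outer::inner::normalize(data) and outer::min(data).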

To declare or define something in the nested inner namespace, you must always nest the block for the inner namespace inside a block for the outer namespace. Here’s an example:

namespace outer
{
  namespace inner
  {
     double average(const std::vector<double>& data) { /* body code... */ }
  }
}

If you had defined the average() function without the surrounding namespace outer block, you’d have defined a new namespace called inner that sits next to outer instead of being nested inside it:

namespace inner
{
   double average(const std::vector<double>& data) { /* body code... */ }
}

In other words, the average() function would then have to be qualified as inner::average(data), instead of outer::inner::average(data).

Because defining inside nested namespaces that way can become cumbersome quite fast—especially as the number of levels grows to, say, three or more—the latest C++17 version of the language has introduced a new, more convenient syntax for this. In C++17, you can write our earlier example like this:

namespace outer::inner
{
  double average(const std::vector<double>& data) { /* body code... */ }
}
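The compact syntax names exactly the same namespace as the nested blocks do, so the two forms can be mixed freely. The sketch below assumes a helper function sum() that does not appear in the earlier listings, and it returns 0.0 for an empty vector purely as an illustrative choice.

```cpp
#include <vector>

namespace outer
{
  namespace inner
  {
    double sum(const std::vector<double>& data)   // Classic nested definition
    {
      double total {};
      for (double value : data) total += value;
      return total;
    }
  }
}

namespace outer::inner                            // C++17: extends the same namespace
{
  double average(const std::vector<double>& data)
  {
    // sum() is found without qualification because both live in outer::inner
    return data.empty() ? 0.0 : sum(data) / data.size();
  }
}
```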

Namespace Aliases

In a large program with multiple development groups, long namespace names or more deeply nested namespaces may become necessary to ensure that you don’t have accidental name clashes (although we’d probably advise you to avoid this if at all possible). Such long names may be unduly cumbersome to use; having to attach names such as Group5_Process3_Subsection2 or Group5::Process3::Subsection2 to every function call would be more than a nuisance. To get over this, you can define an alias for a namespace name on a local basis. The general form of the statement you’d use to define an alias for a namespace name is as follows:

namespace alias_name = original_namespace_name;

You can then use alias_name in place of original_namespace_name to access names within the namespace. For example, to define an alias for the namespace name in the previous paragraph, you could write this:

namespace G5P3S2 = Group5::Process3::Subsection2;

Now you can call a function within the original namespace with a statement such as this:

int maxValue {G5P3S2::max(data)};
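A namespace alias can also be defined inside a function, in which case it is visible only there. The following sketch assumes a max() function in the long namespace and a caller named largestReading(); both are hypothetical.

```cpp
#include <vector>

namespace Group5::Process3::Subsection2        // Hypothetical deeply nested namespace
{
  int max(const std::vector<int>& data)
  {
    int largest {data.at(0)};                   // at() throws if data is empty
    for (int value : data)
      if (value > largest) largest = value;
    return largest;
  }
}

int largestReading(const std::vector<int>& data)
{
  namespace G5P3S2 = Group5::Process3::Subsection2;  // Alias local to this function
  return G5P3S2::max(data);
}
```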

Logical Preprocessing Directives

The logical #if works in essentially the same way as an if statement in C++. Among other things this allows conditional inclusion of code and/or further preprocessing directives in a file, depending on whether preprocessing identifiers have been defined or based on identifiers having specific values. This is particularly useful when you want to maintain one set of code for an application that may be compiled and linked to run in different hardware or operating system environments. You can define preprocessing identifiers that specify the environment for which the code is to be compiled and select code or #include directives accordingly.

The Logical #if Directive

You have seen in the context of managing the contents of a header file that a logical #if directive can test whether a symbol has been previously defined. Suppose you put the following code in your program file:

 // Code that sets up the array data[]...
 #ifdef CALC_AVERAGE
  double average {};
  for (size_t i {}; i < std::size(data); ++i)
    average += data[i];
  average /= std::size(data);
  std::cout << "Average of data array is " << average << std::endl;
 #endif
 // rest of the program...

If the identifier CALC_AVERAGE has been defined by a previous preprocessing directive, the code between the #ifdef and #endif directives is compiled as part of the program. If CALC_AVERAGE has not been defined, the code won’t be included. You used a similar technique before to create #include guards that protect the contents of a header file from multiple inclusions into source files.
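The fragment above can be turned into a complete function along the following lines. This is only a sketch: the function name total() and the use of a std::vector instead of an array are assumptions.

```cpp
#include <iostream>
#include <vector>

#define CALC_AVERAGE                // Comment this line out to omit the averaging code

// Returns the sum of the values and, if CALC_AVERAGE is defined, reports their average
double total(const std::vector<double>& data)
{
  double sum {};
  for (double value : data)
    sum += value;
#ifdef CALC_AVERAGE
  if (!data.empty())
    std::cout << "Average of data is " << sum / data.size() << std::endl;
#endif
  return sum;
}
```

With the #define commented out, the function still compiles and returns the same sum, but the output statement is no longer part of the program.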

You can also use the #if directive, though, to test whether a constant expression is true. Let’s explore that a bit further.

Testing for Specific Identifier Values

The general form of the #if directive is as follows:

#if constant_expression

The constant_expression must be an integral constant expression that does not contain casts. Arithmetic is carried out with the values treated as the widest integer types of the implementation (std::intmax_t or std::uintmax_t since C++11; long or unsigned long before that), and the Boolean operators (||, &&, and !) are supported as well. If constant_expression evaluates to a nonzero value, the lines following the #if down to the matching #endif are included in the code to be compiled. The most common application of this uses simple comparisons to check for a particular identifier value. For example, you might have the following sequence of statements:

#if ADDR == 64
  // Code taking advantage of 64-bit addressing...
#endif

The statements between the #if directive and #endif are included in the program here only if the identifier ADDR has been defined as 64 in a previous #define directive.
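Comparisons can be combined with arithmetic and Boolean operators in a single condition. In the following sketch, the identifiers MAJOR, MINOR, and USE_WIDE_INDEX and the type alias index_type are made up for the example, and ADDR is defined in the file only to keep the sketch self-contained.

```cpp
#define ADDR 64                     // Hypothetical setting; normally supplied by the build
#define MAJOR 2
#define MINOR 7

#if ADDR == 64 && (MAJOR * 100 + MINOR) >= 205
  #define USE_WIDE_INDEX 1          // Both conditions hold for the values above
#else
  #define USE_WIDE_INDEX 0
#endif

#if USE_WIDE_INDEX
using index_type = unsigned long long;
#else
using index_type = unsigned int;
#endif
```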

Tip

There is no cross-platform macro identifier to detect whether the current target platform uses 64-bit addressing. Most compilers, however, do offer platform-specific macros that they define for you whenever they target a 64-bit platform. A concrete test that should work for the Visual C++, GCC, and Clang compilers, for instance, would look something like this:

#if _WIN64 || __x86_64__ || __ppc64__
  // Code taking advantage of 64-bit addressing...
#endif

Consult your compiler documentation for these and other predefined macro identifiers.

Multiple-Choice Code Selection

The #else directive works in the same way as the C++ else statement, in that it identifies a sequence of lines to be included in the file if the #if condition fails. This provides a choice of two blocks, one of which will be incorporated into the final source. Here’s an example:

#if ADDR == 64
  std::cout << "64-bit addressing version." << std::endl;
  // Code taking advantage of 64-bit addressing...
#else
  std::cout << "Standard 32-bit addressing version." << std::endl;
  // Code for older 32-bit processors...
#endif

One or the other of these sequences of statements will be included in the file, depending on whether or not ADDR has been defined as 64.

There is a special form of #if for multiple-choice selections. This is the #elif directive, which has the following general form:

#elif constant_expression

Here is an example of how you might use this:

#if LANGUAGE == ENGLISH
  #define Greeting "Good Morning."
#elif LANGUAGE == GERMAN
  #define Greeting "Guten Tag."
#elif LANGUAGE == FRENCH
  #define Greeting "Bonjour."
#else
  #define Greeting "Ola."
#endif
  std::cout << Greeting << std::endl;

With this sequence of directives, the output statement will display one of a number of different greetings, depending on the value assigned to LANGUAGE in a previous #define directive.

Caution

Any undefined identifiers that appear after the conditional directives #if and #elif are replaced with the number 0. This implies that, should LANGUAGE for instance not be defined in the earlier example, it may still compare equal to ENGLISH should that be either undefined or explicitly defined to be zero.
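The pitfall is easy to reproduce. In the sketch below neither LANGUAGE nor ENGLISH is defined, and the identifiers UNGUARDED_MATCH and GUARDED_MATCH are invented to record which branch was selected. Testing with the defined operator first is one possible way to avoid the accidental match.

```cpp
// Neither LANGUAGE nor ENGLISH is defined at this point, so both are replaced by 0
#if LANGUAGE == ENGLISH
  #define UNGUARDED_MATCH 1         // Selected, because 0 == 0
#else
  #define UNGUARDED_MATCH 0
#endif

// Checking that LANGUAGE is defined first avoids the accidental match
#if defined(LANGUAGE) && LANGUAGE == ENGLISH
  #define GUARDED_MATCH 1
#else
  #define GUARDED_MATCH 0           // Selected, because LANGUAGE is not defined
#endif
```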

Another possible use is to include different code depending on an identifier that represents a version number:

#if VERSION == 3
  // Code for version 3 here...
#elif VERSION == 2
  // Code for version 2 here...
#else
  // Code for original version 1 here...
#endif

This allows you to maintain a single source file that compiles to produce different versions of the program depending on how VERSION has been set in a #define directive.

Tip

Your compiler likely allows you to specify the value of preprocessing identifiers by passing a command-line argument to the compiler (if you’re using a graphical IDE, there should be a corresponding properties dialog somewhere). That way you can compile different versions or configurations of the same program without changing any code.
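When a value is expected to come from the command line, it is often convenient to give it a default in the source as well. The following sketch assumes a function called versionName(); the -D option of GCC and Clang and the /D option of Visual C++ are the usual ways of supplying the value from outside.

```cpp
// Fall back to version 1 unless the build supplies a value, for example with
// -DVERSION=2 (GCC and Clang) or /DVERSION=2 (Visual C++)
#ifndef VERSION
  #define VERSION 1
#endif

const char* versionName()
{
#if VERSION == 3
  return "version 3";
#elif VERSION == 2
  return "version 2";
#else
  return "version 1";
#endif
}
```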

Standard Preprocessing Macros

There are several standard predefined preprocessing macros, and the most useful are listed in Table 10-3.
Table 10-3. Predefined Preprocessing Macros

| Macro | Description |
| --- | --- |
| __LINE__ | The line number of the current source line as a decimal integer literal. |
| __FILE__ | The name of the source file as a character string literal. |
| __DATE__ | The date when the source file was preprocessed as a character string literal in the form Mmm dd yyyy. Here, Mmm is the month in characters, (Jan, Feb, etc.); dd is the day in the form of a pair of characters 1 to 31, where single-digit days are preceded by a blank; and yyyy is the year as four digits (such as 2014). |
| __TIME__ | The time at which the source file was compiled, as a character string literal in the form hh:mm:ss, which is a string containing the pairs of digits for hours, minutes, and seconds separated by colons. |
| __cplusplus | A number of type long that corresponds to the highest version of the C++ standard that your compiler supports. This number is of the form yyyymm, where yyyy and mm represent the year and month in which that version of the standard was approved. At the time of writing, possible values are 199711 for nonmodern C++, 201103 for C++11, 201402 for C++14, and 201703 for C++17. Compilers may use intermediate numbers to signal support for earlier drafts of the standard as well. |

Note that each of the macro names in Table 10-3 starts, and most end, with two underscore characters. The __LINE__ and __FILE__ macros expand to reference information relating to the source file. You can modify the current line number using the #line directive, and subsequent line numbers will increment from that. For example, to start line numbering at 1000, you would add this directive:

#line 1000

You can use the #line directive to change the string returned by the __FILE__ macro. It usually produces the fully qualified file name, but you can change it to whatever you like. Here’s an example:

#line 1000 "The program file"

This directive changes the line number of the next line to 1000 and alters the string returned by the __FILE__ macro to "The program file". This doesn’t alter the file name, just the string returned by the macro. Of course, if you just wanted to alter the apparent file name and leave the line numbers unaltered, the best you can do is to use the __LINE__ macro in the #line directive:

#line __LINE__ "The program file"

What exactly happens after this directive depends on the implementation. There are two possible outcomes: either the line numbers remain unaltered or they are all decremented by one (it depends on whether the value returned by __LINE__ takes the line upon which the #line directive appears into account).
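The effect of the first two forms of #line can be checked by capturing the macro values in variables. The variable names in this sketch are assumptions, chosen only to label what each macro produces.

```cpp
#line 1000
const int firstLine {__LINE__};              // 1000: the line immediately after the directive
const int secondLine {__LINE__};             // 1001

#line 50 "The program file"
const char* const reportedName {__FILE__};   // "The program file"
const int reportedLine {__LINE__};           // 51: one line after line 50
```

Any compiler message for code following these directives will also refer to the altered line numbers and file name, which is worth keeping in mind before using #line in real code.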

You can use the date and time macros to record when your program was last compiled with a statement such as this:

std::cout << "Program last compiled at " << __TIME__ << " on "<< __DATE__ << std::endl;

When this statement is compiled, the values displayed by the statement are fixed until you compile it again. Thus, the program outputs the time and date of its last compilation. These macros can be useful for use in either about screens or log files.

Testing for Available Headers

Each version of the Standard Library provides a multitude of new header files defining new features. These new features and functionalities allow you to write code that would’ve taken a lot more effort before or that would’ve been less performant or less robust. On the one hand, you therefore normally always want to use the best and latest that C++ has to offer. On the other hand, however, your code is sometimes supposed to compile and run correctly with multiple compilers—either multiple versions of the same compiler or different compilers for different target platforms. This sometimes requires a way for you to test, at compile time, what features the current compiler supports to enable or disable different versions of your code.

The __has_include() macro, recently introduced by C++17, can be used to check for the availability of any header file. Here’s an example:

#if __has_include(<SomeStandardLibraryHeader>)
  #include <SomeStandardLibraryHeader>
  // ... Definitions that use functionality of some Standard Library header ...
#elif __has_include("SomeHeader.h")
  #include "SomeHeader.h"
  // ... Alternative definitions that use functionality of SomeHeader.h ...
#else
  #error "Need at least SomeStandardLibraryHeader or SomeHeader.h"
#endif

We’re sure that you can figure out how this works from this example alone.

Tip

The __has_include() macro itself is still very new. You can check whether it is supported using the #ifdef directive, as in #ifdef __has_include.
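Combining both checks might look like the following sketch, which uses the C++17 header optional as the header to test for. The identifier HAVE_STD_OPTIONAL and the function firstOrPlaceholder() are invented for the example.

```cpp
#ifdef __has_include                       // Is the macro itself supported?
  #if __has_include(<optional>)            // C++17 Standard Library header
    #include <optional>
    #define HAVE_STD_OPTIONAL 1
  #endif
#endif

#ifndef HAVE_STD_OPTIONAL
  #define HAVE_STD_OPTIONAL 0
#endif

// Returns the first character of text, or '?' if text is empty
char firstOrPlaceholder(const char* text)
{
#if HAVE_STD_OPTIONAL
  std::optional<char> first;               // Empty until a character is found
  if (*text) first = *text;
  return first.value_or('?');
#else
  return *text ? *text : '?';              // Fallback without std::optional
#endif
}
```

Nesting the __has_include() test inside the #ifdef matters: a compiler that does not know the macro skips the inner #if without trying to evaluate it.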

Debugging Methods

Most of your programs will contain errors, or bugs, when you first complete them. There are many ways in which bugs can arise. Most simple typos will be caught by the compiler, so you’ll find these immediately. Logical errors or failing to consider all possible variations in input data will take longer to find. Debugging is the process of eliminating these errors. Debugging a program represents a substantial proportion of the total time required to develop it. The larger and more complex the program, the more bugs it’s likely to contain and the more time and effort you’ll need to make it run properly. Very large programs—operating systems, for example, or complex applications such as word processing systems or even the C++ program development system that you may be using at this moment—can be so complex that the system will never be completely bug free. You will already have some experience with this through the regular patches and updates to the operating system and some of the applications on your computer. Most bugs in this context are relatively minor and don’t limit the usability of the product greatly. The most serious bugs in commercial products tend to be security issues.

Your approach to writing a program can significantly affect how difficult it will be to test and debug. A well-structured program that consists of compact functions, each with a well-defined purpose, is much easier to test than one without these attributes. Finding bugs will also be easier with a program that has well-chosen variable and function names and comments that document the operation and purpose of its component functions. Good use of indentation and statement layout can also make testing and fault-finding simpler.

It is beyond the scope of this book to deal with debugging comprehensively. The book concentrates on the standard C++ language and library, independent of any particular C++ development system, and it’s more than likely you’ll be debugging your programs using tools that are specific to the development system you have. Nevertheless, we’ll explain some basic ideas that are common to most debugging systems. We’ll also introduce the rather elementary debugging aids within the Standard Library.

Integrated Debuggers

Many C++ compilers come with a program development environment that has extensive debugging tools built in. These potentially powerful facilities can dramatically reduce the time needed to get a program working, and if you have such a development environment, familiarizing yourself with how you use it for debugging will pay substantial dividends. Common tools include the following:
  • Tracing program flow: This allows you to execute a program by stepping through the source code one statement at a time. It continues with the next statement when you press a designated key. A program may have to be compiled in what is commonly called debug mode to make this possible. Other provisions of the debug environment usually allow you to display information about all relevant variables at each pause.

  • Setting breakpoints: Stepping through a large program one statement at a time can be tedious. It may even be impossible to step through the program in a reasonable period of time. Stepping through a loop that executes 10,000 times is an unrealistic proposition. Breakpoints identify specific statements in a program at which execution pauses to allow you to check the program state. Execution continues to the next breakpoint when you press a specified key.

  • Setting watches: A watch identifies a specific variable whose value you want to track as execution progresses. The values of variables identified by watches you have set are displayed at each pause point. If you step through your program statement by statement, you can see the exact point at which values are changed and sometimes when they unexpectedly don’t change.

  • Inspecting program elements: You can usually examine a variety of program components when execution is paused. For example, at breakpoints you can examine details of a function, such as its return type and its arguments, or information relating to a pointer, such as its location, the address it contains, and the data at that address. It is sometimes possible to access the values of expressions and to modify variables. Modifying variables can often allow problem areas to be bypassed, allowing subsequent code to be executed with correct data.

Preprocessing Directives in Debugging

Although many C++ development systems provide powerful debug facilities, adding your own tracing code can still be useful. You can use conditional preprocessing directives to include blocks of code to assist during testing and omit the code when testing is complete. You can control the formatting of data that will be displayed for debugging purposes, and you can arrange for the output to vary according to conditions or relationships within the program.

We’ll illustrate how you can use preprocessing directives to help with debugging through a somewhat contrived program. This example also gives you a chance to review a few of the techniques that you should be familiar with by now. Just for this exercise you’ll declare three functions that you’ll use in the example within a namespace, fun. First, you’ll put the namespace declaration in a header file:

// functions.h
#ifndef FUNCTIONS_H
#define FUNCTIONS_H
namespace fun
{
  // Function prototypes
  int sum(int, int);                   // Sum arguments
  int product(int, int);               // Product of arguments
  int difference(int, int);            // Difference between arguments
}
#endif

Enclosing the contents of the header file within an #include guard prevents the contents from being #included into a translation unit more than once. The prototypes are declared within the namespace, fun, so the function names are qualified with fun, and the function definitions must appear in the same namespace.

You can put the function definitions in the file functions.cpp:

// functions.cpp
//#define TESTFUNCTION                 // Uncomment to get trace output
#ifdef TESTFUNCTION
#include <iostream>                    // Only required for trace output...
#endif
#include "functions.h"
// Definition of the function sum
int fun::sum(int x, int y)
{
  #ifdef TESTFUNCTION
  std::cout << "Function sum called." << std::endl;
  #endif
  return x+y;
}
/* The definitions of the functions product() and difference() are analogous... */

You only need the iostream header because you use stream output statements to provide trace information in each function. The iostream header will be included and the output statements compiled only if the identifier TESTFUNCTION is defined in the file. TESTFUNCTION isn’t defined at present because the directive is commented out.

The main() function is in a separate .cpp file:

// Ex10_08.cpp
// Debugging using preprocessing directives
#include <iostream>
#include <cstdlib>                       // For random number generator
#include <ctime>                         // For time function
#include "functions.h"
#define TESTRANDOM
// Function to generate a random integer 0 to count-1
size_t random(size_t count)
{
  return static_cast<size_t>(std::rand() / (RAND_MAX / count + 1));
}
int main()
{
  const int a{10}, b{5};               // Some arbitrary values
  int result{};                        // Storage for results
  const size_t num_functions {3};
  std::srand(static_cast<unsigned>(std::time(nullptr))); // Seed random generator
  // Select function at random
  for (size_t i{}; i < 5; i++)
  {
    size_t select = random(num_functions); // Generate random number (0 to num_functions-1)
#ifdef TESTRANDOM
    std::cout << "Random number = " << select << ' ';
    if (select >= num_functions)
    {
      std::cout << "Invalid random number generated!" << std::endl;
      return 1;
    }
#endif
    switch (select)
    {
      case 0: result = fun::sum(a, b); break;
      case 1: result = fun::product(a, b); break;
      case 2: result = fun::difference(a, b); break;
    }
    std::cout << "result = " << result << std::endl;
  }
}

Here’s an example of the output:

Random number = 2 result = 5
Random number = 2 result = 5
Random number = 1 result = 50
Random number = 0 result = 15
Random number = 1 result = 50

In general, you should get something different. If you want to get the trace output for the functions in the namespace fun as well, you must uncomment the #define directive at the beginning of functions.cpp.

The #include directive for functions.h adds the prototypes for sum(), product(), and difference(). The functions are defined within the namespace fun. These functions are called in main() using a random number and a switch statement. The number is produced by random(). The Standard Library function rand() from cstdlib that is called in random() generates a sequence of pseudorandom numbers of type int in the range 0 to RAND_MAX, where RAND_MAX is a symbol defined as an integer in the cstdlib header. The range of values returned by rand() therefore needs to be scaled to the range of values you need. You need to take care, however, which expression you use to do this. The expression rand() % count, for instance, would work but is known to produce numbers that are distressingly nonrandom. The expression we used in Ex10_08 has been proven to fare much better (trust us, it works!), provided count is sufficiently small compared to RAND_MAX.

Caution

The rand() function in the cstdlib header does not generate random numbers that have satisfactory properties for applications that require truly random numbers (such as cryptography). It is acceptable (just) for the simplest of applications, but for any more serious use of random numbers, we recommend that you investigate the functionality provided by the random header of the Standard Library. The details of this extensive and relatively complex random number generation library are outside the scope of this book, though.
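To give a first impression of the random header, here is a minimal sketch of a replacement for the random() function of Ex10_08. The function name randomIndex() is an assumption, and count is assumed to be greater than zero. Note that the Mersenne Twister engine used here is itself not intended for cryptography either.

```cpp
#include <cstddef>
#include <random>

// Returns a random integer from 0 to count-1, each value being equally likely
std::size_t randomIndex(std::size_t count)
{
  static std::random_device seeder;                 // Supplies a seed value
  static std::mt19937 engine {seeder()};            // Mersenne Twister engine, seeded once
  std::uniform_int_distribution<std::size_t> distribution {0, count - 1};
  return distribution(engine);
}
```

Because the distribution object takes care of the scaling, there is no need for an expression involving RAND_MAX.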

You must initialize the sequence that rand() produces before the first rand() call by passing an unsigned integer seed value to srand(). Each different seed value will typically result in a different integer sequence from successive rand() calls. The time() function that is declared in the ctime header returns the number of seconds since January 1, 1970, as an integer, so using this as the argument to srand() ensures that you get a different random sequence each time the program executes.

Defining the identifier TESTRANDOM in Ex10_08.cpp switches on diagnostic output in main(). With TESTRANDOM defined, the code to output diagnostic information in main() will be included in the source that is compiled. If you remove the #define directive, the trace code will not be included. The trace code checks to make sure you use a valid number for the switch statement. Because you don’t expect to generate invalid random values, you shouldn’t get this output!

Tip

It’s easy to generate invalid values and verify the diagnostic code works. To do this, the random() function must generate a number other than 0, 1, or 2. If you add 1 to the value produced in the return statement, you should get an illegal value roughly 33 percent of the time.

If you define the TESTFUNCTION identifier in functions.cpp, you’ll get trace output from each function. This is a convenient way of controlling whether the trace statements are compiled into the program. You can see how this works by looking at one of the functions that may be called, product():

int fun::product(int x, int y)
{
#ifdef TESTFUNCTION
  std::cout << "Function product called." << std::endl;
#endif
  return x * y;
}

The output statement simply displays a message each time the function is called, but the output statement will be compiled only if TESTFUNCTION has been defined. A #define directive for a preprocessing symbol such as TESTFUNCTION is local to the source file in which it appears, so each source file that requires TESTFUNCTION to be defined needs to have its own #define directive. One way to manage this is to put all your directives that control trace and other debug output into a separate header file. You can then include this into all your .cpp files. In this way, you can alter the kind of debug output you get by making adjustments to this one header file.
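Such a header could be as small as the following sketch. The file name debug_config.h is hypothetical; the two identifiers are the ones used in Ex10_08.

```cpp
// debug_config.h (hypothetical): the one place where trace output is switched on or off
#ifndef DEBUG_CONFIG_H
#define DEBUG_CONFIG_H

#define TESTFUNCTION                 // Comment out to remove trace output from the functions
//#define TESTRANDOM                 // Uncomment to check the random numbers in main()

#endif
```

Each .cpp file would then include this header before any code that tests these identifiers, instead of containing its own #define directives.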

Of course, diagnostic code is included only while you are testing the program. Once you think the program works, you quite sensibly leave it out. Therefore, you need to be clear that this sort of code is no substitute for error detection and recovery code that deals with unfortunate situations arising in your fully tested program (as they most certainly will).

Tip

Some compilers define a specific macro if and only if the code is being compiled in debug mode. For Visual C++, for instance, that macro is _DEBUG. It can be useful to test such macro identifiers to control the inclusion of debugging statements.

Using the assert() Macro

The assert() preprocessor macro is defined in the Standard Library header cassert. This enables you to test logical expressions in your program. Including a line of the form assert(expression) results in code that causes the program to be terminated with a diagnostic message if expression evaluates to false. We can demonstrate this with this simple example:

// Ex10_09.cpp
// Demonstrating assertions
#include <iostream>
#include <cassert>
int main()
{
  int y {5};
  for (int x {}; x < 20; ++x)
  {
    std::cout << "x = " << x << " y = " << y << std::endl;
    assert(x < y);
  }
}

You should see an assertion message in the output when the value of x reaches 5. At that point x < y evaluates to false, and the assert() macro terminates the program by calling std::abort(). The abort() function is from the Standard Library, and its effect is to terminate the program immediately. The macro writes its message to the standard error stream, cerr, which by default is shown on the command line. The message contains the condition that failed and also the file name and line number in which the failure occurred. This is particularly useful with multifile programs, where the source of the error is pinpointed exactly.

Assertions are often used for critical conditions in a program where if certain conditions are not met, disaster will surely ensue. You would want to be sure that the program wouldn’t continue if such errors arise. You can use any logical expression as the argument to the assert() macro, so you have a lot of flexibility.

Using assert() is simple and effective, and when things go wrong, it provides sufficient information to pin down where the program has terminated.

Tip

Some debuggers, in particular those integrated into graphical IDEs, allow you to pause each time an assertion is triggered, right before the application terminates. This greatly increases the value of assertions during debugging sessions.

Switching Off assert() Macros

You can switch off the preprocessor assertion mechanism when you recompile the program by defining NDEBUG at the beginning of the program file:

#define NDEBUG

This causes all assertions in the translation unit to be ignored. If you add this #define at the beginning of Ex10_09.cpp, you’ll get output for all values of x from 0 to 19 and no diagnostic message. Note that this directive is effective only if it’s placed before the #include directive for cassert.
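The following sketch repackages the loop of Ex10_09 as a function so that the effect of NDEBUG can be observed from its return value. The function name completedIterations() is an assumption, and the loop limit is written as 4 * y (which is 20) so that y is still used when the assertion disappears.

```cpp
#define NDEBUG                      // Must come before the #include for cassert
#include <cassert>

// Counts how many loop iterations complete; with NDEBUG defined the
// assertion is ignored, so all 20 iterations run
int completedIterations()
{
  int count {};
  const int y {5};
  for (int x {}; x < 4 * y; ++x)
  {
    assert(x < y);                  // Expands to nothing here
    ++count;
  }
  return count;
}
```

If you remove the #define, a call of this function terminates the program as soon as x reaches 5.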

Tip

Most compilers also allow you to define macros such as NDEBUG globally for all source and header files at once (for instance by passing a command-line argument or by filling in some field in your IDE’s configuration windows). Often NDEBUG is defined that way for fully optimizing so-called “release” configurations but not for the configurations that are used during debugging. Consult your compiler’s documentation for more details.

Caution

assert() is for detecting programming errors, not for handling errors at runtime. Evaluation of the logical expression shouldn’t cause side effects or be based on something beyond the programmer’s control (such as whether opening a file succeeds). Your program should include code to handle all error conditions that might be expected to occur occasionally.
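The reason side effects are dangerous is that they disappear together with the assertion. The sketch below simulates a release build by defining NDEBUG; the variable counter and the functions advance(), riskyStep(), and safeStep() are all hypothetical.

```cpp
#define NDEBUG                       // Simulates a build in which assertions are switched off
#include <cassert>

int counter {};                      // State that is modified by the check

bool advance()
{
  ++counter;
  return true;
}

void riskyStep()
{
  assert(advance());                 // Problem: advance() is never called when NDEBUG is defined
}

void safeStep()
{
  const bool advanced {advance()};   // The side effect always happens
  assert(advanced);                  // Only the check itself disappears
  (void)advanced;                    // Avoids an unused-variable warning when NDEBUG is defined
}
```

After calling riskyStep() and then safeStep() in this configuration, counter has been incremented only once.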

Static Assertions

Static assertions, unlike the assert() macro, are part of the C++ language itself. That is, they are not a Standard Library addition but are built into the language. The assert() macro is for checking conditions dynamically, at runtime, whereas static assertions are for checking conditions statically, at compile time.

A static assertion is a statement of either of the following forms:

static_assert(constant_expression);
static_assert(constant_expression, error_message);

static_assert is a keyword, constant_expression must produce a result at compile time that can be converted to type bool, and error_message is an optional string literal. If constant_expression evaluates to false, then the compilation of your program should fail. The compiler will abort the compilation and output a diagnostic message that contains error_message if you provided it. If you did not specify an error_message, the compiler will generate one for you (usually based on constant_expression). When constant_expression is true, a static assertion does nothing.

Note

The fact that you can omit the error_message string literal in static assertions is new in C++17.

The compiler needs to be able to evaluate constant_expression during compilation. This limits the range of expressions you can use. Typical static_assert() expressions consist of literals, const variables that are initialized by literals, macros, the sizeof() operator, template arguments, and so on. A static assertion cannot, for instance, check the size() of a std::string or use the value of a function argument or any other non-const variable—such expressions can be evaluated only at runtime.

As a first example, suppose that your program does not support 32-bit compilation, for instance, because it needs to address more than 2GB of memory. Then you could put the following static assertion anywhere in your source file:

static_assert(sizeof(int*) > 4, "32-bit compilation is not supported.");  

As you know, the sizeof operator evaluates to the number of bytes that is used to represent a type or variable. For a 32-bit program, any pointer occupies 32 bits, or 4 bytes. Note that we picked int*, but any pointer type would do. Obviously the compiler knows the size of an int* pointer at compile time. Adding this static assertion will thus ensure that you cannot inadvertently compile as a 32-bit program.
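The other kinds of constant expressions listed earlier can be checked in the same way. The following is only a minimal sketch: the macro BUFFER_SIZE and the constant max_users are hypothetical names invented for illustration and do not belong to any example in this chapter.

```cpp
// Sketch: static assertions on macros, const variables, and sizeof.
// BUFFER_SIZE and max_users are hypothetical names used only for illustration.
#include <climits>    // For CHAR_BIT
#include <cstddef>    // For std::size_t

#define BUFFER_SIZE 256                    // A macro that expands to a literal

const std::size_t max_users {100};         // A const variable initialized by a literal

static_assert(BUFFER_SIZE >= 64, "BUFFER_SIZE must be at least 64.");
static_assert(max_users <= 1000, "max_users must not exceed 1000.");
static_assert(sizeof(long long) * CHAR_BIT >= 64);   // No message (allowed since C++17)
```

If you were to change max_users to, say, 5000, compilation should stop and the compiler should report the second message.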

Note

With every new edition of the C++ standard, and C++14 in particular, the range of expressions and functions that compilers should be able to evaluate at compile time is increasing. You can define all kinds of functions, variables, and even lambda expressions (C++17) that the compiler should be able to evaluate statically, simply by adding the constexpr keyword to their declaration. Naturally, such functions remain bound by certain restrictions; not everything is possible at compile time. The use of constexpr, however, is beyond the scope of this book.
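Purely to give a flavor of what this note refers to, here is a minimal sketch of a constexpr function whose result is checked by a static assertion. The function square() is a hypothetical name chosen for illustration; it is not used elsewhere in this book.

```cpp
// Sketch: a constexpr function whose result can be used in a static assertion.
// square() is a hypothetical function used only for illustration.
constexpr int square(int x)
{
  return x * x;
}

// The call below is evaluated by the compiler, not at runtime
static_assert(square(4) == 16, "square(4) should be 16.");
```

Without the constexpr keyword, the compiler would reject the static assertion because the call to square() would no longer be a constant expression.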

A common use for static assertions is in template definitions to verify the characteristics of a template parameter. Suppose that you define a function template for computing the average of elements of type T. Clearly, this is an arithmetic operation, so you want to be sure the template cannot be used with collections of non-numeric types. A static assertion can do that:

// average.h
#ifndef AVERAGE_H
#define AVERAGE_H
#include <type_traits>
#include <vector>
#include <cassert>
template<typename T>
T average(const std::vector<T>& values)
{
  static_assert(std::is_arithmetic_v<T>,
                "Type parameter for average() must be arithmetic.");  
  assert(!values.empty());    // Not possible using static_assert()!
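  T sum {};
  for (const auto& value : values)             // Elements are only read, so use a reference-to-const
    sum += value;
  return sum / static_cast<T>(values.size());  // Convert the count to T to avoid mixing signed and unsigned values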
}
#endif

A static assertion inside a function template gets evaluated each time the compiler instantiates the template with a given list of arguments, again always at compile time. At that time, the compiler knows the properties of the types assigned to type template parameters such as T, as well as the values assigned to any non-type template parameters. Static assertions pertaining to properties of types typically employ templates defined in the type_traits Standard Library header. One example of such a so-called type trait is std::is_arithmetic<T>. This type trait has a value member, which you can access as std::is_arithmetic<T>::value and which will be true if T is an arithmetic type and false otherwise. An arithmetic type is any floating-point or integral type. Since C++17, you can also write std::is_arithmetic_v<T> instead of std::is_arithmetic<T>::value.
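To see the two spellings of this type trait side by side, consider the following short sketch. It only uses std::is_arithmetic from the type_traits header; the particular types tested were picked for illustration.

```cpp
// Sketch: the ::value member and the _v shorthand of a type trait.
#include <string>
#include <type_traits>

static_assert(std::is_arithmetic<int>::value);        // Member form
static_assert(std::is_arithmetic_v<double>);          // C++17 shorthand for the same test
static_assert(std::is_arithmetic_v<char>);            // char is an integral type
static_assert(!std::is_arithmetic_v<std::string>);    // A class type is not arithmetic
static_assert(!std::is_arithmetic_v<int*>);           // Neither is a pointer type
```

All five assertions hold, so this code compiles without producing any diagnostics.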

Note

The type_traits header contains a large number of type testing templates including is_integral_v<T>, is_signed_v<T>, is_unsigned_v<T>, is_floating_point_v<T>, and is_enum_v<T>. There are many other useful templates in the type_traits header, and it is well worth exploring the contents further, especially once you have learned about class templates in Chapter 16.

You cannot statically assert, however, that the given values vector must be nonempty. After all, the same instance of this function template may be called many times, each time with vectors of different size. In general, the compiler has no way of knowing whether this vector will be empty or not—the only way to know this is to check the size during the execution of your program. In other words, asserting this condition is clearly a job for the assert() macro!
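The same division of labor applies to the other type traits. As a minimal sketch, the hypothetical function template remainder_of() below (a name invented for illustration) checks its type parameter at compile time with std::is_integral_v and checks its argument value at runtime with assert().

```cpp
// Sketch: a compile-time check on the type plus a runtime check on the value.
// remainder_of() is a hypothetical function used only for illustration.
#include <cassert>
#include <type_traits>

template <typename T>
T remainder_of(T value, T divisor)
{
  static_assert(std::is_integral_v<T>,
                "Type parameter for remainder_of() must be integral.");
  assert(divisor != 0);     // Only known at runtime, so not a job for static_assert()
  return value % divisor;
}
```

A call such as remainder_of(7, 3) instantiates the template for int and compiles normally, whereas remainder_of(7.5, 2.0) should be rejected at compile time, with the message above among the compiler's diagnostics.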

You can use the following program to see static assertions in action:

// Ex10_10.cpp
// Using a static assertion
#include <vector>
#include <iostream>
#include <string>
#include "average.h"
int main()
{
   std::vector<double> vectorData {1.5, 2.5, 3.5, 4.5};
   std::cout << "The average of vectorData is " << average(vectorData) << std::endl;
// Uncomment the next lines for compiler errors...
//   std::vector<std::string> words {"this", "that", "them", "those"};
//   std::cout << "The average of words values is " << average(words) << std::endl;
   std::vector<float> emptyVector;
   average(emptyVector);                // Will trigger a runtime assertion!
}

In case you were wondering, yes, even if we had not added the static_assert() statement, the average() template would still fail to instantiate for nonarithmetic types such as std::string. After all, you cannot divide a string by its size. Depending on your compiler, the error message you will then get may be rather cryptic. To see the difference, we invite you to remove the static assertion from the average() template, uncomment the two lines of the test program, and then recompile. Static assertions like this are thus added sometimes not so much to make the compilation fail but rather to provide more helpful diagnostic messages.

Summary

This chapter discussed capabilities that operate between, within, and across program files. C++ programs typically consist of many files, and the larger the program, the more files you have to contend with. It’s vital that you really understand namespaces, preprocessing, and debugging techniques if you are to develop real-world C++ programs.

The important points from this chapter include the following:

  • Each entity in a translation unit must have only one definition. If multiple definitions are allowed throughout a program, they must still all be identical.

  • A name can have internal linkage, meaning that the name is accessible throughout a translation unit; external linkage, meaning that the name is accessible from any translation unit; or it can have no linkage, meaning that the name is accessible only in the block in which it is defined.

  • You use header files to contain definitions and declarations required by your source files. A header file can contain template and type definitions, enumerations, constants, function declarations, inline function and variable definitions, and named namespaces. By convention, header file names use the extension .h.

  • Your source files will typically contain the definitions for all non-inline functions and variables declared in the corresponding header. A C++ source file usually has the file name extension .cpp.

  • You insert the contents of a header file into a .cpp file by using an #include directive.

  • A .cpp file is the basis for a translation unit that is processed by the compiler to generate an object file.

  • A namespace defines a scope; all names declared within this scope have the namespace name attached to them. All declarations of names that are not in an explicit namespace scope are in the global namespace.

  • A single namespace can be made up of several separate namespace declarations with the same name.

  • Identical names that are declared within different namespaces are distinct.

  • To refer to an identifier that is declared within a namespace from outside the namespace, you need to specify the namespace name and the identifier, separated by the scope resolution operator, ::.

  • Names declared within a namespace can be used without qualification from inside the namespace.

  • The preprocessing phase executes directives to transform the source code in a translation unit prior to compilation. When all directives have been processed, the translation unit will only contain C++ code, with no preprocessing directives remaining.

  • You can use conditional preprocessing directives to ensure that the contents of a header file are never duplicated within a translation unit.

  • You can use conditional preprocessing directives to control whether trace or other diagnostic debug code is included in your program.

  • The assert() macro enables you to test logical conditions during execution and issue a message and abort the program if the logical condition is false.

  • You can use static_assert to check type arguments for template parameters in a template instance to ensure that a type argument is consistent with the template definition.
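Several of these points can be seen together in one short sketch. The namespace names geometry and drawing and the identifier TRACE_CALLS are hypothetical and serve only as an illustration; in a real program the declarations would normally live in a header file and the definitions in a source file.

```cpp
// Sketch of several summary points in a single file.
// geometry, drawing, and TRACE_CALLS are hypothetical names.
#include <cassert>
#include <iostream>

// #define TRACE_CALLS            // Uncomment to compile the trace output in

namespace geometry                // First declaration of the namespace
{
  int area(int width, int height);
}

namespace geometry                // A second declaration adds to the same namespace
{
  int area(int width, int height)
  {
    assert(width >= 0 && height >= 0);   // Detects a programming error at runtime
#ifdef TRACE_CALLS
    std::cout << "geometry::area() called" << std::endl;
#endif
    return width * height;
  }

  int square_area(int side)
  {
    return area(side, side);      // No qualification needed inside the namespace
  }
}

namespace drawing
{
  int area(int side)              // Distinct from geometry::area()
  {
    return geometry::area(side, side);   // Qualified with :: from outside geometry
  }
}
```

From main() you would call these functions as geometry::area(3, 4) or drawing::area(2); the two area() functions do not clash because they belong to different namespaces.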

Exercises

The following exercises enable you to try what you’ve learned in this chapter. If you get stuck, look back over the chapter for help. If you’re still stuck after that, you can download the solutions from the Apress website (www.apress.com/source-code), but that really should be a last resort.

  • Exercise 10-1. Write a program that calls two functions, print_this(std::string_view) and print_that(std::string_view), each of which calls a third function, print(std::string_view), to print the string that is passed to it. Define each function and main() in separate source files, and create three header files to contain the prototypes for print_this(), print_that(), and print().

  • Exercise 10-2. Modify the program from Exercise 10-1 so that print() uses a global integer variable to count the number of times it has been called. Output the value of this variable in main() after calls to print_this() and print_that().

  • Exercise 10-3. In the print.h header file from Exercise 10-1, delete the existing prototype for print(), and instead create two namespaces, print1 and print2, each of which contains a print(string_view) function. Implement both functions in the print.cpp file so that they print the namespace name and the string. Change print_this() so that it calls print() defined in the print1 namespace, and change print_that() to call the version in the print2 namespace. Run the program, and verify that the correct functions are called.

  • Exercise 10-4. Modify the main() function from the previous exercise so that print_this() is called only if a DO_THIS preprocessing identifier is defined. When this is not the case, print_that() should be called.