Mastering C++ Multithreading by Maya Posch. Published by Packt Publishing, 2017.

Visual C++

For Microsoft's MSVC compiler there are the interlocked functions, summarized here from the MSDN documentation, starting with the addition functions:

| Interlocked function | Description |
| --- | --- |
| InterlockedAdd | Performs an atomic addition operation on the specified LONG values. |
| InterlockedAddAcquire | Performs an atomic addition operation on the specified LONG values. The operation is performed with acquire memory ordering semantics. |
| InterlockedAddRelease | Performs an atomic addition operation on the specified LONG values. The operation is performed with release memory ordering semantics. |
| InterlockedAddNoFence | Performs an atomic addition operation on the specified LONG values. The operation is performed atomically, but without using memory barriers (covered in this chapter). |

These are the 32-bit versions, which operate on LONG values. The API also provides 64-bit versions of these and other functions. Atomic functions tend to target one specific variable type; those per-type variations have been left out of this summary to keep it brief.

We can also see the acquire and release variations. These guarantee that the operation is protected from memory reordering (on a hardware level) in one direction: with acquire semantics, memory accesses that follow the operation cannot be moved before it, and with release semantics, memory accesses that precede the operation cannot be moved after it. Finally, the no fence variation performs the operation atomically, but without any fence (another name for a memory barrier).

Normally, CPUs execute instructions (including memory reads and writes) out of order to optimize performance. Since this behavior is not always desirable, memory barriers were added to prevent reordering across specific points in the program.
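
The acquire and release semantics described above can be illustrated with a minimal sketch. It assumes a C++11 compiler and uses `std::atomic` as a portable stand-in for the Acquire and Release variations, rather than calling the interlocked functions themselves. The names `payload`, `ready`, `producer`, `consumer`, and `run_demo` are hypothetical and exist only for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain, non-atomic data (hypothetical name)
std::atomic<bool> ready{false};  // flag that publishes the payload

void producer() {
    payload = 42;  // ordinary write
    // Release: the write to payload above cannot be moved after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire: the read of payload below cannot be moved before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // sees the value written before the release store
}

// Runs the writer and the reader on separate threads.
int run_demo() {
    int seen = 0;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

Without the release and acquire pairing, the reader could in principle observe the flag as set while still reading a stale payload, which is the kind of reordering that memory barriers exist to prevent.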

Next is the atomic AND feature:

| Interlocked function | Description |
| --- | --- |
| InterlockedAnd | Performs an atomic AND operation on the specified LONG values. |
| InterlockedAndAcquire | Performs an atomic AND operation on the specified LONG values. The operation is performed with acquire memory ordering semantics. |
| InterlockedAndRelease | Performs an atomic AND operation on the specified LONG values. The operation is performed with release memory ordering semantics. |
| InterlockedAndNoFence | Performs an atomic AND operation on the specified LONG values. The operation is performed atomically, but without using memory barriers. |

The bit-test features are as follows:

| Interlocked function | Description |
| --- | --- |
| InterlockedBitTestAndComplement | Tests the specified bit of the specified LONG value and complements it. |
| InterlockedBitTestAndResetAcquire | Tests the specified bit of the specified LONG value and sets it to 0. The operation is atomic, and it is performed with acquire memory ordering semantics. |
| InterlockedBitTestAndResetRelease | Tests the specified bit of the specified LONG value and sets it to 0. The operation is atomic, and it is performed with release memory ordering semantics. |
| InterlockedBitTestAndSetAcquire | Tests the specified bit of the specified LONG value and sets it to 1. The operation is atomic, and it is performed with acquire memory ordering semantics. |
| InterlockedBitTestAndSetRelease | Tests the specified bit of the specified LONG value and sets it to 1. The operation is atomic, and it is performed with release memory ordering semantics. |
| InterlockedBitTestAndReset | Tests the specified bit of the specified LONG value and sets it to 0. |
| InterlockedBitTestAndSet | Tests the specified bit of the specified LONG value and sets it to 1. |
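
The "test, then modify" behavior of the bit-test functions can be sketched in portable C++. This is only an approximation under stated assumptions: `std::atomic` has no direct bit-test member, so the sketch builds one from `fetch_or`, `fetch_and`, and `fetch_xor`, and the helper names `test_and_set_bit`, `test_and_reset_bit`, and `test_and_complement_bit` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Each helper returns whether the bit was set BEFORE the call.
// `bit` is assumed to be in the range 0..31.

// Atomically sets the bit to 1.
bool test_and_set_bit(std::atomic<std::uint32_t>& word, unsigned bit) {
    const std::uint32_t mask = std::uint32_t{1} << bit;
    return (word.fetch_or(mask) & mask) != 0;
}

// Atomically sets the bit to 0.
bool test_and_reset_bit(std::atomic<std::uint32_t>& word, unsigned bit) {
    const std::uint32_t mask = std::uint32_t{1} << bit;
    return (word.fetch_and(~mask) & mask) != 0;
}

// Atomically flips the bit.
bool test_and_complement_bit(std::atomic<std::uint32_t>& word, unsigned bit) {
    const std::uint32_t mask = std::uint32_t{1} << bit;
    return (word.fetch_xor(mask) & mask) != 0;
}
```

Because the old bit value and the modification come from a single atomic operation, a thread that sees `false` from `test_and_set_bit` knows it was the one that set the bit, which makes this pattern usable for claiming a slot in a bitmap.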

 

The compare-and-exchange features are as follows:

| Interlocked function | Description |
| --- | --- |
| InterlockedCompareExchange | Performs an atomic compare-and-exchange operation on the specified values. The function compares two specified 32-bit values and exchanges with another 32-bit value based on the outcome of the comparison. |
| InterlockedCompareExchangeAcquire | Performs an atomic compare-and-exchange operation on the specified values. The function compares two specified 32-bit values and exchanges with another 32-bit value based on the outcome of the comparison. The operation is performed with acquire memory ordering semantics. |
| InterlockedCompareExchangeRelease | Performs an atomic compare-and-exchange operation on the specified values. The function compares two specified 32-bit values and exchanges with another 32-bit value based on the outcome of the comparison. The exchange is performed with release memory ordering semantics. |
| InterlockedCompareExchangeNoFence | Performs an atomic compare-and-exchange operation on the specified values. The function compares two specified 32-bit values and exchanges with another 32-bit value based on the outcome of the comparison. The operation is performed atomically, but without using memory barriers. |
| InterlockedCompareExchangePointer | Performs an atomic compare-and-exchange operation on the specified pointer values. The function compares two specified pointer values and exchanges with another pointer value based on the outcome of the comparison. |
| InterlockedCompareExchangePointerAcquire | Performs an atomic compare-and-exchange operation on the specified pointer values. The function compares two specified pointer values and exchanges with another pointer value based on the outcome of the comparison. The operation is performed with acquire memory ordering semantics. |
| InterlockedCompareExchangePointerRelease | Performs an atomic compare-and-exchange operation on the specified pointer values. The function compares two specified pointer values and exchanges with another pointer value based on the outcome of the comparison. The operation is performed with release memory ordering semantics. |
| InterlockedCompareExchangePointerNoFence | Performs an atomic compare-and-exchange operation on the specified pointer values. The function compares two specified pointer values and exchanges with another pointer value based on the outcome of the comparison. The operation is performed atomically, but without using memory barriers. |
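
Compare-and-exchange is typically used in a retry loop: read the current value, compute a new one, and write it back only if nobody changed the value in between. The following is a minimal sketch of that loop using `std::atomic` as a portable stand-in, not the interlocked API itself. One difference to keep in mind is that `InterlockedCompareExchange` returns the initial value of the destination, whereas `compare_exchange_weak` returns a `bool` and updates its first argument on failure. The function name `atomic_store_max` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Raises `target` to `candidate` if `candidate` is larger.
// Returns true if this call changed the stored value.
bool atomic_store_max(std::atomic<long>& target, long candidate) {
    long observed = target.load();
    while (observed < candidate) {
        // The exchange succeeds only if target still equals `observed`.
        // On failure, `observed` is reloaded with the current value and
        // the loop condition is checked again.
        if (target.compare_exchange_weak(observed, candidate)) {
            return true;
        }
    }
    return false;  // another value at least as large is already stored
}
```

The loop never blocks: a failed exchange just means another thread got there first, and the next iteration works with the fresher value.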

 

The decrement features are:

| Interlocked function | Description |
| --- | --- |
| InterlockedDecrement | Decrements (decreases by one) the value of the specified 32-bit variable as an atomic operation. |
| InterlockedDecrementAcquire | Decrements (decreases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed with acquire memory ordering semantics. |
| InterlockedDecrementRelease | Decrements (decreases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed with release memory ordering semantics. |
| InterlockedDecrementNoFence | Decrements (decreases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed atomically, but without using memory barriers. |

The exchange (swap) features are:

| Interlocked function | Description |
| --- | --- |
| InterlockedExchange | Sets a 32-bit variable to the specified value as an atomic operation. |
| InterlockedExchangeAcquire | Sets a 32-bit variable to the specified value as an atomic operation. The operation is performed with acquire memory ordering semantics. |
| InterlockedExchangeNoFence | Sets a 32-bit variable to the specified value as an atomic operation. The operation is performed atomically, but without using memory barriers. |
| InterlockedExchangePointer | Atomically exchanges a pair of pointer values. |
| InterlockedExchangePointerAcquire | Atomically exchanges a pair of pointer values. The operation is performed with acquire memory ordering semantics. |
| InterlockedExchangePointerNoFence | Atomically exchanges a pair of addresses. The operation is performed atomically, but without using memory barriers. |
| InterlockedExchangeSubtract | Performs an atomic subtraction of two values. |
| InterlockedExchangeAdd | Performs an atomic addition of two 32-bit values. |
| InterlockedExchangeAddAcquire | Performs an atomic addition of two 32-bit values. The operation is performed with acquire memory ordering semantics. |
| InterlockedExchangeAddRelease | Performs an atomic addition of two 32-bit values. The operation is performed with release memory ordering semantics. |
| InterlockedExchangeAddNoFence | Performs an atomic addition of two 32-bit values. The operation is performed atomically, but without using memory barriers. |
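
What makes the exchange family useful is that each call hands back the value that was stored before the operation. A minimal sketch of two common uses follows, again with `std::atomic` as a portable stand-in rather than the interlocked functions themselves: `fetch_add` plays the role of an exchange-add, and `exchange` the role of a plain exchange. The function names `take_ticket` and `drain_pending` are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hands out consecutive ticket numbers. fetch_add returns the value held
// BEFORE the addition, so no two callers ever receive the same ticket.
long take_ticket(std::atomic<long>& next_ticket) {
    return next_ticket.fetch_add(1);
}

// Reads the number of pending items and resets it to zero in one step.
// exchange returns the previous value, so no update is lost between
// the read and the reset.
long drain_pending(std::atomic<long>& pending) {
    return pending.exchange(0);
}
```

Doing the same with a separate read followed by a separate write would leave a window in which another thread could modify the variable unnoticed.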

The increment features are:

| Interlocked function | Description |
| --- | --- |
| InterlockedIncrement | Increments (increases by one) the value of the specified 32-bit variable as an atomic operation. |
| InterlockedIncrementAcquire | Increments (increases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed with acquire memory ordering semantics. |
| InterlockedIncrementRelease | Increments (increases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed with release memory ordering semantics. |
| InterlockedIncrementNoFence | Increments (increases by one) the value of the specified 32-bit variable as an atomic operation. The operation is performed atomically, but without using memory barriers. |
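
Atomic increment and decrement are the usual building blocks for reference counting. The following is a minimal sketch under stated assumptions: it uses `std::atomic` as a portable stand-in for the interlocked increment and decrement functions, and the class name `RefCount` with its members `acquire`, `release`, and `value` is hypothetical. Note that `fetch_add` and `fetch_sub` return the value from before the operation, whereas `InterlockedIncrement` and `InterlockedDecrement` return the resulting value.

```cpp
#include <atomic>
#include <cassert>

class RefCount {
public:
    // Adds one owner. No ordering is needed beyond atomicity here.
    void acquire() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Removes one owner. Returns true when the caller was the last owner
    // and is therefore responsible for cleaning up the shared object.
    bool release() {
        // fetch_sub returns the previous count; 1 means it is now 0.
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long value() const { return count_.load(); }

private:
    std::atomic<long> count_{1};  // the creator holds the first reference
};
```

Since the decrement and the check happen in one atomic step, exactly one thread sees the count reach zero, even when several threads release at the same time.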

 

The OR feature:

| Interlocked function | Description |
| --- | --- |
| InterlockedOr | Performs an atomic OR operation on the specified LONG values. |
| InterlockedOrAcquire | Performs an atomic OR operation on the specified LONG values. The operation is performed with acquire memory ordering semantics. |
| InterlockedOrRelease | Performs an atomic OR operation on the specified LONG values. The operation is performed with release memory ordering semantics. |
| InterlockedOrNoFence | Performs an atomic OR operation on the specified LONG values. The operation is performed atomically, but without using memory barriers. |

Finally, the exclusive OR (XOR) features are:

| Interlocked function | Description |
| --- | --- |
| InterlockedXor | Performs an atomic XOR operation on the specified LONG values. |
| InterlockedXorAcquire | Performs an atomic XOR operation on the specified LONG values. The operation is performed with acquire memory ordering semantics. |
| InterlockedXorRelease | Performs an atomic XOR operation on the specified LONG values. The operation is performed with release memory ordering semantics. |
| InterlockedXorNoFence | Performs an atomic XOR operation on the specified LONG values. The operation is performed atomically, but without using memory barriers. |
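
The atomic AND, OR, and XOR operations are most often applied to a shared word of status flags. As a minimal sketch, assuming a C++11 compiler and using `std::atomic` as a portable stand-in for the interlocked functions, the three operations map onto clearing, setting, and toggling flags. The flag constants and the function names `set_flags`, `clear_flags`, and `toggle_flags` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical status flags, one bit each.
constexpr std::uint32_t kRunning = 1u << 0;
constexpr std::uint32_t kPaused  = 1u << 1;
constexpr std::uint32_t kDirty   = 1u << 2;

// Each function returns the flag word as it was BEFORE the operation.

// OR sets every bit present in `mask`.
std::uint32_t set_flags(std::atomic<std::uint32_t>& flags, std::uint32_t mask) {
    return flags.fetch_or(mask);
}

// AND with the inverted mask clears every bit present in `mask`.
std::uint32_t clear_flags(std::atomic<std::uint32_t>& flags, std::uint32_t mask) {
    return flags.fetch_and(~mask);
}

// XOR flips every bit present in `mask`.
std::uint32_t toggle_flags(std::atomic<std::uint32_t>& flags, std::uint32_t mask) {
    return flags.fetch_xor(mask);
}
```

Each call changes only the bits named in the mask, so threads that update different flags in the same word do not overwrite each other's changes.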