Modern X86 Assembly Language Programming: Covers x86 64-bit, AVX, AVX2, and AVX-512

Appendix A

Appendix A includes supplemental material about the following items:

Software utilities for x86 processors
Visual Studio
References

Software Utilities for x86 Processors

The following utilities can be used to determine which x86 instruction set extensions are supported by the processor in your computer:

CPUID CPU-Z (https://www.cpuid.com)
HWiNFO Diagnostic Software (https://www.hwinfo.com)
Piriform SPECCY (https://www.ccleaner.com/speccy)
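The same information can also be queried programmatically. The following is a minimal sketch, not code from the book's examples: it reads a few CPUID feature flags and decodes them into names. The function and structure names (QueryCpuid, DecodeFeatureFlags, ReportedExtensions, CpuidRegs) are hypothetical. The sketch assumes an x86 build with either Visual C++ (__cpuidex) or GCC/Clang (__get_cpuid_count). Note that it reports only what the processor advertises; a complete check for AVX, AVX2, or AVX-512 must also confirm operating system support (OSXSAVE and xgetbv), which is omitted here.

```cpp
#include <cstdint>
#include <string>
#include <vector>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>     // __cpuidex (Visual C++)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      // __get_cpuid_count (GCC and Clang)
#endif

// Register values returned by a single CPUID query.
struct CpuidRegs
{
    std::uint32_t eax, ebx, ecx, edx;
};

// Executes CPUID for the given leaf and sub-leaf.
// Returns all zeros if the query is unsupported or the build is not x86.
inline CpuidRegs QueryCpuid(std::uint32_t leaf, std::uint32_t subleaf)
{
    CpuidRegs r = { 0, 0, 0, 0 };
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4];
    __cpuidex(regs, static_cast<int>(leaf), static_cast<int>(subleaf));
    r.eax = static_cast<std::uint32_t>(regs[0]);
    r.ebx = static_cast<std::uint32_t>(regs[1]);
    r.ecx = static_cast<std::uint32_t>(regs[2]);
    r.edx = static_cast<std::uint32_t>(regs[3]);
#elif defined(__x86_64__) || defined(__i386__)
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (__get_cpuid_count(leaf, subleaf, &a, &b, &c, &d))
    {
        r.eax = a; r.ebx = b; r.ecx = c; r.edx = d;
    }
#else
    (void)leaf; (void)subleaf;
#endif
    return r;
}

// Decodes selected feature bits from CPUID leaf 1 (ECX) and leaf 7, sub-leaf 0 (EBX).
inline std::vector<std::string> DecodeFeatureFlags(std::uint32_t leaf1Ecx, std::uint32_t leaf7Ebx)
{
    auto bitSet = [](std::uint32_t value, int bit) { return ((value >> bit) & 1u) != 0; };
    std::vector<std::string> names;

    if (bitSet(leaf1Ecx, 12)) names.push_back("FMA");
    if (bitSet(leaf1Ecx, 28)) names.push_back("AVX");
    if (bitSet(leaf7Ebx, 5))  names.push_back("AVX2");
    if (bitSet(leaf7Ebx, 16)) names.push_back("AVX512F");
    if (bitSet(leaf7Ebx, 17)) names.push_back("AVX512DQ");
    if (bitSet(leaf7Ebx, 28)) names.push_back("AVX512CD");
    if (bitSet(leaf7Ebx, 30)) names.push_back("AVX512BW");
    if (bitSet(leaf7Ebx, 31)) names.push_back("AVX512VL");
    return names;
}

// Returns the extensions advertised by the processor that runs this code.
inline std::vector<std::string> ReportedExtensions()
{
    CpuidRegs leaf0 = QueryCpuid(0, 0);     // EAX = highest supported basic leaf
    CpuidRegs leaf1 = QueryCpuid(1, 0);
    CpuidRegs leaf7 = { 0, 0, 0, 0 };
    if (leaf0.eax >= 7)
        leaf7 = QueryCpuid(7, 0);
    return DecodeFeatureFlags(leaf1.ecx, leaf7.ebx);
}
```

Calling ReportedExtensions() from main and printing each returned string yields a list that can be compared against the output of the utilities above.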

Visual Studio

In this section, you’ll learn how to use Microsoft’s Visual Studio development tool to run the source code examples that are described in the main text. You’ll also learn how to create a simple Visual Studio C++ project. Before proceeding, you may want to refer to the Introduction for additional information regarding Visual Studio and the recommended hardware platforms for running the source code examples. The Introduction also contains important details about downloading the source code ZIP files for each chapter.

Visual Studio uses logical entities called solutions and projects to help simplify application development. A solution is a collection of one or more projects that are used to build an application. A project is a container object that organizes an application’s files. A Visual Studio project is usually created for each buildable component of an application (e.g., executable file, dynamic-link library, static library, and so on).

A standard Visual Studio C++ project includes two solution configurations named Debug and Release. As implied by their names, these configurations support separate executable builds for initial development and final release. A standard Visual Studio C++ project also incorporates solution platforms. The default solution platforms are named Win32 and x64, which contain the necessary settings to build 32-bit and 64-bit executables, respectively. The Visual Studio solution and project files for this book’s source code examples include only the x64 platform.
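A program can report which configuration and platform it was built for, which is sometimes handy when comparing benchmark results. The sketch below is an illustration only; the function names are hypothetical. It assumes the default Visual C++ project settings, where Debug builds define _DEBUG and Release builds define NDEBUG, and it infers the platform from the pointer size.

```cpp
#include <string>

// Returns "Debug" or "Release" based on the macros that the default
// Visual C++ project configurations define (an assumption of this sketch).
inline std::string BuildConfiguration()
{
#if defined(_DEBUG) || !defined(NDEBUG)
    return "Debug";
#else
    return "Release";
#endif
}

// Returns "x64" for 64-bit builds and "Win32" for 32-bit builds,
// using the Visual Studio platform names as labels.
inline std::string BuildPlatform()
{
    return (sizeof(void*) == 8) ? "x64" : "Win32";
}

// Combines both values, e.g., "Release|x64".
inline std::string DescribeBuild()
{
    return BuildConfiguration() + "|" + BuildPlatform();
}
```

For the book's solution files, which include only the x64 platform, DescribeBuild() would be expected to return either "Debug|x64" or "Release|x64".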

Running a Source Code Example

You can use the following steps to run any of the book’s source code examples:

1. Using File Explorer, double-click on the chapter’s Visual Studio solution (.sln) file. The solution file is included in the chapter source code ZIP file.
2. From the menu bar, select Build | Configuration Manager. In the Configuration Manager dialog box, set Active Solution Configuration to Release. Then set Active Solution Platform to x64. Note that these options may already be selected.
3. If necessary, select View | Solution Explorer to open the Solution Explorer window.
4. In the Solution Explorer window, right-click on a project to run and choose Set as StartUp Project.
5. Select Debug | Start Without Debugging to run the program.

Some of the source code examples reference data files in different folders using fixed path names. To run the corresponding executables using a different folder structure than the one used for Visual Studio development, you may need to change the path name strings in the C++ source code.
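One way to reduce this kind of editing is to keep the folder name in a single place and build each file name from it. The following sketch shows the idea; it is not taken from the book's source code, and the function name JoinPath and the folder and file names in the usage note are hypothetical.

```cpp
#include <string>

// Joins a folder name and a file name, inserting a separator only when
// the folder does not already end with one. Windows file functions
// accept both '/' and '\\' as separators.
inline std::string JoinPath(const std::string& folder, const std::string& fileName)
{
    if (folder.empty())
        return fileName;

    char last = folder.back();
    if (last == '/' || last == '\\')
        return folder + fileName;

    return folder + "/" + fileName;
}
```

With this helper, a program could define one string such as "..\\Data" and call JoinPath(dataFolder, "Image.bmp") wherever a data file is opened, so that a different folder structure requires only a single change.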

Creating a Visual Studio C++ Project

In this section, you’ll learn how to create a simple Visual Studio project that includes both C++ and assembly language source code files. The ensuing paragraphs describe the same basic procedure that was used to create the source code examples in the main text. The procedure includes the following phases:

Create a C++ project
Enable MASM support
Add an assembly language file
Set project properties
Edit the source code
Build and run the project

Create a C++ Project

Use the following steps to create a Visual Studio C++ project:

1. Start Visual Studio.
2. Select File | New Project.
3. In the New Project dialog box control tree, select Installed | Visual C++ | Windows Desktop.
4. Select Windows Console Application for the project type.
5. In the Name text box, enter Example1.
6. In the Location text box, enter a folder name for the project location. You can also use the Browse button to choose a folder or leave the text unchanged to use the default location.
7. In the Solution text box, enter TestSolution.
8. Verify that the New Project dialog box settings are the same as the ones shown in Figure A-1 (the Location can be different). Click OK.
9. If necessary, select View | Solution Explorer to open the Solution Explorer window.
10. In the Solution Explorer tree control, right-click on the top-level text that’s labeled Solution ‘Example1’ (1 Project) and select Rename. Change the solution name to TestSolution.
11. Select Build | Configuration Manager. In the Configuration Manager dialog box, choose <Edit…> under Active Solution Platforms (see Figure A-2).
12. In the Edit Solution Platforms dialog box, select x86 and click Remove (see Figure A-3). Click Close to close the Edit Solution Platforms dialog box; click Close to close the Configuration Manager dialog box.

Figure A-1. New Project dialog box

Figure A-2. Configuration Manager dialog box

Figure A-3. Edit Solution Platforms dialog box

Enable MASM Support

Use the following steps to enable support for Microsoft Macro Assembler:

1. In the Solution Explorer tree control, right-click on Example1 and select Build Dependencies | Build Customizations.
2. In the Visual C++ Build Customizations dialog box, check masm(.targets, .props).
3. Click OK.

Add an Assembly Language File

Use the following steps to add an assembly language source code file ( .asm ) to a Visual Studio C++ project:

1. In the Solution Explorer tree control, right-click on Example1 and select Add | New Item.
2. Select C++ File (.cpp) for the file type.
3. In the Name text box, change the name to Example1_.asm, as shown in Figure A-4. Note that the trailing underscore is required since all C++ and assembly language source code files in a project must have a unique base name.
4. Click Add.

Figure A-4. Add New Item dialog box
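The unique base name requirement exists presumably because Example1.cpp and Example1.asm would both produce an object file named Example1.obj in the same intermediate folder. The following sketch, which is only an illustration with hypothetical function names, shows how a list of source file names could be checked for such clashes. It compares base names without regard to case, since Windows file names are normally case-insensitive.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Returns the lowercase base name of a file: no folder and no extension.
inline std::string BaseNameLower(const std::string& fileName)
{
    std::string::size_type slash = fileName.find_last_of("/\\");
    std::string::size_type start = (slash == std::string::npos) ? 0 : slash + 1;
    std::string::size_type dot = fileName.find_last_of('.');
    std::string::size_type end =
        (dot == std::string::npos || dot < start) ? fileName.size() : dot;

    std::string base = fileName.substr(start, end - start);
    std::transform(base.begin(), base.end(), base.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return base;
}

// Returns every base name that is shared by two or more files.
inline std::vector<std::string> FindBaseNameCollisions(const std::vector<std::string>& files)
{
    std::map<std::string, int> counts;
    for (const std::string& f : files)
        ++counts[BaseNameLower(f)];

    std::vector<std::string> collisions;
    for (const auto& entry : counts)
    {
        if (entry.second > 1)
            collisions.push_back(entry.first);
    }
    return collisions;
}
```

For the file names Example1.cpp and Example1.asm, the function reports a collision on example1; for Example1.cpp and Example1_.asm, it reports none.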

Set Project Properties

Use the following steps to set the project’s properties. The properties that control listing file generation (Steps 5 through 7) are optional.

1. In the Solution Explorer tree control, right-click on Example1 and select Properties.
2. In the Property Pages dialog box, change the Configuration setting to All Configurations and the Platform setting to All Platforms. Note that one or both options may already be set.
3. In the tree control, select Configuration Properties | General. Change the setting Whole Program Optimization to No Whole Program Optimization (see Figure A-5).
4. Select Configuration Properties | C/C++ | Code Generation. Change the setting Enable Enhanced Instruction Set to Advanced Vector Extensions (/arch:AVX) (see Figure A-6).
5. Select Configuration Properties | C/C++ | Output Files. Change the setting Assembler Output to Assembly Machine and Source Code (/FAcs) (see Figure A-7).
6. Select Configuration Properties | Microsoft Macro Assembler | Listing File. Change the setting Enable Assembly Generated Code Listing to Yes (/Sg) (see Figure A-8).
7. Change the Assembled Code Listing File text field to $(IntDir)\%(filename).lst (see Figure A-8). This macro text specifies the project’s intermediate directory, which is a subfolder of the main project folder.
8. Click OK.

Figure A-5. Property Pages dialog box (Whole Program Optimization)

Figure A-6. Property Pages dialog box (Enable Enhanced Instruction Set)

Figure A-7. Property Pages dialog box (Assembler Output)

Figure A-8. Property Pages dialog box (Microsoft Macro Assembler Listing File)
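The effect of the Enable Enhanced Instruction Set property can be observed from C++ code, since the compiler predefines the macro __AVX__ when /arch:AVX (or a higher setting) is in effect. The sketch below is a hypothetical helper, not part of the book's examples, that reports the highest level the C++ compiler was allowed to use. It says nothing about the instructions in an assembly language file, which MASM assembles regardless of this setting.

```cpp
#include <string>

// Reports the vector instruction level that the C++ compiler targeted.
// Visual C++ defines __AVX__, __AVX2__, and __AVX512F__ according to the
// /arch option; GCC and Clang define the same macros for -mavx and so on.
inline std::string CompilerVectorLevel()
{
#if defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#else
    return "baseline";
#endif
}
```

In a project configured with the steps above, CompilerVectorLevel() would be expected to return "AVX".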

Edit the Source Code

Use the following steps to edit the project source code:

1. In the Editor window, click on the tab named Example1.cpp.
2. Edit the C++ source code to match the code that’s shown in Listing A-1.
3. Click on the tab named Example1_.asm.
4. Edit the assembly language source code to match the code that’s shown in Listing A-2.
5. Select File | Save All.

// Example1.cpp : Defines the entry point for the console application.

#include "stdafx.h"
#include <iostream>

using namespace std;

extern "C" int CalcResult1_(int val1, int val2, int* quo, int* rem);

int main()
{
    int val1 = 42;
    int val2 = 9;
    int quo;
    int rem;
    int prod = CalcResult1_(val1, val2, &quo, &rem);

    cout << "Results for Example1\n";
    cout << "val1 = " << val1 << '\n';
    cout << "val2 = " << val2 << '\n';
    cout << "quo = " << quo << '\n';
    cout << "rem = " << rem << '\n';
    cout << "prod = " << prod << '\n';

    return 0;
}

Listing A-1. Example1.cpp

; extern "C" int CalcResult1_(int val1, int val2, int* quo, int* rem);

        .code
CalcResult1_ proc
        mov r10d,ecx                    ;r10d = val1
        mov r11d,edx                    ;r11d = val2

        mov eax,ecx                     ;eax = val1
        cdq                             ;edx:eax = val1
        idiv r11d                       ;calc val1 / val2
        mov dword ptr [r8],eax          ;save quotient
        mov dword ptr [r9],edx          ;save remainder

        imul r10d,r11d                  ;r10d = val1 * val2
        mov eax,r10d                    ;eax = val1 * val2
        ret
CalcResult1_ endp
        end

Listing A-2. Example1_.asm
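To make the purpose of the assembly language function easier to follow, here is a rough C++ equivalent. It is a sketch for comparison only; the name CalcResult1Cpp is hypothetical and the function is not part of the book's project. In Listing A-2, the four arguments arrive in registers ECX, EDX, R8, and R9, and the return value is left in EAX, as the comments in that listing indicate.

```cpp
#include <cassert>

// C++ counterpart of CalcResult1_ (for comparison only).
// Stores val1 / val2 and val1 % val2 through the pointers and
// returns val1 * val2.
inline int CalcResult1Cpp(int val1, int val2, int* quo, int* rem)
{
    // idiv raises a divide error for a zero divisor; this sketch
    // treats that case as a precondition violation instead.
    assert(val2 != 0);

    *quo = val1 / val2;     // idiv leaves the quotient in eax
    *rem = val1 % val2;     // idiv leaves the remainder in edx
    return val1 * val2;     // imul result is returned in eax
}
```

For the values used in Listing A-1 (val1 = 42 and val2 = 9), this function yields a quotient of 4, a remainder of 6, and a product of 378. Both C++ integer division and idiv truncate toward zero, so negative operands produce matching results as well. One difference is that imul simply wraps on overflow, while signed overflow in C++ is undefined behavior.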

Build and Run the Project

Use the following steps to build and run the project:

1. Select Build | Build Solution.
2. If necessary, fix any reported C++ compiler or MASM errors and repeat Step 1.
3. Select Debug | Start Without Debugging.
4. Verify that the output matches the console window shown in Figure A-9.
5. Press Enter to close the console window.

Figure A-9. Console window output

References

This section contains a list of references that were consulted during preparation of the main text. It also includes additional references and resources that provide worthwhile information. The references have been grouped into the following categories:

X86 programming reference manuals
X86 programming and microarchitecture references
Ancillary resources
Algorithm references
C++ references

X86 Programming Reference Manuals

The following is a list of x86 programming reference manuals published by AMD and Intel:

AMD64 Architecture Programmer’s Manual Volume 1: Application Programming, https://support.amd.com/TechDocs/24592.pdf
AMD64 Architecture Programmer’s Manual Volume 3: General Purpose and System Instructions, https://support.amd.com/TechDocs/24594.pdf
AMD64 Architecture Programmer’s Manual Volume 4: 128-bit and 256-bit Media Instructions, https://support.amd.com/TechDocs/26568.pdf
Software Optimization Guide for AMD Family 17h Processors, Publication Number 55723, June 2017, https://developer.amd.com/resources/developer-guides-manuals
Intel 64 and IA-32 Architectures Software Developer’s Manual, Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4, https://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
Intel 64 and IA-32 Architectures Optimization Reference Manual, https://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
Intel Architecture Instruction Set Extensions and Future Features Programming Reference, https://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html

X86 Programming and Microarchitecture References

The following resources contain useful information about x86 assembly language programming, processors, and microarchitectures:

Guy Ben-Haim, Itai Neoran, and Ishay Tubi, Practical Intel AVX Optimization on 2nd Generation Intel Core Processors, https://software.intel.com/sites/default/files/m/d/4/1/d/8/Practical_Optimization_with_AVX.pdf
Ian Cutress, The Intel Skylake Mobile and Desktop Launch, with Architecture Analysis, September 2015, https://www.anandtech.com/show/9582/intel-skylake-mobile-desktop-launch-architecture-analysis
Ian Cutress, The Intel Skylake-X Review: Core i9-7900X, i7-7820X and i7-7800X Tested, June 2017, https://www.anandtech.com/show/11550/the-intel-skylakex-review-core-i9-7900x-i7-7820x-and-i7-7800x-tested
Agner Fog, The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers, August 2018, https://agner.org/optimize/#manuals
Agner Fog, Optimizing subroutines in assembly language: An optimization guide for x86 platforms, April 2018, https://agner.org/optimize/#manuals
Chris Kirkpatrick, Intel AVX State Transitions: Migrating SSE Code to AVX, https://software.intel.com/en-us/articles/intel-avx-state-transitions-migrating-sse-code-to-avx
Patrick Konsor, Avoiding AVX-SSE Transition Penalties, https://software.intel.com/en-us/articles/avoiding-avx-sse-transition-penalties
Patrick Konsor, Performance Benefits of Half-Precision Floats, https://software.intel.com/en-us/articles/performance-benefits-of-half-precision-floats
Daniel Kusswurm, Modern x86 Assembly Language Programming, Apress, ISBN 978-1-4842-0065-0, 2014
Max Locktyukhin, How to Detect New Instruction Support in the 4th Generation Intel Core Processor Family, August 2013, https://software.intel.com/en-us/node/405250
John Morgan, Microsoft Visual Studio 2017 Supports Intel AVX-512, https://blogs.msdn.microsoft.com/vcblog/2017/07/11/microsoft-visual-studio-2017-supports-intel-avx-512
Erdinc Ozturk, James Guilford, Vinodh Gopal, and Wajdi Feghali, New Instructions Supporting Large Integer Arithmetic on Intel Architecture Processors, August 2012, https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-large-integer-arithmetic-paper.pdf
James Reinders, AVX-512 May Be a Hidden Gem in Intel Xeon Scalable Processors, June 2017, https://www.hpcwire.com/2017/06/29/reinders-avx-512-may-hidden-gem-intel-xeon-scalable-processors
Anand Lal Shimpi, Intel's Haswell Architecture Analyzed: Building a New PC and a New Intel, October 2012, http://www.anandtech.com/show/6355/intels-haswell-architecture

Ancillary Resources

The following resources contain useful information about x86 processors and microarchitectures:

Processors for Desktops, AMD, https://www.amd.com/en/products/processors-desktop
List of AMD Accelerated Processing Unit Microprocessors, Wikipedia, https://en.wikipedia.org/wiki/List_of_AMD_Accelerated_Processing_Unit_microprocessors
List of AMD CPU Microarchitectures, Wikipedia, https://en.wikipedia.org/wiki/List_of_AMD_CPU_microarchitectures
List of AMD Microprocessors, Wikipedia, https://en.wikipedia.org/wiki/List_of_AMD_processors
Product Information Website, Intel, https://ark.intel.com
List of Intel CPU Microarchitectures, Wikipedia, https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures
List of Intel Microprocessors, Wikipedia, https://en.wikipedia.org/wiki/Intel_processor
List of Intel Xeon Microprocessors, Wikipedia, https://en.wikipedia.org/wiki/List_of_Intel_Xeon_microprocessors
Register Renaming, Wikipedia, https://en.wikipedia.org/wiki/Register_renaming

Algorithm References

The following resources were consulted to develop the algorithms used in the source code examples:

Forman S. Acton, REAL Computing Made REAL – Preventing Errors in Scientific and Engineering Calculations, ISBN 978-0486442211, Dover Publications, 2005
Tony Chan, Gene Golub, Randall LeVeque, Algorithms for Computing the Sample Variance: Analysis and Recommendations, The American Statistician, Volume 37 Number 3 (1983), p. 242-247
James F. Epperson, An Introduction to Numerical Methods and Analysis, Second Edition, ISBN 978-1-118-36759-9, Wiley, 2013
David Goldberg, What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, Volume 23 Issue 1 (March 1991), p. 5-48
Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, Fourth Edition, ISBN 978-0-133-35672-4, 2018
James E. Miller, David G. Moursund, Charles S. Duris, Elementary Theory & Application of Numerical Analysis, Revised Edition, ISBN 978-0486479064, Dover Publications, 2011
Anthony Pettofrezzo, Matrices and Transformations, ISBN 0-486-63634-8, Dover Publications, 1978
Hans Schneider and George Barker, Matrices and Linear Algebra, ISBN 0-486-66014-1, Dover Publications, 1989
Eric W. Weisstein, Convolution, MathWorld, http://mathworld.wolfram.com/Convolution.html
Eric W. Weisstein, Correlation Coefficient, MathWorld, http://mathworld.wolfram.com/CorrelationCoefficient.html
Eric W. Weisstein, Cross Product, MathWorld, http://mathworld.wolfram.com/CrossProduct.html
Eric W. Weisstein, Least Squares Fitting, MathWorld, http://mathworld.wolfram.com/LeastSquaresFitting.html
Eric W. Weisstein, Matrix Multiplication, MathWorld, http://mathworld.wolfram.com/MatrixMultiplication.html
David M. Young and Robert Todd Gregory, A Survey of Numerical Mathematics, Volume 1, ISBN 0-486-65691-8, Dover Publications, 1988
Algorithms for calculating variance, Wikipedia, https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
Body Surface Area Calculator, http://www.globalrph.com/bsa2.htm
Grayscale, Wikipedia, https://en.wikipedia.org/wiki/Grayscale
Linked List, Wikipedia, https://en.wikipedia.org/wiki/Linked_list

C++ References

The following resources contain valuable information about C++ programming, the C++ Standard Template Libraries, and C++ programming using multiple threads.

Ivor Horton, Using the C++ Standard Template Libraries, Apress, ISBN 978-1-4842-0005-6, 2015
Nicolai M. Josuttis, The C++ Standard Library – A Tutorial and Reference, Second Edition, Addison Wesley, ISBN 978-0-321-62321-8, 2012
Bjarne Stroustrup, The C++ Programming Language, Fourth Edition, Addison Wesley, ISBN 978-0-321-56384-2, 2013
Anthony Williams, C++ Concurrency in Action – Practical Multithreading, ISBN 978-1-933-98877-1, Manning Publications, 2012
cplusplus.com, http://www.cplusplus.com

Index

A

Advanced Vector Extensions (AVX)

data types

packed floating-point

packed integer

scalar floating-point

differences between x86-SSE

execution lanes

intermixing x86-AVX and x86-SSE code

operand alignment

vzeroupper

YMM register high-order bit zeroing

instruction syntax

non-destructive source operand

registers

MXCSR

XMM registers

YMM registers

AVX2

MXCSR

non-destructive source operand

operand alignment

packed floating-point

packed integer

variable bit shift

XMM registers

YMM registers

AVX-512

conditional execution and merging

merge masking

zero masking

data types

embedded broadcast

instruction-level rounding

round down

round to nearest

round to zero

round up

suppress all exceptions

instruction set extensions

AVX512CD

AVX512BW

AVX512DQ

AVX512F

AVX512VL

instruction syntax

conditional execution and merging

embedded broadcast

instruction-level rounding

merge masking

predicate mask

MXCSR

opmask registers

XMM registers

YMM registers

ZMM registers

zero masking

Array of structures (AOS)

Array operations

column means

row-major ordering

least squares

min-max

simple calculations

square roots

Arrays

accessing elements

comparing

one-dimensional

reversal

row-major ordering

two-dimensional

row and column indices

B

Benchmark timing measurements

csv file

TRIMMEAN

C

C++

classes

AlignedArray

AlignedMem

array

BmThreadTimer

default_random_engine

ImageMatrix

matrix

mutex

thread

uniform_int_distribution

unique_ptr

lvalue

rvalue

size_t

specifiers

alignas

Cache

cache line

L1 data (D-Cache)

L1 instruction (I-Cache)

non-temporal data

pollution

slice

temporal data

Conditional jump

Conditional move

Condition codes

Convolution

discrete equation

input signal

padding

kernel

fixed size

variable size

output signal

response signal

SIMD equations

theory

YMM registers

ZMM registers

Correlation coefficient

CPU Identification (CPUID)

AVX-512 feature flags

feature flag

host operating system

OSXSAVE

leaf value

memory caches

return results

serializing instruction

sub-leaf value

xgetbv

D

Data blend

Data gather

indices

doubleword

quadword

merge control mask

vector scale-index-base

Data permute

indices

Data prefetch

hint

linked list

Differences between x86-32 and x86-64 programming

byte register restrictions

deprecated instructions

immediate operands

32-bit

invalid instructions

operand sizes

E

Enhanced bit manipulation

leading zero bits

trailing zero bits

F, G

Feature set identification

See CPUID

Flagless operations

multiplication

shift

FMA

See Fused-Multiply-Add (FMA)

FMA3

See Fused-Multiply-Add (FMA)

FMA4

See Fused-Multiply-Add (FMA)

Fundamental data types

byte

double quadword

doubleword

little endian ordering

proper alignment

quadword

word

Fused-Multiply-Add (FMA)

arithmetic

convolution functions

packed

scalar

data dependencies

multiple registers

operand ordering scheme

packed

rounding

MXCSR.RC

scalar

value discrepancies

H

Half-precision floating-point

encoding

exponent

sign bit

significand

F16C

Half-precision floating-point conversions

rounding mode

I

IEEE 754

binary encoding

exponent

sign bit

significand

special values

denormal

floating-point zero

infinity

NaN

QNaN

SNaN

Image processing

image histogram

image statistics

mean

standard deviation

image thresholding

mask image

pixel clipping

pixel conversions

instruction-level rounding

size reduction

pixel mean

pixel minimum-maximum

RGB pixel min-max values

macro text string

RGB to grayscale conversion

color conversion coefficients

size reduction

weighted sum

thresholding

mask image

Instruction operands

immediate

memory

Instruction pipeline

allocate rename block

branch prediction unit

decoded instruction cache

execution engine

execution unit

instruction decoder

instruction fetch and pre-decode

instruction queue

loop stream detector

micro-op instruction queue

retire unit

scheduler

Instruction set extensions

ADX

BMI1

BMI2

F16C

FMA

LZCNT

POPCNT

Integer arithmetic

addition

division

logical operations

mixed sizes

multiplication

shift operations

subtraction

J, K

Jump table

L

Linked list

node

data

end-of-list terminator

link

Loop unrolling

M

MASM

See Microsoft Macro Assembler (MASM)

Matrix operations

inverse

Cayley-Hamilton theorem

multiplication

transposition

Matrix-vector multiplication

equations

permutation of vector components

Memory addressing modes

base register

base register + disp

base register + index register

base register + index register + disp

base register + index register * scale factor

base register + index register * scale factor + disp

effective address calculation

index * scale factor + disp

RIP + disp (RIP relative)

RIP relative

Microarchitecture

Coffee Lake

Haswell

Kaby Lake

Skylake

Skylake Server

Micro-op

macro-fusion

micro-fusion

Microsoft Macro Assembler (MASM)

comment line

custom segment

directive

align

.allocstack

bcst

byte ptr

catstr

.code

.const

.data

dup

dword

dword ptr

endp

.endprolog

ends

equ

.erridni

macro

proc

proc frame

.pushreg

qword

qword ptr

readonly

real4

real8

.savexmm128

segment

.setframe

substr

word ptr

xmmword ptr

ymmword ptr

zmmword ptr

label

location counter ($)

macro text string

Miscellaneous data types

bit field

bit string

string

Multithreading

data arrays

MXCSR

control flags

rounding control

rounding mode

status flags

N

Non-temporal memory store

arrays

hint

Numeric data types

floating-point

double-precision

single-precision

signed integers

unsigned integers

O

Optimization

basic techniques

data alignment

multi-byte values

packed floating-point

packed integer

floating-point arithmetic

denormals

loop unrolling

precision

program branches

backward conditional

branch prediction

forward conditional

loop unrolling

SIMD techniques

P, Q

Packed floating-point arithmetic

common operations

addition

compares

conversions

division

multiplication

subtraction

compares

conversions

unsigned integer

logical decisions

operations

absolute value

addition

division

multiplication

square root

subtraction

Packed integer arithmetic

basic arithmetic

doubleword

word

common operations

addition

multiplication

shifts

subtraction

operations

addition

shifts

subtraction

pack and unpack

size promotions

sign extended

zero extended

R

Registers

general purpose

8-bit

16-bit

32-bit

64-bit

MXCSR

RFLAGS

carry

direction

overflow

parity

sign

zero

RIP (instruction pointer)

RSP (stack pointer)

XMM

YMM

ZMM

RFLAGS

See Registers

Ring interconnect

S, T, U

Scalar floating-point arithmetic

arrays

double-precision

matrices

operations

addition

compares

conversions

division

multiplication

square root

subtraction

single-precision

SIMD

See Single Instruction Multiple Data (SIMD)

Single Instruction Multiple Data (SIMD)

arithmetic

horizontal addition

horizontal subtraction

packed floating-point

packed integer

saturated

wraparound

data types

xmmword

ymmword

zmmword

programming concepts

Smoothing operator

Gaussian filter

coefficients

Strings

concatenation

counting characters

direction flag

end-of-string character

Structure

member alignment

padding

Structure of arrays (SOA)

System agent

V, W

Vector cross product

component equation

gather

opmask register

scatter

Vector scale-index-base (VSIB)

See AVX2

Visual C++

calling convention

epilog macros

floating-point argument

floating-point return value

function epilog

function prolog

general-purpose register

integer argument

leaf function

local storage

non-leaf function

non-volatile register

prolog macros

returning structures by value

return value

stack alignment

stack arguments

stack frame

stack layout

volatile register

XMM register

ZMM registers

decorated name

extern “C” modifier

X

XmmVal

Y

YmmVal

Z

ZmmVal
