Shellcode must dereference a base pointer to access its data in a position-independent manner. By adding offsets to or subtracting offsets from this base, it can safely reach data that is included with the shellcode. Because the x86 instruction set provides EIP-relative addressing for control-flow instructions but not for data access, a general-purpose register must first be loaded with the current instruction pointer to serve as the base pointer.
Obtaining the current instruction pointer may not be immediately obvious, because the
instruction pointer on x86 systems cannot be directly accessed by software. In fact, there is no way
to assemble the instruction mov eax, eip to directly load a
general-purpose register with the current instruction pointer. However, shellcode uses two popular
techniques to address this issue: call/pop and fnstenv instructions.
When a call instruction is executed, the processor pushes
the address of the instruction following the call onto the stack,
and then branches to the requested location. The called function executes, and when it completes, it
executes a ret instruction to pop the return address off the top
of the stack and load it into the instruction pointer. As a result, execution returns to the
instruction just after the call.
Shellcode can abuse this convention by immediately executing a pop instruction after a call, which will load the
address immediately following the call into the specified
register. Example 19-1 shows a simple Hello World example
that uses this technique.
Example 19-1. call/pop Hello World example
Bytes                       Disassembly
83 EC 20                    sub     esp, 20h
31 D2                       xor     edx, edx
E8 0D 00 00 00              call    sub_17 ❶
48 65 6C 6C 6F              db 'Hello World!',0 ❷
20 57 6F 72 6C 64 21 00
                        sub_17:
5F                          pop     edi ❸               ; edi gets string pointer
52                          push    edx                 ; uType: MB_OK
57                          push    edi                 ; lpCaption
57                          push    edi                 ; lpText
52                          push    edx                 ; hWnd: NULL
B8 EA 07 45 7E              mov     eax, 7E4507EAh      ; MessageBoxA
FF D0                       call    eax ❹
52                          push    edx                 ; uExitCode
B8 FA CA 81 7C              mov     eax, 7C81CAFAh      ; ExitProcess
FF D0                       call    eax ❺
The call at ❶ transfers control to sub_17 at ❸. This is PIC because the call
instruction uses an EIP-relative value (0x0000000D) to calculate
the call target. The pop
instruction at ❸ loads the address stored on top of the
stack into EDI.
Remember that the EIP value saved by the call instruction
points to the location immediately following the call, so after
the pop instruction, EDI will contain a pointer to the db declaration at ❷. This
db declaration is assembly language syntax to create a sequence
of bytes to spell out the string Hello World!. After the pop at ❸, EDI will point to
this Hello World! string.
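The arithmetic behind the call/pop trick can be checked with a small model. The following is a minimal Python sketch, not part of the original sample: it assumes the shellcode is loaded at base address 0, and the helper names simulate_call_pop and read_c_string are hypothetical. It decodes the rel32 operand of the call, models the push of the return address, and confirms that the popped value points at the string.

```python
import struct

# Bytes of Example 19-1 up to and including the pop edi instruction.
SHELLCODE = bytes.fromhex(
    "83EC20"                      # sub esp, 20h
    "31D2"                        # xor edx, edx
    "E80D000000"                  # call sub_17
    "48656C6C6F20576F726C642100"  # db 'Hello World!',0
    "5F"                          # pop edi
)


def simulate_call_pop(code, call_offset, base=0):
    """Model a near call (opcode E8) followed by a pop at the call target.

    Returns (target, popped) as addresses, assuming the code is loaded at `base`.
    """
    if code[call_offset] != 0xE8:
        raise ValueError("expected an E8 near call at call_offset")
    # rel32 is a signed little-endian displacement that follows the opcode.
    (rel32,) = struct.unpack_from("<i", code, call_offset + 1)
    return_address = base + call_offset + 5  # instruction after the call
    stack = [return_address]                 # call pushes the return address
    target = return_address + rel32          # displacement is relative to it
    popped = stack.pop()                     # pop at the target reads it back
    return target, popped


def read_c_string(code, offset):
    """Read a NUL-terminated ASCII string starting at `offset`."""
    end = code.index(b"\x00", offset)
    return code[offset:end].decode("ascii")


target, popped = simulate_call_pop(SHELLCODE, call_offset=5)
print(hex(target), hex(popped), read_c_string(SHELLCODE, popped))
```

With a base of 0, the call target works out to 0x17 (the pop edi byte) and the popped value to 0x0A, the first byte of the string. At a different load address both values shift by the same amount, which is why the technique is position independent.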
This method of intermingling code and data is normal for shellcode, but it can easily confuse
disassemblers that try to interpret the data following the call
instruction as code. The result is either nonsensical disassembly or, if invalid opcode
combinations are encountered, a disassembly process that halts completely. As seen in Chapter 15, call/pop pairs that obtain pointers to data may also be incorporated into larger programs as an
additional anti-reverse-engineering technique.
The remaining code calls MessageBoxA ❹ to show the “Hello World!” message, and
then ExitProcess ❺ to cleanly exit. This sample uses hard-coded locations
for both function calls because imported functions in shellcode are not automatically resolved by
the loader, but hard-coded locations make this code fragile. (These addresses come from a Windows XP
SP3 box, and may differ from yours.)
To find these function addresses with OllyDbg, open any process and press CTRL-G to bring up the Enter Expression to Follow dialog. Enter MessageBoxA in the dialog and press
ENTER. The debugger should show the location of the function, as
long as the library with this export (user32.dll) is loaded by the process
being debugged.
To load and step through this example with shellcode_launcher.exe, enter the following at the command line:
shellcode_launcher.exe -i helloworld.bin -bp -L user32
The -L user32 option is required because the shellcode does
not call LoadLibraryA, so
shellcode_launcher.exe must make sure this library is loaded. The -bp option inserts a breakpoint instruction just prior to jumping to the
shellcode binary specified with the -i option. Recall that
debuggers can be registered for just-in-time debugging and can be launched automatically (or when
prompted) when a program encounters a breakpoint. If a debugger such as OllyDbg has been registered
as a just-in-time debugger, it will open and attach to the process that encountered a breakpoint.
This allows you to skip over the contents of the shellcode_launcher.exe program
and begin at the start of the shellcode binary.
You can set OllyDbg as your just-in-time debugger by selecting Options ▶ Just-in-time Debugging ▶ Make OllyDbg Just-in-time Debugger.
Readers who wish to execute this example may need to modify the hard-coded function
locations for MessageBoxA and ExitProcess. These addresses can
be found as described in the text. Once the addresses have been found, you can patch
helloworld.bin within OllyDbg by placing the cursor on the instruction that loads the
hard-coded function location into register EAX and then pressing the spacebar. This brings up
OllyDbg’s Assemble At dialog, which allows you to enter your own assembly code. This will be
assembled by OllyDbg and overwrite the current instruction. Simply replace the 7E4507EAh
value with the correct value from your machine, and OllyDbg will patch the program in
memory, allowing the shellcode to execute correctly.
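The same patch can also be applied to the bytes directly rather than through the debugger. The sketch below is one possible approach in Python, not the procedure described above: the helper name patch_mov_eax_imm32 is hypothetical, and 0x12345678 is a made-up placeholder address that must be replaced with the value found on your own machine.

```python
import struct


def patch_mov_eax_imm32(code, offset, new_address):
    """Return a copy of `code` with the imm32 of `mov eax, imm32` replaced.

    `offset` must point at the B8 opcode byte of the instruction.
    """
    if code[offset] != 0xB8:
        raise ValueError("expected opcode B8 (mov eax, imm32) at offset")
    patched = bytearray(code)
    # x86 immediates are stored little-endian, directly after the opcode.
    struct.pack_into("<I", patched, offset + 1, new_address)
    return bytes(patched)


# mov eax, 7E4507EAh as it appears in the call/pop Hello World listing.
original = bytes.fromhex("B8EA07457E")
# 0x12345678 is a placeholder, not a real MessageBoxA address.
patched = patch_mov_eax_imm32(original, 0, 0x12345678)
print(patched.hex())  # b878563412
```

Checking for the B8 opcode before writing guards against patching the wrong offset, which would silently corrupt a neighboring instruction.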
The x87 floating-point unit (FPU) provides a separate execution environment within the normal
x86 architecture. It contains a separate set of special-purpose registers that need to be saved by
the OS on a context switch when a process is performing floating-point arithmetic with the FPU.
Example 19-2 shows the 28-byte structure used by the fstenv and fnstenv instructions to
store the state of the FPU to memory when executing in 32-bit protected mode.
Example 19-2. FpuSaveState structure definition
struct FpuSaveState {
    uint32_t    control_word;
    uint32_t    status_word;
    uint32_t    tag_word;
    uint32_t    fpu_instruction_pointer;
    uint16_t    fpu_instruction_selector;
    uint16_t    fpu_opcode;
    uint32_t    fpu_operand_pointer;
    uint16_t    fpu_operand_selector;
    uint16_t    reserved;
};
The only field that matters for use here is fpu_instruction_pointer at byte offset 12. This will contain the address of the last CPU
instruction that used the FPU, providing context information for exception handlers to identify
which FPU instructions may have caused a fault. This field is required because the FPU is running in
parallel with the CPU. If the FPU generates an exception, the exception handler cannot simply look
at the interrupt return address to identify the instruction that caused the fault.
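The layout of this structure can be double-checked with Python's struct module. This is a minimal sketch that mirrors the field order of the FpuSaveState definition; the helper names field_offset and parse_fpu_env are hypothetical, and the sample field values are made up purely for illustration.

```python
import struct

# One format character per field, in declaration order: I = uint32, H = uint16.
FPU_ENV_CODES = "IIIIHHIHH"
FPU_ENV_FIELDS = (
    "control_word",
    "status_word",
    "tag_word",
    "fpu_instruction_pointer",
    "fpu_instruction_selector",
    "fpu_opcode",
    "fpu_operand_pointer",
    "fpu_operand_selector",
    "reserved",
)
FPU_ENV_FORMAT = "<" + FPU_ENV_CODES  # little-endian, no padding


def field_offset(name):
    """Byte offset of a field: the packed size of every field before it."""
    index = FPU_ENV_FIELDS.index(name)
    return struct.calcsize("<" + FPU_ENV_CODES[:index])


def parse_fpu_env(blob):
    """Unpack a 28-byte save area into a dict keyed by field name."""
    return dict(zip(FPU_ENV_FIELDS, struct.unpack(FPU_ENV_FORMAT, blob)))


# Made-up sample values; only the layout matters here.
sample = struct.pack(FPU_ENV_FORMAT, 0x027F, 0, 0xFFFF, 0x0040101C, 0x1B, 0, 0, 0x23, 0)
print(struct.calcsize(FPU_ENV_FORMAT))            # 28
print(field_offset("fpu_instruction_pointer"))    # 12
print(hex(parse_fpu_env(sample)["fpu_instruction_pointer"]))
```

The total size comes out to 28 bytes and fpu_instruction_pointer to offset 12, matching the values given in the text.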
Example 19-3 shows the disassembly of another Hello World
program that uses fnstenv to obtain the EIP value.
Example 19-3. fnstenv Hello World example
Bytes                       Disassembly
83 EC 20                    sub     esp, 20h
31 D2                       xor     edx, edx
EB 15                       jmp     short loc_1C
EA 07 45 7E                 dd 7E4507EAh                    ; MessageBoxA
FA CA 81 7C                 dd 7C81CAFAh                    ; ExitProcess
48 65 6C 6C 6F              db 'Hello World!',0
20 57 6F 72 6C 64 21 00
                        loc_1C:
D9 EE                       fldz ❶
D9 74 24 F4                 fnstenv byte ptr [esp-0Ch] ❷
5B                          pop     ebx ❸                   ; ebx points to fldz
8D 7B F3                    lea     edi, [ebx-0Dh] ❹        ; load HelloWorld pointer
52                          push    edx                     ; uType: MB_OK
57                          push    edi                     ; lpCaption
57                          push    edi                     ; lpText
52                          push    edx                     ; hWnd: NULL
8B 43 EB                    mov     eax, [ebx-15h] ❺        ; load MessageBoxA
FF D0                       call    eax                     ; call MessageBoxA
52                          push    edx                     ; uExitCode
8B 43 EF                    mov     eax, [ebx-11h] ❻        ; load ExitProcess
FF D0                       call    eax                     ; call ExitProcess
The fldz instruction at ❶ pushes the floating-point number 0.0 onto the FPU stack. The fpu_instruction_pointer value is updated within the FPU to point to the fldz instruction.
Performing the fnstenv at ❷ stores the FpuSaveState structure onto the stack starting at
[esp-0Ch]. Because fpu_instruction_pointer sits at byte offset 12 (0Ch) in that structure, the field
lands exactly at [esp], so the pop at ❸ loads EBX with
the fpu_instruction_pointer value. Once the pop executes, EBX will contain a value that points to the location of the
fldz instruction in memory. The shellcode then starts using EBX
as a base register to access the data embedded in the code.
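To see why a plain pop retrieves the instruction pointer, the stack write can be modeled in a few lines. The following Python sketch is a simplification under stated assumptions: memory is a small bytearray, every field other than fpu_instruction_pointer is zeroed, and the function name fnstenv_then_pop and the sample ESP value are hypothetical.

```python
import struct

FPU_ENV_FORMAT = "<IIIIHHIHH"  # 28-byte FpuSaveState layout, little-endian


def fnstenv_then_pop(esp, fldz_address, memory_size=0x100):
    """Model `fnstenv [esp-0Ch]` followed by `pop reg`.

    Returns (popped_value, new_esp). Only fpu_instruction_pointer is filled in.
    """
    memory = bytearray(memory_size)
    env = struct.pack(FPU_ENV_FORMAT, 0, 0, 0, fldz_address, 0, 0, 0, 0, 0)
    start = esp - 0x0C
    if start < 0 or start + len(env) > memory_size:
        raise ValueError("save area does not fit in the modeled memory")
    # fnstenv writes the whole structure beginning 12 bytes below ESP.
    memory[start:start + len(env)] = env
    # pop reads the dword at [esp], then adds 4 to ESP.
    (popped,) = struct.unpack_from("<I", memory, esp)
    return popped, esp + 4


popped, new_esp = fnstenv_then_pop(esp=0x40, fldz_address=0x1C)
print(hex(popped), hex(new_esp))
```

The structure begins at ESP minus 12 and the field of interest sits 12 bytes into it, so the two offsets cancel and the field occupies the dword at the top of the stack.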
As in the previous Hello World example, which used the call/pop technique, this code calls MessageBoxA and ExitProcess using
hard-coded locations, but here the function locations are stored as data along with the ASCII string
to print. The lea instruction at ❹ loads the address of the Hello World! string by subtracting 0x0d from the address of the
fldz instruction stored in EBX. The mov instruction at ❺ loads the first function
location for MessageBoxA, and the mov instruction at ❻ loads the second
function location for ExitProcess.
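The negative offsets used with EBX can be verified against the raw bytes. This is a minimal Python sketch that assumes the shellcode is loaded at base address 0, so that EBX equals the file offset of fldz; the helper names jmp_short_target, read_dword, and read_c_string are hypothetical.

```python
import struct

# Bytes of Example 19-3 up to and including the pop ebx instruction.
SHELLCODE = bytes.fromhex(
    "83EC20"                      # sub esp, 20h
    "31D2"                        # xor edx, edx
    "EB15"                        # jmp short loc_1C
    "EA07457E"                    # dd 7E4507EAh ; MessageBoxA
    "FACA817C"                    # dd 7C81CAFAh ; ExitProcess
    "48656C6C6F20576F726C642100"  # db 'Hello World!',0
    "D9EE"                        # fldz
    "D97424F4"                    # fnstenv [esp-0Ch]
    "5B"                          # pop ebx
)


def jmp_short_target(code, offset):
    """Decode a short jump (opcode EB) and return its target offset."""
    if code[offset] != 0xEB:
        raise ValueError("expected an EB short jump at offset")
    (rel8,) = struct.unpack_from("<b", code, offset + 1)
    return offset + 2 + rel8  # rel8 is relative to the next instruction


def read_dword(code, offset):
    """Read a little-endian 32-bit value."""
    return struct.unpack_from("<I", code, offset)[0]


def read_c_string(code, offset):
    """Read a NUL-terminated ASCII string starting at `offset`."""
    end = code.index(b"\x00", offset)
    return code[offset:end].decode("ascii")


ebx = jmp_short_target(SHELLCODE, 5)            # offset of fldz (loc_1C)
text = read_c_string(SHELLCODE, ebx - 0x0D)     # lea edi, [ebx-0Dh]
message_box = read_dword(SHELLCODE, ebx - 0x15)   # mov eax, [ebx-15h]
exit_process = read_dword(SHELLCODE, ebx - 0x11)  # mov eax, [ebx-11h]
print(hex(ebx), text, hex(message_box), hex(exit_process))
```

Each displacement is simply the distance from the fldz instruction back to the corresponding data item, so the same offsets hold wherever the shellcode is loaded.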
Example 19-3 is a contrived example, but it is
common for shellcode to store or create function pointer arrays. We used the fldz
instruction in this example, but any non-control FPU instruction can be used.
This example can be executed using shellcode_launcher.exe with the following command:
shellcode_launcher.exe -i hellofstenv.bin -bp -L user32