Shellcode Encodings

In order to execute, the shellcode binary must be located somewhere in the program’s address space when it is triggered. When paired with an exploit, this means that the shellcode must be present before the exploit occurs or be passed along with the exploit. For example, if the program is performing some basic filtering on input data, the shellcode must pass this filter, or it will not be in the vulnerable process’s memory space. This means that shellcode often must look like legitimate data in order to be accepted by a vulnerable program.

One example is a program that uses the unsafe string functions strcpy and strcat, neither of which sets a maximum length on the data it writes. If a program reads or copies malicious data into a fixed-length buffer using either of these functions, the data can easily exceed the size of the buffer and lead to a buffer-overflow attack. These functions treat a string as an array of characters terminated by a NULL (0x00) byte. Shellcode that an attacker wants copied into this buffer must look like valid data, which means that it must not contain any NULL bytes in the middle that would prematurely end the string-copy operation.

Example 19-8 shows a small piece of disassembly of code used to access the registry, with seven NULL bytes in this selection alone. This code typically could not be used as-is in a shellcode payload.

Example 19-8. Typical code with highlighted NULL bytes

57                  push    edi
50                  push    eax             ; phkResult
6A 01               push    1               ; samDesired
8D 8B D0 13 00 00   lea     ecx, [ebx+13D0h]
6A 00               push    0               ; ulOptions
51                  push    ecx             ; lpSubKey
68 02 00 00 80      push    80000002h       ; hKey: HKEY_LOCAL_MACHINE
FF 15 20 00 42 00   call    ds:RegOpenKeyExA
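
As a rough check of the NULL-byte problem, the following Python sketch transcribes the opcode bytes from Example 19-8, counts the 0x00 bytes, and imitates how a strcpy-style copy would cut the data off. The helper name c_string_copy is hypothetical and only models the stop-at-first-NULL behavior; it is not how any particular program copies its input.

```python
# Opcode bytes transcribed from Example 19-8, one instruction per line.
EXAMPLE_19_8 = bytes.fromhex(
    "57"            # push edi
    "50"            # push eax
    "6A01"          # push 1
    "8D8BD0130000"  # lea  ecx, [ebx+13D0h]
    "6A00"          # push 0
    "51"            # push ecx
    "6802000080"    # push 80000002h
    "FF1520004200"  # call ds:RegOpenKeyExA
)


def c_string_copy(src: bytes) -> bytes:
    """Hypothetical model of strcpy: keep bytes up to the first NULL."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


null_count = EXAMPLE_19_8.count(0)
copied = c_string_copy(EXAMPLE_19_8)

print(f"total bytes:                  {len(EXAMPLE_19_8)}")
print(f"NULL bytes:                   {null_count}")
print(f"bytes kept by a strcpy model: {len(copied)}")
```

Under this model, only the bytes before the first NULL inside the lea instruction would reach the destination buffer, so the rest of the code would never arrive in memory.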

Programs may perform additional sanity checks on data that the shellcode must pass in order to succeed, such as the following:

All bytes are printable ASCII bytes (less than 0x80; strictly, 0x20 through 0x7E).
All bytes are alphanumeric (A through Z, a through z, or 0 through 9).
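
The checks listed above can be sketched as simple predicates over a byte string. This is an illustration only: the function names are hypothetical, and "printable" is assumed here to mean the ASCII range 0x20 through 0x7E, which is stricter than "less than 0x80". A real program's filter may differ.

```python
import string

# Byte values for A-Z, a-z, and 0-9.
ALPHANUMERIC = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def is_null_free(data: bytes) -> bool:
    """True if a C string function would copy the data in full."""
    return 0 not in data


def is_printable_ascii(data: bytes) -> bool:
    """True if every byte is in 0x20..0x7E (assumed meaning of 'printable')."""
    return all(0x20 <= b <= 0x7E for b in data)


def is_alphanumeric(data: bytes) -> bool:
    """True if every byte is an ASCII letter or digit."""
    return all(b in ALPHANUMERIC for b in data)


# "push 1" (6A 01) is NULL-free but fails both stricter filters.
sample = bytes.fromhex("6A01")
print(is_null_free(sample), is_printable_ascii(sample), is_alphanumeric(sample))
```

Each filter is a subset of the previous one, so shellcode that passes the alphanumeric check also passes the printable and NULL-free checks.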

To overcome filtering limitations imposed by the vulnerable program, nearly all shellcode encodes the main payload to pass the vulnerable program’s filter and inserts a decoder that turns the encoded payload into executable bytes. Only the small decoder section must be written carefully so that its instruction bytes will pass the strict filter requirements; the rest of the payload can be encoded at compile time to also pass the filter. If the shellcode writes the decoded bytes back on top of the encoded bytes (as is usual), the shellcode is self-modifying. When the decoding is complete, the decoder transfers control to the main payload to execute.

The following are common encoding techniques:

XOR all payload bytes with a constant byte mask. Remember that for all values a and b of the same size, (a XOR b) XOR b == a.
Use an alphabetic transform where each byte of payload is split into two 4-bit nibbles, and each nibble is added to a printable ASCII character (such as A or a).
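
Both techniques can be sketched in a few lines of Python. This is a minimal illustration of the arithmetic, not a shellcode encoder: the function names are hypothetical, the sample payload is two push instructions taken from Example 19-8, and the key search assumes the only constraint is that the encoded output contain no NULL bytes.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte mask; applying it twice restores data."""
    return bytes(b ^ key for b in data)


def find_null_free_key(data: bytes):
    """Return the smallest nonzero key whose output has no 0x00, or None.

    b ^ key == 0 only when b == key, so any byte value that is absent
    from the payload is a usable key.
    """
    present = set(data)
    for key in range(1, 256):
        if key not in present:
            return key
    return None


def nibble_encode(data: bytes, base: int = ord("A")) -> bytes:
    """Split each byte into two nibbles and add each to a base character."""
    out = bytearray()
    for b in data:
        out.append(base + (b >> 4))    # high nibble
        out.append(base + (b & 0x0F))  # low nibble
    return bytes(out)


def nibble_decode(encoded: bytes, base: int = ord("A")) -> bytes:
    """Reverse nibble_encode by recombining each pair of characters."""
    pairs = zip(encoded[0::2], encoded[1::2])
    return bytes(((hi - base) << 4) | (lo - base) for hi, lo in pairs)


payload = bytes.fromhex("6A006802000080")  # push 0; push 80000002h

key = find_null_free_key(payload)
xored = xor_bytes(payload, key)
print("XOR key:", hex(key), "encoded:", xored.hex())
print("round trip ok:", xor_bytes(xored, key) == payload)

letters = nibble_encode(payload)
print("alphabetic:", letters.decode("ascii"))
print("round trip ok:", nibble_decode(letters) == payload)
```

With a base of A, the alphabetic transform emits only the letters A through P, at the cost of doubling the payload size. The XOR encoding keeps the size unchanged but guarantees only what the chosen key was tested for.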

Shellcode encodings have additional benefits for the attackers, in that they make analysis more difficult by hiding human-readable strings such as URLs or IP addresses. Also, they may help evade network IDSs.
