In order to execute, the shellcode binary must be located somewhere in the program’s address space when it is triggered. When paired with an exploit, this means that the shellcode must be present before the exploit occurs or be passed along with the exploit. For example, if the program is performing some basic filtering on input data, the shellcode must pass this filter, or it will not be in the vulnerable process’s memory space. This means that shellcode often must look like legitimate data in order to be accepted by a vulnerable program.
One example is a program that uses the unsafe string functions strcpy and strcat, neither of which sets a maximum length on the data it writes. If a program reads or copies malicious data into a fixed-length buffer using either of these functions, the data can easily exceed the size of the buffer and overflow it. These functions treat strings as arrays of characters terminated by a NULL (0x00) byte. Shellcode that an attacker wants copied into this buffer must look like valid data, which means that it must not have any NULL bytes in the middle that would prematurely end the string-copy operation.
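The truncation effect can be sketched in a few lines of Python. This is only a simulation of how a C-style string copy stops at the first NULL byte, not the C library itself; the function name simulated_strcpy and the 8-byte sample input are hypothetical and chosen purely for illustration.

```python
def simulated_strcpy(src: bytes) -> bytes:
    """Return the bytes a C-style string copy would take from src.

    Copying stops at the first 0x00 byte (the terminator is not included here).
    If src has no 0x00 byte, this sketch returns all of src; a real strcpy
    would keep reading past the end of the buffer.
    """
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


# Hypothetical input: the two bytes 6A 00 (push 0) sit in the middle.
data = bytes.fromhex("90906a0051909090")
copied = simulated_strcpy(data)

print(len(data), len(copied))  # 8 3
```

Everything after the embedded 0x00 byte is dropped, so only three of the eight bytes would reach the destination buffer.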
Example 19-8 shows a small piece of disassembled code used to access the registry, with seven NULL bytes in this selection alone. This code typically could not be used as-is in a shellcode payload.
Example 19-8. Typical code with highlighted NULL bytes
57                   push    edi
50                   push    eax               ; phkResult
6A 01                push    1                 ; samDesired
8D 8B D0 13 00 00    lea     ecx, [ebx+13D0h]
6A 00                push    0                 ; ulOptions
51                   push    ecx               ; lpSubKey
68 02 00 00 80       push    80000002h         ; hKey: HKEY_LOCAL_MACHINE
FF 15 20 00 42 00    call    ds:RegOpenKeyExA
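As a quick check of the count, the opcode bytes of Example 19-8 can be scanned for 0x00 values. The snippet below is a minimal sketch; the variable names are illustrative, and the bytes are copied from the listing above.

```python
# Opcode bytes transcribed from Example 19-8.
EXAMPLE_19_8 = bytes.fromhex(
    "57"            # push edi
    "50"            # push eax
    "6A01"          # push 1
    "8D8BD0130000"  # lea  ecx, [ebx+13D0h]
    "6A00"          # push 0
    "51"            # push ecx
    "6802000080"    # push 80000002h
    "FF1520004200"  # call ds:RegOpenKeyExA
)

# Offsets of every NULL byte within the 24-byte selection.
null_offsets = [i for i, b in enumerate(EXAMPLE_19_8) if b == 0]

print(len(EXAMPLE_19_8), len(null_offsets))  # 24 7
print(null_offsets)  # [8, 9, 11, 15, 16, 21, 23]
```

The first NULL byte appears at offset 8, inside the lea instruction, so a string copy would stop there.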
Programs may perform additional sanity checks on data that the shellcode must pass in order to succeed, such as the following:
All bytes are printable ASCII bytes (0x20 through 0x7E, and in any case less than 0x80).
All bytes are alphanumeric (A through Z, a through z, or 0 through 9).
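The checks above can be sketched as simple predicates over a byte string. This is an illustrative approximation of what a vulnerable program's filter might test, not the filter of any particular program. The function names are hypothetical, and "printable" is assumed here to mean 0x20 through 0x7E, which is stricter than "less than 0x80".

```python
import string

# Byte values for A-Z, a-z, and 0-9.
ALNUM = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def has_no_nulls(data: bytes) -> bool:
    """True if data contains no 0x00 byte."""
    return 0 not in data


def is_printable_ascii(data: bytes) -> bool:
    """True if every byte is in the range 0x20..0x7E (assumed definition)."""
    return all(0x20 <= b <= 0x7E for b in data)


def is_alphanumeric(data: bytes) -> bool:
    """True if every byte is an ASCII letter or digit."""
    return all(b in ALNUM for b in data)


raw = bytes.fromhex("6a0051")  # push 0 / push ecx, from Example 19-8
print(has_no_nulls(raw), is_printable_ascii(raw), is_alphanumeric(raw))
# False False False

text = b"GKAAFB"
print(has_no_nulls(text), is_printable_ascii(text), is_alphanumeric(text))
# True True True
```

Raw instruction bytes fail all three checks, while a payload made up only of letters passes them.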
To overcome filtering limitations imposed by the vulnerable program, nearly all shellcode encodes the main payload so that it passes the program's filter and inserts a decoder that turns the encoded payload back into executable bytes. Only the small decoder section must be written carefully so that its instruction bytes pass the strict filter requirements; the rest of the payload can be encoded at compile time to also pass the filter. If the shellcode writes the decoded bytes back on top of the encoded bytes (as is usual), the shellcode is self-modifying. When the decoding is complete, the decoder transfers control to the main payload to execute.
The following are common encoding techniques:
XOR all payload bytes with a constant byte mask. Remember that for all values a and b of the same size, (a XOR b) XOR b == a.
Use an alphabetic transform in which each byte of the payload is split into two 4-bit nibbles, and each nibble is added to a printable ASCII character (such as A or a).
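Both techniques can be sketched in Python as byte transforms. This is a minimal illustration of the encodings only, written for analysts who need to recognize or reverse them; it is not a decoder stub, and all function names are hypothetical. The key search assumes that the only forbidden byte is 0x00 unless another set is passed in.

```python
from typing import Optional


def xor_encode(payload: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in payload)


def find_xor_key(payload: bytes, bad: bytes = b"\x00") -> Optional[int]:
    """Return the first key whose encoded output (and the key itself) avoids bad bytes."""
    for key in range(1, 256):
        if key in bad:
            continue
        if not any(b in bad for b in xor_encode(payload, key)):
            return key
    return None


def nibble_encode(payload: bytes, base: int = ord("A")) -> bytes:
    """Split each byte into two 4-bit nibbles and add each nibble to base."""
    out = bytearray()
    for b in payload:
        out.append(base + (b >> 4))    # high nibble
        out.append(base + (b & 0x0F))  # low nibble
    return bytes(out)


def nibble_decode(encoded: bytes, base: int = ord("A")) -> bytes:
    """Reverse nibble_encode: recombine each pair of characters into one byte."""
    if len(encoded) % 2:
        raise ValueError("encoded length must be even")
    return bytes(
        ((encoded[i] - base) << 4) | (encoded[i + 1] - base)
        for i in range(0, len(encoded), 2)
    )


payload = bytes.fromhex("6a0051")  # push 0 / push ecx, from Example 19-8
key = find_xor_key(payload)
print(key, xor_encode(payload, key).hex())  # 1 6b0150
print(nibble_encode(payload))               # b'GKAAFB'
```

With a base of A, the alphabetic transform emits only the letters A through P and doubles the payload size, while the XOR encoding keeps the size unchanged but only avoids the specific bytes that the key search rules out.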
Shellcode encodings have additional benefits for the attackers, in that they make analysis more difficult by hiding human-readable strings such as URLs or IP addresses. Also, they may help evade network IDSs.