Advanced disassemblers can analyze the instructions in a function to deduce the construction of its stack frame, which allows them to display the local variables and parameters relevant to the function. This information is extremely valuable to a malware analyst, as it allows for the analysis of a single function at a time, and enables the analyst to better understand its inputs, outputs, and construction.
However, analyzing a function to determine the construction of its stack frame is not an exact science. As with many other facets of disassembly, the algorithms used to determine the construction of the stack frame must make certain assumptions and guesses that are reasonable but can usually be exploited by a knowledgeable malware author.
Defeating stack-frame analysis will also prevent the operation of certain analytical techniques, most notably the Hex-Rays Decompiler plug-in for IDA Pro, which produces C-like pseudocode for a function.
Let’s begin by examining a function that has been armored to defeat stack-frame analysis.
Example 15-1. A function that defeats stack-frame analysis
00401543 sub_401543 proc near ; CODE XREF: sub_4012D0+3Cp
00401543 ; sub_401328+9Bp
00401543
00401543 arg_F4 = dword ptr 0F8h
00401543 arg_F8 = dword ptr 0FCh
00401543
00401543 000 sub esp, 8
00401546 008 sub esp, 4
00401549 00C cmp esp, 1000h
0040154F 00C jl short loc_401556
00401551 00C add esp, 4
00401554 008 jmp short loc_40155C
00401556 ; --------------------------------------------------------------
00401556
00401556 loc_401556: ; CODE XREF: sub_401543+Cj
00401556 00C add esp, 104h
0040155C
0040155C loc_40155C: ; CODE XREF: sub_401543+11j
0040155C -F8❶ mov [esp-0F8h+arg_F8], 1E61h
00401564 -F8 lea eax, [esp-0F8h+arg_F8]
00401568 -F8 mov [esp-0F8h+arg_F4], eax
0040156B -F8 mov edx, [esp-0F8h+arg_F4]
0040156E -F8 mov eax, [esp-0F8h+arg_F8]
00401572 -F8 inc eax
00401573 -F8 mov [edx], eax
00401575 -F8 mov eax, [esp-0F8h+arg_F4]
00401578 -F8 mov eax, [eax]
0040157A -F8 add esp, 8
0040157D -100 retn
0040157D sub_401543 endp ; sp-analysis failed
Stack-frame anti-analysis techniques depend heavily on the compiler used. Of course, if the malware is entirely written in assembly, then the author is free to use more unorthodox techniques. However, if the malware is crafted with a higher-level language such as C or C++, special care must be taken to output code that can be manipulated.
In Example 15-1, the column on the far left is the standard IDA Pro line prefix, which normally contains the segment name and memory address of each line (only the address is shown here). The next column to the right displays the stack pointer. For each instruction, the stack pointer column shows the value of the ESP register relative to where it was at the beginning of the function. This view shows that this function uses an ESP-based stack frame rather than the EBP-based frame that most functions use. (This stack pointer column can be enabled in IDA Pro through the Options menu.)
At ❶, the stack pointer begins to be shown as a negative number. This should never happen for an ordinary function because it means that this function could damage the calling function’s stack frame. In this listing, IDA Pro is also telling us that it thinks this function takes 62 arguments, of which it thinks 2 are actually being used.
Press CTRL-K in IDA Pro to examine this monstrous stack frame in detail. If you attempt to press Y to give this function a prototype, you’ll be presented with one of the most ghastly abominations of a function prototype you’ve ever seen.
As you may have guessed, this function doesn’t actually take 62 arguments. In reality, it takes no arguments and has two local variables. The code responsible for breaking IDA Pro’s analysis lies near the beginning of the function, between locations 00401546 and 0040155C. It’s a simple comparison with two branches.
The ESP register is being compared against the value 0x1000. If it is less than 0x1000, then it executes the code at 00401556; otherwise, it executes the code at 00401551. Each branch adds some value to ESP: 0x104 on the “less-than” branch and 4 on the “greater-than-or-equal-to” branch. From a disassembler’s perspective, there are two possible values of the stack pointer offset at this point, depending on which branch has been taken. The disassembler is forced to make a choice, and luckily for the malware author, it is tricked into making the wrong choice.
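To make the two candidate values concrete, the following minimal Python sketch walks both paths through the prologue of Example 15-1 and records the stack depth that each path delivers to loc_40155C. The instruction sizes and ESP effects are transcribed from the listing; the table layout and the name depths_at_join are hypothetical, and the sketch is an illustration only, not a description of how IDA Pro implements its analysis.

```python
# Depth = (ESP at function entry) - (current ESP), the quantity shown in
# IDA Pro's stack pointer column. "sub esp, N" raises it, "add esp, N" lowers it.
# Each entry: (size in bytes, depth change, branch target or None, falls through?)
INSNS = {
    0x401543: (3, +8, None, True),        # sub esp, 8
    0x401546: (3, +4, None, True),        # sub esp, 4
    0x401549: (6, 0, None, True),         # cmp esp, 1000h
    0x40154F: (2, 0, 0x401556, True),     # jl short loc_401556
    0x401551: (3, -4, None, True),        # add esp, 4
    0x401554: (2, 0, 0x40155C, False),    # jmp short loc_40155C
    0x401556: (6, -0x104, None, True),    # add esp, 104h
}
ENTRY = 0x401543
JOIN = 0x40155C          # loc_40155C, where both branches meet
EPILOGUE_DELTA = -8      # add esp, 8 at 0040157A, just before retn


def depths_at_join(entry=ENTRY, join=JOIN):
    """Map each predecessor of `join` to the depth it hands over."""
    found = {}
    work = [(entry, 0)]
    while work:
        addr, depth = work.pop()
        size, delta, target, falls_through = INSNS[addr]
        depth += delta
        successors = []
        if target is not None:
            successors.append(target)
        if falls_through:
            successors.append(addr + size)
        for nxt in successors:
            if nxt == join:
                found[addr] = depth
            else:
                work.append((nxt, depth))
    return found


for pred, depth in sorted(depths_at_join().items()):
    at_retn = depth + EPILOGUE_DELTA
    print(f"via {pred:08X}: depth at loc_40155C = {depth:+#x}, at retn = {at_retn:+#x}")
```

The path through 00401554 (the one that actually executes) arrives with a depth of 8 and returns with a balanced stack. The path through 00401556 arrives with -0xF8 and reaches retn at -0x100, which are the values shown in the stack pointer column of the listing.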
Earlier, we discussed conditional branch instructions that were not conditional at all because they exist where the condition is constant, such as a jz instruction immediately following an xor eax, eax instruction. Innovative disassembler authors could code special semantics in their algorithm to track such guaranteed flag states and detect the presence of such fake conditional branches. The code would be useful in many scenarios and would be very straightforward, though cumbersome, to implement.
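As a rough idea of what such tracking could look like, here is a minimal Python sketch that scans a straight-line instruction list and flags jz/jnz instructions whose outcome is fixed by an earlier xor reg, reg. The instruction representation, the FLAG_PRESERVING set, and the function name find_fake_conditionals are all hypothetical choices made for this illustration; a real implementation would also need to account for control flow arriving from other locations and for the full flag semantics of every instruction.

```python
from typing import List, Optional, Tuple

Insn = Tuple[str, Tuple[str, ...]]  # (mnemonic, operands)

# Instructions that leave the flags untouched (a deliberately small subset).
FLAG_PRESERVING = {"mov", "lea", "push", "pop", "nop"}
ZF_BRANCHES = {"jz", "je", "jnz", "jne"}


def find_fake_conditionals(insns: List[Insn]) -> List[int]:
    """Return indexes of ZF-based branches whose outcome is already known."""
    fake = []
    zf_known: Optional[bool] = None  # None means the state of ZF is unknown
    for i, (mnem, ops) in enumerate(insns):
        if mnem == "xor" and len(ops) == 2 and ops[0] == ops[1]:
            zf_known = True          # xor r, r always yields zero, so ZF = 1
        elif mnem in ZF_BRANCHES:
            if zf_known is not None:
                fake.append(i)       # the branch direction is predetermined
            # Jumps do not modify flags, so the tracked state is kept.
        elif mnem in FLAG_PRESERVING:
            pass                     # flags unchanged, keep tracking
        else:
            zf_known = None          # be conservative about everything else
    return fake


listing = [
    ("xor", ("eax", "eax")),
    ("jz", ("loc_1",)),      # always taken: ZF was just forced to 1
    ("cmp", ("eax", "ebx")),
    ("jz", ("loc_2",)),      # genuinely conditional
]
print(find_fake_conditionals(listing))
```

Treating every unrecognized instruction as one that destroys the tracked state keeps the sketch conservative: it may miss some fake branches, but it should not flag a genuine one within a single straight-line run.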
In Example 15-1, the instruction cmp esp, 1000h will always produce a fixed result. An experienced malware analyst might recognize that the lowest memory page in a Windows process would not be used as a stack, and thus this comparison is virtually guaranteed to always result in the “greater-than-or-equal-to” branch being executed. The disassembly program doesn’t have this level of intuition. Its job is to show you the instructions. It’s not designed to evaluate every decision in the code against a set of real-world scenarios.
The crux of the problem is that the disassembler assumed that the add esp, 104h instruction was valid and relevant, and adjusted its interpretation of the stack accordingly. The add esp, 4 instruction in the greater-than-or-equal-to branch was there solely to readjust the stack after the sub esp, 4 instruction that came before the comparison. The net result at run time is that the ESP value will be identical to what it was prior to the beginning of the sequence at address 00401546.
To overcome minor adjustments to the stack frame (which occur occasionally due to the inherently fallible nature of stack-frame analysis), in IDA Pro, you can put the cursor on a particular line of disassembly and press ALT-K to enter an adjustment to the stack pointer. In many cases, such as in Example 15-1, it may prove more fruitful to patch the stack-frame manipulation instructions, as in the previous examples.
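As an illustration of the patching approach, the following Python sketch overwrites the stack-manipulation sequence between 00401546 and 0040155C with NOP (0x90) bytes in a standalone byte buffer. This is safe here only because the sequence has no net effect on ESP at run time. The machine-code bytes are an assumption: they were reconstructed by hand from the standard x86 encodings of the instructions in Example 15-1, not taken from the original sample, and the helper name nop_range is hypothetical.

```python
FUNC_START = 0x401543

# Assumed encodings for the first instructions of sub_401543.
CODE = bytes.fromhex(
    "83EC08"        # 00401543 sub esp, 8
    "83EC04"        # 00401546 sub esp, 4
    "81FC00100000"  # 00401549 cmp esp, 1000h
    "7C05"          # 0040154F jl short loc_401556
    "83C404"        # 00401551 add esp, 4
    "EB06"          # 00401554 jmp short loc_40155C
    "81C404010000"  # 00401556 add esp, 104h
)


def nop_range(code: bytes, base: int, start: int, end: int) -> bytes:
    """Return a copy of `code` with addresses [start, end) replaced by NOPs."""
    patched = bytearray(code)
    for addr in range(start, end):
        patched[addr - base] = 0x90
    return bytes(patched)


# Neutralize everything from "sub esp, 4" up to (not including) loc_40155C.
patched = nop_range(CODE, FUNC_START, 0x401546, 0x40155C)
print(patched.hex())
```

After the patch, only sub esp, 8 remains ahead of loc_40155C, so a disassembler that tracks the stack pointer linearly would see a depth of 8 on entry to the function body and a balanced stack at retn.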