Chapter 25. Disassembler/Debugger Integration

An integrated disassembler/debugger combination such as IDA should be a powerful tool for manipulating binaries and for seamlessly applying static and dynamic techniques as part of the reverse engineering process. This turns out to be true, provided that you understand the capabilities and limitations of each tool, both individually and in combination.

In this chapter we will discuss some important points concerning the manner in which the static side of IDA interacts with its dynamic side, and we will take a look at techniques that can be employed with IDA’s debugger in order to defeat certain anti-debugging (and anti-disassembly) techniques in the context of malware analysis. In that regard, it is important to remember that the goal in malware analysis is usually not to run the malware but to obtain a disassembly of sufficient quality to allow static analysis tools to take over. Recall from Chapter 21 that there are many techniques designed specifically to prevent disassemblers from performing properly. In the face of such anti-disassembly techniques, the debugger is simply one means to an end. By running an obfuscated program under debugger control, we will attempt to obtain a de-obfuscated version of the program, which we then prefer to analyze using the disassembler.

Background

Some background on debugger-assisted de-obfuscation may be useful before proceeding. It is well known that an obfuscated program must de-obfuscate itself before it can get down to its intended business. The following steps provide a basic and somewhat simplistic guide for dynamic de-obfuscation of binaries.

1. Open an obfuscated program with a debugger.
2. Search for and set a breakpoint on the end of the de-obfuscation routine.
3. Launch the program from the debugger and wait for your breakpoint to trigger.
4. Utilize the debugger’s memory-dumping features to capture the current state of the process to a file.
5. Terminate the process before it can do anything malicious.
6. Perform static analysis on the captured process image.

Most modern debuggers contain enough features to perform the tasks just mentioned. OllyDbg^[226] is a very popular Windows-only debugger often used for such work. Step 2 is not always as straightforward as it may sound. It may take a combination of tools, including spending some amount of time in a disassembler such as IDA, or a lot of single stepping before the end of the de-obfuscation algorithm can be properly identified. In many cases, the end of de-obfuscation is marked by a behavior rather than a specific instruction. One such behavior might be a large change in the instruction pointer value, indicating a jump to a location far from the de-obfuscation code. In the case of UPX-packed binaries, for example, all you need to do is observe that the instruction pointer holds a value that is less than the program’s entry point address to know that de-obfuscation is complete and the program has jumped to the newly de-obfuscated code. In generic terms, this process is called original entry point (OEP) recognition, the OEP being the address at which the program would have begun execution had it not been obfuscated.
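The two behaviors just described can be sketched as a simple check over a recorded sequence of instruction pointer values. The following is only an illustration under stated assumptions: the function name, the `max_step` threshold, and the idea of collecting a trace as a plain list of addresses are all hypothetical, and a real trace would also need to filter out calls into library code, which produce large jumps of their own.

```python
from typing import Iterable, Optional


def find_oep_candidate(ip_trace: Iterable[int],
                       entry_point: int,
                       max_step: int = 0x1000) -> Optional[int]:
    """Return the first traced address that looks like a jump to de-obfuscated code.

    ip_trace    -- instruction pointer values in execution order (hypothetical input)
    entry_point -- the entry point address of the obfuscated program
    max_step    -- assumed threshold for what counts as a "large" change
    """
    previous = None
    for ip in ip_trace:
        # UPX-style rule from the text: execution has moved below the entry point.
        if ip < entry_point:
            return ip
        # Generic rule: a large change in the instruction pointer value.
        if previous is not None and abs(ip - previous) > max_step:
            return ip
        previous = ip
    return None


if __name__ == "__main__":
    # Made-up addresses: a stub near 0x409000 that finally jumps to 0x401000.
    trace = [0x409000, 0x409001, 0x409006, 0x40915A, 0x401000]
    candidate = find_oep_candidate(trace, entry_point=0x409000)
    print(hex(candidate) if candidate is not None else "no candidate found")
```

An address returned by such a check is only a candidate. It still needs to be confirmed by inspecting the code at that location.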

Complicating matters, some modern obfuscators are capable of transforming an input executable into an equivalent byte code program, which is then executed on a custom virtual machine generated by the obfuscator.^[227] Executables protected with such virtualizing obfuscators cannot be analyzed with the traditional expectation of recovering the original binary or locating the original entry point. This is a result of the fact that the original x86 (or other processor) instructions are not embedded in the obfuscated binary and are therefore unavailable for recovery.

If you are not careful, step 3 can be a dangerous one. In any case, you should always think twice before you allow a piece of malware to run unhindered in the hope that you have set your breakpoints or breakpoint conditions properly. If the program manages to bypass your breakpoint(s), it may well proceed to execute malicious code before you know what has happened. For this reason, attempts to de-obfuscate malware under debugger control should always be conducted in a sandbox environment that you are not afraid to wipe clean in the event things go wrong.

Step 4 may require some effort, because debuggers usually support dumping individual ranges of memory but may not support dumping an entire process image. The OllyDump^[228] plug-in, by Gigapede, adds process-dumping capabilities to OllyDbg. Keep in mind that the image that gets dumped from memory contains content from a running process and does not necessarily reflect the original state of the binary at rest in a disk file. In malware analysis, however, the goal is generally to create not a working de-obfuscated executable file, but rather an image file that is correctly structured so that it can be loaded into a disassembler for further analysis.
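When a debugger can dump only individual memory ranges, one simple way to obtain a loadable image is to lay the ranges out in a single flat file in which each byte sits at its address minus a chosen base. The sketch below assumes the ranges have already been captured as (address, bytes) pairs. The function name and the flat layout are illustrative assumptions, not the behavior of OllyDump or any other tool, and the result has no valid executable headers.

```python
from typing import Iterable, Tuple


def build_flat_image(regions: Iterable[Tuple[int, bytes]], image_base: int) -> bytes:
    """Combine dumped memory regions into one flat, zero-padded image.

    Each region is an (address, data) pair. In the result, the byte that
    lived at `address` is found at offset `address - image_base`.
    """
    image = bytearray()
    for address, data in sorted(regions, key=lambda region: region[0]):
        offset = address - image_base
        if offset < 0:
            raise ValueError("region at %#x lies below the image base" % address)
        if offset < len(image):
            raise ValueError("region at %#x overlaps an earlier region" % address)
        # Fill any gap between regions with zero bytes.
        image.extend(b"\x00" * (offset - len(image)))
        image.extend(data)
    return bytes(image)


if __name__ == "__main__":
    # Made-up regions standing in for a header page and a code page.
    dumped = [(0x401000, b"\x90\x90\xc3"), (0x400000, b"MZ")]
    flat = build_flat_image(dumped, image_base=0x400000)
    print(len(flat))
```

A file built this way can be loaded into a disassembler as a raw binary at the chosen base address, which is often sufficient when the goal is analysis rather than execution.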

One of the trickiest parts of reconstructing a binary image from an obfuscated process is restoration of the program’s imported function table. As part of the obfuscation process, a program’s import table is often obfuscated as well. As a result, the de-obfuscation process must also take care of linking the newly de-obfuscated process to all of the shared libraries and functions the process requires in order to execute properly. The only trace of this process is usually a table of imported function addresses somewhere within the process’s memory image. When dumping a de-obfuscated process image to a file, steps are often taken to attempt to reconstruct a valid import table in the dumped process image. In order to do this, the headers of the dumped image need to be modified to point to a new import table structure that must properly reflect all of the shared library dependencies of the original de-obfuscated program. A popular tool for automating this process is the ImpREC^[229] (Import REConstruction) utility by MackT. As with process dumping, keep in mind that extracting a standalone executable may not be your primary goal in malware analysis, in which case reconstructing valid headers and a working import table is less important than knowing which functions have been resolved and where the addresses of those functions have been stored.
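If knowing which functions were resolved matters more than rebuilding a working import table, a scan of the dumped memory for pointer-sized values that match known exported function addresses is one way to recover that information. The sketch below assumes that a map from addresses to function names has already been gathered from the libraries loaded in the process. The function name, the map, and the sample addresses are all hypothetical, and the scan is not how ImpREC works internally.

```python
import struct
from typing import Dict


def find_resolved_imports(dump: bytes,
                          dump_base: int,
                          exports: Dict[int, str],
                          ptr_size: int = 4) -> Dict[int, str]:
    """Map each memory location holding a known export address to that export's name.

    dump      -- raw bytes captured from the process
    dump_base -- address at which the first byte of `dump` resided
    exports   -- assumed map of exported function address -> name
    ptr_size  -- 4 for 32-bit processes, 8 for 64-bit processes
    """
    if ptr_size not in (4, 8):
        raise ValueError("ptr_size must be 4 or 8")
    fmt = "<I" if ptr_size == 4 else "<Q"  # little-endian pointers
    found = {}
    # Only pointer-aligned offsets are examined, a simplifying assumption.
    for offset in range(0, len(dump) - ptr_size + 1, ptr_size):
        (value,) = struct.unpack_from(fmt, dump, offset)
        name = exports.get(value)
        if name is not None:
            found[dump_base + offset] = name
    return found


if __name__ == "__main__":
    # Made-up export addresses and a made-up three-entry pointer table.
    known = {0x7C801D7B: "kernel32!LoadLibraryA",
             0x7C80AE40: "kernel32!GetProcAddress"}
    table = struct.pack("<III", 0x7C801D7B, 0, 0x7C80AE40)
    for slot, name in sorted(find_resolved_imports(table, 0x403000, known).items()):
        print(hex(slot), name)
```

The resulting map of locations to names can then be used to label the corresponding locations in the disassembly.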

^[226]See http://www.ollydbg.de/.

^[227]For a discussion of one such obfuscator, VMProtect, see “Unpacking Virtualization Obfuscators” by Rolf Rolles at http://www.usenix.org/event/woot09/tech/full_papers/rolles.pdf.

^[228]See http://www.woodmann.com/collaborative/tools/index.php/OllyDump.

^[229]See http://www.woodmann.com/collaborative/tools/index.php/ImpREC.
