Debug vs. Release Binaries

Microsoft’s Visual Studio projects can usually build either debug or release versions of program binaries. One way to see the differences is to compare the build options specified for the debug version of a project with those specified for the release version. Release versions are generally optimized,^[146] while debug versions are not. Debug versions are linked with additional symbol information and with debugging versions of the runtime library, while release versions are not. The debugging-related symbols allow debuggers to map assembly language statements back to their source code counterparts and to determine the names of local variables.^[147] Such information is typically lost during the compilation process. The debugging versions of Microsoft’s runtime libraries are themselves compiled with debugging symbols included, optimizations disabled, and additional safety checks enabled to verify that some function parameters are valid.

When disassembled using IDA, debug builds of Visual Studio projects look significantly different from release builds. This is a result of compiler and linker options specified only in debug builds, such as basic runtime checks (/RTCx^[148]), which introduce extra code into the resulting binary. A side effect of this extra code is that it defeats IDA’s startup signature-matching process, resulting in IDA’s frequent failure to automatically locate main in debug builds of binaries.

One of the first differences you may notice in a debug build of a binary is that virtually all functions are reached via jump functions (also known as thunk functions), as shown in the following code fragments:

.text:00411050 sub_411050      proc near               ; CODE XREF: start_0+3↓p
.text:00411050                 jmp     sub_412AE0
.text:00411050 sub_411050      endp
...
.text:0041110E start           proc near
.text:0041110E                 jmp     start_0
.text:0041110E start           endp
...
.text:00411920 start_0         proc near               ; CODE XREF: start↑j
.text:00411920                 push    ebp
.text:00411921                 mov     ebp, esp
.text:00411923                 call    sub_411050
.text:00411928                 call    sub_411940
.text:0041192D                 pop     ebp
.text:0041192E                 retn
.text:0041192E start_0         endp

In this example, the program entry point, start, does nothing other than jump to the actual startup function, start_0. The startup function, in turn, calls another function, sub_411050, which simply jumps to the actual implementation of that function at sub_412AE0. The two functions start and sub_411050, each of which contains nothing but a single jump statement, are called thunk functions. The heavy use of thunk functions in debug binaries is one of the obstacles to IDA’s signature-matching process. While the presence of thunk functions may briefly slow down your analysis, it is still possible to track down the main function of the binary using the techniques described in the previous section.
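When following thunks by hand, it can help to see how a jump's destination is derived from its bytes. The following is a minimal sketch, assuming each thunk is encoded as a 5-byte near jump (opcode E9 followed by a little-endian 32-bit displacement); the listing above does not show the raw bytes, so that encoding is an assumption, and resolve_jmp_thunk is a hypothetical helper name rather than anything provided by IDA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: if the bytes at `code`, loaded at virtual address
 * `va`, form a 5-byte near jump (E9 + little-endian rel32), store the jump
 * destination in *target and return 1. Otherwise return 0. */
static int resolve_jmp_thunk(const uint8_t *code, size_t len,
                             uint32_t va, uint32_t *target)
{
    assert(code != NULL && target != NULL);

    if (len < 5 || code[0] != 0xE9)
        return 0;               /* not an E9 rel32 jump */

    /* Assemble the displacement from its little-endian bytes. */
    uint32_t rel = (uint32_t)code[1]
                 | (uint32_t)code[2] << 8
                 | (uint32_t)code[3] << 16
                 | (uint32_t)code[4] << 24;

    /* The displacement is relative to the address of the next instruction,
     * which is va + 5. Unsigned arithmetic wraps modulo 2^32, so backward
     * jumps (negative displacements) are handled as well. */
    *target = va + 5 + rel;
    return 1;
}
```

Under that assumed encoding, the thunk at 00411050 would carry the displacement 412AE0h - 411055h = 1A8Bh, and the helper would report sub_412AE0 as its destination.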

The basic runtime checks in a debug build cause several additional operations to be performed upon entry to any function. An example of an extended prologue in a debug build is shown here:

.text:00411500                 push    ebp
.text:00411501                 mov     ebp, esp
.text:00411503                 sub     esp, 0F0h
.text:00411509                 push    ebx
.text:0041150A                 push    esi
.text:0041150B                 push    edi
.text:0041150C                 lea     edi, [ebp+var_F0]
.text:00411512                 mov     ecx, 3Ch
.text:00411517                 mov     eax, 0CCCCCCCCh
.text:0041151C                 rep stosd
.text:0041151E                 mov     [ebp+var_8], 0
.text:00411525                 mov     [ebp+var_14], 1
.text:0041152C                 mov     [ebp+var_20], 2
.text:00411533                 mov     [ebp+var_2C], 3

The function in this example uses four local variables that should require only 16 bytes of stack space. Instead we see that this function allocates 240 bytes (0F0h) of stack space and then proceeds to fill each of the 240 bytes with the value 0xCC. The four lines starting at 0041150C, which store the doubleword 0CCCCCCCCh 3Ch (60) times, equate to the following function call:

memset(&var_F0, 0xCC, 240);
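The equivalence can be checked with a small sketch that emulates the rep stosd instruction in C and compares the result against memset. The names rep_stosd and fill_matches_memset are hypothetical and used only for illustration; the constants are taken from the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    FRAME_BYTES = 0xF0,  /* from: sub esp, 0F0h */
    FILL_DWORDS = 0x3C   /* from: mov ecx, 3Ch  */
};

/* Emulates `rep stosd` with the direction flag clear: store eax at [edi]
 * ecx times, advancing edi by four bytes after each store. */
static void rep_stosd(uint8_t *edi, uint32_t eax, uint32_t ecx)
{
    while (ecx-- > 0) {
        memcpy(edi, &eax, sizeof eax);
        edi += sizeof eax;
    }
}

/* Returns 1 if the emulated prologue fill and memset produce the same bytes. */
static int fill_matches_memset(void)
{
    uint8_t by_stosd[FRAME_BYTES] = {0};
    uint8_t by_memset[FRAME_BYTES] = {0};

    /* 3Ch doublewords of four bytes each cover exactly 0F0h bytes. */
    assert(FILL_DWORDS * sizeof(uint32_t) == FRAME_BYTES);

    rep_stosd(by_stosd, 0xCCCCCCCCu, FILL_DWORDS);
    memset(by_memset, 0xCC, FRAME_BYTES);
    return memcmp(by_stosd, by_memset, FRAME_BYTES) == 0;
}
```

Because every byte of the stored doubleword is 0xCC, the comparison does not depend on the byte order of the machine running the sketch.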

The byte value 0xCC corresponds to the x86 opcode for int 3, which is a software interrupt that causes a program to trap to a debugger. The intent of filling the stack frame with an overabundance of 0xCC values may be to ensure that the debugger is invoked in the event that the program somehow attempts to execute instructions from the stack (an error condition that one would hope to catch in a debug build).

The function’s local variables are initialized beginning at 0041151E, where we note that the variables are not adjacent to one another. The intervening space will have been filled with the value 0xCC by the preceding memset operation. Providing extra space between variables in this manner can make it easier to detect overflows from one variable that may spill into and corrupt another variable. Under normal conditions, none of the 0xCC values used as filler, outside of any declared variables, should be overwritten. For comparison purposes, the release version of the same code is shown here:

.text:004018D0                 push    ebp
.text:004018D1                 mov     ebp, esp
.text:004018D3                 sub     esp, 10h
.text:004018D6                 mov     [ebp+var_4], 0
.text:004018DD                 mov     [ebp+var_C], 1
.text:004018E4                 mov     [ebp+var_8], 2
.text:004018EB                 mov     [ebp+var_10], 3

In the release version we see that only the required amount of space (10h, or 16 bytes) is requested for local variables and that all four local variables, var_4 through var_10, are adjacent to one another. Also note that the use of 0xCC as a filler value has been eliminated.
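The difference in layout can be quantified from the variable names alone, since IDA names a local var_N when it lives at [ebp-N]. The sketch below is illustrative only: it assumes each local is a 4-byte value (consistent with four variables needing 16 bytes), and filler_between is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { VAR_SIZE = 4 };  /* assumption: every local is a 4-byte value */

/* Distances below ebp taken from the var_N names in each listing,
 * sorted in ascending order. */
static const uint32_t debug_vars[]   = {0x08, 0x14, 0x20, 0x2C};
static const uint32_t release_vars[] = {0x04, 0x08, 0x0C, 0x10};

/* Sums the unused bytes lying between neighboring locals. `dist` must be
 * sorted in ascending order and the locals must not overlap. */
static uint32_t filler_between(const uint32_t *dist, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 1; i < count; i++) {
        assert(dist[i] >= dist[i - 1] + VAR_SIZE);
        total += dist[i] - dist[i - 1] - VAR_SIZE;
    }
    return total;
}
```

Applied to the two listings, the helper finds 8 bytes of filler between each pair of neighboring locals in the debug build (24 bytes in total) and none in the release build.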

^[146]Optimization generally involves elimination of redundancy in code or selection of faster, but potentially larger, sequences of code in order to satisfy a developer’s desire to create either faster or smaller executable files. Optimized code may not be as straightforward to analyze as nonoptimized code and may therefore be considered a bad choice for use during a program’s development and debugging phases.

^[147]gcc also offers the ability to insert debugging symbols during the compilation process.

^[148]See http://msdn.microsoft.com/en-us/library/8wtf2dfz.aspx.
