The IDA Pro Book, 2nd Edition by Chris Eagle. Published by No Starch Press, 2011

Chapter 2. Reversing and Disassembly Tools

With some disassembly background under our belts, and before we begin our dive into the specifics of IDA Pro, it will be useful to understand some of the other tools that are used for reverse engineering binaries. Many of these tools predate IDA and continue to be useful for quick glimpses into files as well as for double-checking the work that IDA does. As we will see, IDA rolls many of the capabilities of these tools into its user interface to provide a single, integrated environment for reverse engineering. Finally, although IDA does contain an integrated debugger, we will not cover debuggers here, as Chapters 24, 25, and 26 are dedicated to the topic.

Classification Tools

When first confronted with an unknown file, it is often useful to answer simple questions such as “What is this thing?” The first rule of thumb when attempting to answer that question is to never rely on a filename extension to determine what a file actually is. That is also the second, third, and fourth rule of thumb. Once you have become an adherent of the “file extensions are meaningless” line of thinking, you may wish to familiarize yourself with one or more of the following utilities.

file

The file command is a standard utility, included with most *NIX-style operating systems and with the Cygwin[4] or MinGW[5] tools for Windows. The file command attempts to identify a file’s type by examining specific fields within the file. In some cases file recognizes common strings such as #!/bin/sh (a shell script) or <html> (an HTML document). Files containing non-ASCII content present somewhat more of a challenge. In such cases, file attempts to determine whether the content appears to be structured according to a known file format. In many cases it searches for specific tag values (often referred to as magic numbers[6]) known to be unique to specific file types. The hex listings below show several examples of magic numbers used to identify some common file types.

Windows PE executable file
00000000   4D 5A 90 00  03 00 00 00  04 00 00 00  FF FF 00 00  MZ..............
00000010   B8 00 00 00  00 00 00 00  40 00 00 00  00 00 00 00  ........@.......

JPEG image file
00000000   FF D8 FF E0  00 10 4A 46  49 46 00 01  01 01 00 60  ......JFIF.....`
00000010   00 60 00 00  FF DB 00 43  00 0A 07 07  08 07 06 0A  .`.....C........

Java .class file
00000000   CA FE BA BE  00 00 00 32  00 98 0A 00  2E 00 3E 08  .......2......>.
00000010   00 3F 09 00  40 00 41 08  00 42 0A 00  43 00 44 0A  .?..@.A..B..C.D.

The file utility can identify a large number of file formats, including several types of ASCII text files and various executable and data file formats. The magic number checks performed by file are governed by rules contained in a magic file. The default magic file varies by operating system, but common locations include /usr/share/file/magic, /usr/share/misc/magic, and /etc/magic. Please refer to the documentation for file for more information concerning magic files.
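
The magic number idea can be sketched in a few lines of Python. This is only an illustration, not how file itself is implemented: the table below is a hypothetical, hand-written stand-in for a real magic file, and the names MAGIC_TABLE, identify_by_magic, and identify_file are invented for this example. The byte values are those from the hex listings above.

```python
# Hypothetical, minimal magic-number table. The real file utility applies
# far richer rules read from its magic file.
MAGIC_TABLE = [
    (b"\xCA\xFE\xBA\xBE", "compiled Java class data"),
    (b"\xFF\xD8\xFF", "JPEG image data"),
    (b"MZ", "MS-DOS executable"),
    (b"#!/bin/sh", "shell script"),
]


def identify_by_magic(data: bytes) -> str:
    """Return the description of the first matching prefix, or 'data'."""
    for magic, description in MAGIC_TABLE:
        if data.startswith(magic):
            return description
    return "data"


def identify_file(path: str) -> str:
    """Classify a file on disk using only its first few bytes."""
    with open(path, "rb") as f:
        return identify_by_magic(f.read(16))
```

Because only the leading bytes are examined, any file that happens to begin with one of these sequences receives the corresponding label, whatever its remaining content.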

In some cases, file can distinguish variations within a given file type. The following listing demonstrates file’s ability to identify not only several variations of ELF binaries but also information pertaining to how the binary was linked (statically or dynamically) and whether the binary was stripped or not.

idabook# file ch2_ex_*
ch2_ex.exe:                  MS-DOS executable PE  for MS Windows (console)
                             Intel 80386 32-bit
ch2_ex_upx.exe:              MS-DOS executable PE  for MS Windows (console)
                             Intel 80386 32-bit, UPX compressed
ch2_ex_freebsd:              ELF 32-bit LSB executable, Intel 80386,
                             version 1 (FreeBSD), for FreeBSD 5.4,
                             dynamically linked (uses shared libs),
                             FreeBSD-style, not stripped
ch2_ex_freebsd_static:       ELF 32-bit LSB executable, Intel 80386,
                             version 1 (FreeBSD), for FreeBSD 5.4,
                             statically linked, FreeBSD-style, not stripped
ch2_ex_freebsd_static_strip: ELF 32-bit LSB executable, Intel 80386,
                             version 1 (FreeBSD), for FreeBSD 5.4,
                             statically linked, FreeBSD-style, stripped
ch2_ex_linux:                ELF 32-bit LSB executable, Intel 80386,
                             version 1 (SYSV), for GNU/Linux 2.6.9,
                             dynamically linked (uses shared libs),
                             not stripped
ch2_ex_linux_static:         ELF 32-bit LSB executable, Intel 80386,
                             version 1 (SYSV), for GNU/Linux 2.6.9,
                             statically linked, not stripped
ch2_ex_linux_static_strip:   ELF 32-bit LSB executable, Intel 80386,
                             version 1 (SYSV), for GNU/Linux 2.6.9,
                             statically linked, stripped
ch2_ex_linux_stripped:       ELF 32-bit LSB executable, Intel 80386,
                             version 1 (SYSV), for GNU/Linux 2.6.9,
                             dynamically linked (uses shared libs), stripped
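
Much of the ELF detail in the listing above can be recovered from the first 20 bytes of the file. The following Python sketch decodes the ELF identification bytes and the type and machine fields. It is an illustration under stated assumptions, not file's actual logic: the function name describe_elf and the lookup tables are invented for this example, the tables cover only a handful of values, and the output format merely imitates file's wording. Linkage and stripped status depend on program and section headers, which this sketch does not parse.

```python
import struct

# Small, incomplete lookup tables (assumed subset of the ELF specification).
ELF_CLASS = {1: "32-bit", 2: "64-bit"}
ELF_DATA = {1: "LSB", 2: "MSB"}
ELF_OSABI = {0: "SYSV", 3: "GNU/Linux", 9: "FreeBSD"}
ELF_TYPE = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core file"}
ELF_MACHINE = {3: "Intel 80386", 62: "x86-64"}


def describe_elf(header: bytes) -> str:
    """Describe an ELF file from the first 20 bytes of its header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident bytes 4..7: class, data encoding, version, OS ABI.
    ei_class, ei_data, ei_version, ei_osabi = header[4:8]
    # Multi-byte fields use the byte order named by the data encoding byte.
    endian = ">" if ei_data == 2 else "<"
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    cls = ELF_CLASS.get(ei_class, "unknown class")
    data = ELF_DATA.get(ei_data, "unknown encoding")
    etype = ELF_TYPE.get(e_type, "unknown type")
    machine = ELF_MACHINE.get(e_machine, "unknown machine")
    osabi = ELF_OSABI.get(ei_osabi, "unknown ABI")
    return f"ELF {cls} {data} {etype}, {machine}, version {ei_version} ({osabi})"
```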

file and similar utilities are not foolproof. It is quite possible for a file to be misidentified simply because it happens to bear the identifying marks of some file format. You can see this for yourself by using a hex editor to modify the first four bytes of any file to the Java magic number sequence: CA FE BA BE. The file utility will incorrectly identify the newly modified file as compiled Java class data. Similarly, a text file containing only the two characters MZ will be identified as an MS-DOS executable. A good approach to take in any reverse engineering effort is to never fully trust the output of any tool until you have correlated that output with several tools and manual analysis.
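
One way to correlate a magic number match with further evidence is to examine the fields that follow it. In a Java class file, the four magic bytes are followed by a two-byte minor version and a two-byte major version, both big-endian. The sketch below is a hypothetical sanity check: the function name check_java_class is invented, and the upper bound of 100 on the major version is an arbitrary assumption made for illustration (class file major versions begin at 45).

```python
import struct

JAVA_MAGIC = b"\xCA\xFE\xBA\xBE"


def check_java_class(data: bytes):
    """Return (plausible, detail) for data that may be a Java class file."""
    if len(data) < 8 or data[:4] != JAVA_MAGIC:
        return (False, "no Java class magic number")
    # The version fields follow the magic number, stored big-endian.
    minor, major = struct.unpack_from(">HH", data, 4)
    # Assumed plausibility window; 100 is an arbitrary illustrative ceiling.
    if 45 <= major <= 100:
        return (True, f"class file version {major}.{minor}")
    return (False, f"magic matches but version {major}.{minor} is implausible")
```

Applied to the class file bytes shown earlier (CA FE BA BE 00 00 00 32), the check reports version 50.0. A text file whose first four bytes were overwritten with the magic number would most likely fail the version test, which is exactly the kind of disagreement that should prompt a closer manual look.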

PE Tools

PE Tools[7] is a collection of tools useful for analyzing both running processes and executable files on Windows systems. Figure 2-1 shows the primary interface offered by PE Tools, which displays a list of active processes and provides access to all of the PE Tools utilities.

Figure 2-1. The PE Tools utility

From the process list, users can dump a process’s memory image to a file or utilize the PE Sniffer utility to determine what compiler was used to build the executable or whether the executable was processed by any known obfuscation utilities. The Tools menu offers similar options for analysis of disk files. Users can view a file’s PE header fields by using the embedded PE Editor utility, which also allows for easy modification of any header values. Modification of PE headers is often required when attempting to reconstruct a valid PE from an obfuscated version of that file.
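
To give a sense of what a PE header viewer reads, the following Python sketch locates the PE signature and decodes the 20-byte COFF file header that follows it. It does not reflect how PE Tools is implemented. The function name read_pe_file_header, the dictionary keys, and the two-entry machine table are all invented for this example.

```python
import struct

# Assumed, incomplete table of machine type values.
PE_MACHINES = {0x014C: "Intel 386", 0x8664: "x64"}


def read_pe_file_header(data: bytes) -> dict:
    """Decode the COFF file header of a PE image held in memory."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    if len(data) < e_lfanew + 24:
        raise ValueError("truncated file header")
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    (machine, sections, timestamp, _symptr, _nsyms,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    return {
        "machine": PE_MACHINES.get(machine, hex(machine)),
        "sections": sections,
        "timestamp": timestamp,
        "optional_header_size": opt_size,
        "characteristics": characteristics,
    }
```

A two-character text file containing only MZ is rejected by this function, because the data is too short to hold a DOS header, let alone a PE signature.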

PEiD

PEiD[8] is another Windows tool whose primary purposes are to identify the compiler used to build a particular Windows PE binary and to identify any tools used to obfuscate a Windows PE binary. Figure 2-2 shows the use of PEiD to identify the tool (ASPack in this case) used to obfuscate a variant of the Gaobot[9] worm.

Figure 2-2. The PEiD utility

Many additional capabilities of PEiD overlap those of PE Tools, including the ability to summarize PE file headers, collect information on running processes, and perform basic disassembly.



[6] A magic number is a special tag value required by some file format specifications whose presence indicates conformance to such specifications. In some cases humorous reasons surround the selection of magic numbers. The MZ tag in MS-DOS executable file headers represents the initials of Mark Zbikowski, one of the original architects of MS-DOS, while the hex value 0xcafebabe, the well-known magic number associated with Java .class files, was chosen because it is an easily remembered sequence of hex digits.