Table of Contents for
Practical Malware Analysis

Version ebook / Retour

Cover image for bash Cookbook, 2nd Edition Practical Malware Analysis by Andrew Honig Published by No Starch Press, 2012
  1. Cover
  2. Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software
  3. Praise for Practical Malware Analysis
  4. Warning
  5. About the Authors
  6. About the Technical Reviewer
  7. About the Contributing Authors
  8. Foreword
  9. Acknowledgments
  10. Individual Thanks
  11. Introduction
  12. What Is Malware Analysis?
  13. Prerequisites
  14. Practical, Hands-On Learning
  15. What’s in the Book?
  16. 0. Malware Analysis Primer
  17. The Goals of Malware Analysis
  18. Malware Analysis Techniques
  19. Types of Malware
  20. General Rules for Malware Analysis
  21. I. Basic Analysis
  22. 1. Basic Static Techniques
  23. Antivirus Scanning: A Useful First Step
  24. Hashing: A Fingerprint for Malware
  25. Finding Strings
  26. Packed and Obfuscated Malware
  27. Portable Executable File Format
  28. Linked Libraries and Functions
  29. Static Analysis in Practice
  30. The PE File Headers and Sections
  31. Conclusion
  32. Labs
  33. 2. Malware Analysis in Virtual Machines
  34. The Structure of a Virtual Machine
  35. Creating Your Malware Analysis Machine
  36. Using Your Malware Analysis Machine
  37. The Risks of Using VMware for Malware Analysis
  38. Record/Replay: Running Your Computer in Reverse
  39. Conclusion
  40. 3. Basic Dynamic Analysis
  41. Sandboxes: The Quick-and-Dirty Approach
  42. Running Malware
  43. Monitoring with Process Monitor
  44. Viewing Processes with Process Explorer
  45. Comparing Registry Snapshots with Regshot
  46. Faking a Network
  47. Packet Sniffing with Wireshark
  48. Using INetSim
  49. Basic Dynamic Tools in Practice
  50. Conclusion
  51. Labs
  52. II. Advanced Static Analysis
  53. 4. A Crash Course in x86 Disassembly
  54. Levels of Abstraction
  55. Reverse-Engineering
  56. The x86 Architecture
  57. Conclusion
  58. 5. IDA Pro
  59. Loading an Executable
  60. The IDA Pro Interface
  61. Using Cross-References
  62. Analyzing Functions
  63. Using Graphing Options
  64. Enhancing Disassembly
  65. Extending IDA with Plug-ins
  66. Conclusion
  67. Labs
  68. 6. Recognizing C Code Constructs in Assembly
  69. Global vs. Local Variables
  70. Disassembling Arithmetic Operations
  71. Recognizing if Statements
  72. Recognizing Loops
  73. Understanding Function Call Conventions
  74. Analyzing switch Statements
  75. Disassembling Arrays
  76. Identifying Structs
  77. Analyzing Linked List Traversal
  78. Conclusion
  79. Labs
  80. 7. Analyzing Malicious Windows Programs
  81. The Windows API
  82. The Windows Registry
  83. Networking APIs
  84. Following Running Malware
  85. Kernel vs. User Mode
  86. The Native API
  87. Conclusion
  88. Labs
  89. III. Advanced Dynamic Analysis
  90. 8. Debugging
  91. Source-Level vs. Assembly-Level Debuggers
  92. Kernel vs. User-Mode Debugging
  93. Using a Debugger
  94. Exceptions
  95. Modifying Execution with a Debugger
  96. Modifying Program Execution in Practice
  97. Conclusion
  98. 9. OllyDbg
  99. Loading Malware
  100. The OllyDbg Interface
  101. Memory Map
  102. Viewing Threads and Stacks
  103. Executing Code
  104. Breakpoints
  105. Loading DLLs
  106. Tracing
  107. Exception Handling
  108. Patching
  109. Analyzing Shellcode
  110. Assistance Features
  111. Plug-ins
  112. Scriptable Debugging
  113. Conclusion
  114. Labs
  115. 10. Kernel Debugging with WinDbg
  116. Drivers and Kernel Code
  117. Setting Up Kernel Debugging
  118. Using WinDbg
  119. Microsoft Symbols
  120. Kernel Debugging in Practice
  121. Rootkits
  122. Loading Drivers
  123. Kernel Issues for Windows Vista, Windows 7, and x64 Versions
  124. Conclusion
  125. Labs
  126. IV. Malware Functionality
  127. 11. Malware Behavior
  128. Downloaders and Launchers
  129. Backdoors
  130. Credential Stealers
  131. Persistence Mechanisms
  132. Privilege Escalation
  133. Covering Its Tracks—User-Mode Rootkits
  134. Conclusion
  135. Labs
  136. 12. Covert Malware Launching
  137. Launchers
  138. Process Injection
  139. Process Replacement
  140. Hook Injection
  141. Detours
  142. APC Injection
  143. Conclusion
  144. Labs
  145. 13. Data Encoding
  146. The Goal of Analyzing Encoding Algorithms
  147. Simple Ciphers
  148. Common Cryptographic Algorithms
  149. Custom Encoding
  150. Decoding
  151. Conclusion
  152. Labs
  153. 14. Malware-Focused Network Signatures
  154. Network Countermeasures
  155. Safely Investigate an Attacker Online
  156. Content-Based Network Countermeasures
  157. Combining Dynamic and Static Analysis Techniques
  158. Understanding the Attacker’s Perspective
  159. Conclusion
  160. Labs
  161. V. Anti-Reverse-Engineering
  162. 15. Anti-Disassembly
  163. Understanding Anti-Disassembly
  164. Defeating Disassembly Algorithms
  165. Anti-Disassembly Techniques
  166. Obscuring Flow Control
  167. Thwarting Stack-Frame Analysis
  168. Conclusion
  169. Labs
  170. 16. Anti-Debugging
  171. Windows Debugger Detection
  172. Identifying Debugger Behavior
  173. Interfering with Debugger Functionality
  174. Debugger Vulnerabilities
  175. Conclusion
  176. Labs
  177. 17. Anti-Virtual Machine Techniques
  178. VMware Artifacts
  179. Vulnerable Instructions
  180. Tweaking Settings
  181. Escaping the Virtual Machine
  182. Conclusion
  183. Labs
  184. 18. Packers and Unpacking
  185. Packer Anatomy
  186. Identifying Packed Programs
  187. Unpacking Options
  188. Automated Unpacking
  189. Manual Unpacking
  190. Tips and Tricks for Common Packers
  191. Analyzing Without Fully Unpacking
  192. Packed DLLs
  193. Conclusion
  194. Labs
  195. VI. Special Topics
  196. 19. Shellcode Analysis
  197. Loading Shellcode for Analysis
  198. Position-Independent Code
  199. Identifying Execution Location
  200. Manual Symbol Resolution
  201. A Full Hello World Example
  202. Shellcode Encodings
  203. NOP Sleds
  204. Finding Shellcode
  205. Conclusion
  206. Labs
  207. 20. C++ Analysis
  208. Object-Oriented Programming
  209. Virtual vs. Nonvirtual Functions
  210. Creating and Destroying Objects
  211. Conclusion
  212. Labs
  213. 21. 64-Bit Malware
  214. Why 64-Bit Malware?
  215. Differences in x64 Architecture
  216. Windows 32-Bit on Windows 64-Bit
  217. 64-Bit Hints at Malware Functionality
  218. Conclusion
  219. Labs
  220. A. Important Windows Functions
  221. B. Tools for Malware Analysis
  222. C. Solutions to Labs
  223. Lab 1-1 Solutions
  224. Lab 1-2 Solutions
  225. Lab 1-3 Solutions
  226. Lab 1-4 Solutions
  227. Lab 3-1 Solutions
  228. Lab 3-2 Solutions
  229. Lab 3-3 Solutions
  230. Lab 3-4 Solutions
  231. Lab 5-1 Solutions
  232. Lab 6-1 Solutions
  233. Lab 6-2 Solutions
  234. Lab 6-3 Solutions
  235. Lab 6-4 Solutions
  236. Lab 7-1 Solutions
  237. Lab 7-2 Solutions
  238. Lab 7-3 Solutions
  239. Lab 9-1 Solutions
  240. Lab 9-2 Solutions
  241. Lab 9-3 Solutions
  242. Lab 10-1 Solutions
  243. Lab 10-2 Solutions
  244. Lab 10-3 Solutions
  245. Lab 11-1 Solutions
  246. Lab 11-2 Solutions
  247. Lab 11-3 Solutions
  248. Lab 12-1 Solutions
  249. Lab 12-2 Solutions
  250. Lab 12-3 Solutions
  251. Lab 12-4 Solutions
  252. Lab 13-1 Solutions
  253. Lab 13-2 Solutions
  254. Lab 13-3 Solutions
  255. Lab 14-1 Solutions
  256. Lab 14-2 Solutions
  257. Lab 14-3 Solutions
  258. Lab 15-1 Solutions
  259. Lab 15-2 Solutions
  260. Lab 15-3 Solutions
  261. Lab 16-1 Solutions
  262. Lab 16-2 Solutions
  263. Lab 16-3 Solutions
  264. Lab 17-1 Solutions
  265. Lab 17-2 Solutions
  266. Lab 17-3 Solutions
  267. Lab 18-1 Solutions
  268. Lab 18-2 Solutions
  269. Lab 18-3 Solutions
  270. Lab 18-4 Solutions
  271. Lab 18-5 Solutions
  272. Lab 19-1 Solutions
  273. Lab 19-2 Solutions
  274. Lab 19-3 Solutions
  275. Lab 20-1 Solutions
  276. Lab 20-2 Solutions
  277. Lab 20-3 Solutions
  278. Lab 21-1 Solutions
  279. Lab 21-2 Solutions
  280. Index
  281. Index
  282. Index
  283. Index
  284. Index
  285. Index
  286. Index
  287. Index
  288. Index
  289. Index
  290. Index
  291. Index
  292. Index
  293. Index
  294. Index
  295. Index
  296. Index
  297. Index
  298. Index
  299. Index
  300. Index
  301. Index
  302. Index
  303. Index
  304. Index
  305. Index
  306. Index
  307. Updates
  308. About the Authors
  309. Copyright

Lab 6-2 Solutions

Short Answers

  1. The first subroutine at 0x401000 is the same as in Lab 6-1 Solutions. It’s an if statement that checks for an active Internet connection.

  2. printf is the subroutine located at 0x40117F.

  3. The second function called from main is located at 0x401040. It downloads the web page located at: http://www.practicalmalwareanalysis.com/cc.htm and parses an HTML comment from the beginning of the page.

  4. This subroutine uses a character array filled with data from the call to InternetReadFile. This array is compared one byte at a time to parse an HTML comment.

  5. There are two network-based indicators. The program uses the HTTP User-Agent Internet Explorer 7.5/pma and downloads the web page located at: http://www.practicalmalwareanalysis.com/cc.htm.

  6. First, the program checks for an active Internet connection. If none is found, the program terminates. Otherwise, the program attempts to download a web page using a unique User-Agent. This web page contains an embedded HTML comment starting with <!--. The next character is parsed from this comment and printed to the screen in the format “Success: Parsed command is X,” where X is the character parsed from the HTML comment. If successful, the program will sleep for 1 minute and then terminate.

Detailed Analysis

We begin by performing basic static analysis on the binary. We see several new strings of interest, as shown in Example C-1.

Example C-1. Interesting new strings contained in Lab 6-2 Solutions

Error 2.3: Fail to get command
Error 2.2: Fail to ReadFile
Error 2.1: Fail to OpenUrl
http://www.practicalmalwareanalysis.com/cc.htm
Internet Explorer 7.5/pma
Success: Parsed command is %c

The three error message strings that we see suggest that the program may open a web page and parse a command. We also notice a URL for an HTML web page, http://www.practicalmalwareanalysis.com/cc.htm. This domain can be used immediately as a network-based indicator.

These imports contain several new Windows API functions used for networking, as shown in Example C-2.

Example C-2. Interesting new import functions contained in Lab 6-2 Solutions

InternetReadFile
InternetCloseHandle
InternetOpenUrlA
InternetOpenA

All of these functions are part of WinINet, a simple API for using HTTP over a network. They work as follows:

  • InternetOpenA is used to initialize the use of the WinINet library, and it sets the User-Agent used for HTTP communication.

  • InternetOpenUrlA is used to open a handle to a location specified by a complete FTP or HTTP URL. (Programs use handles to access something that has been opened. We discuss handles in Chapter 7.)

  • InternetReadFile is used to read data from the handle opened by InternetOpenUrlA.

  • InternetCloseHandle is used to close the handles opened by these files.

Next, we perform dynamic analysis. We choose to listen on port 80 because WinINet often uses HTTP and we saw a URL in the strings. If we set up Netcat to listen on port 80 and redirect the DNS accordingly, we will see a DNS query for www.practicalmalwareanalysis.com, after which the program requests a web page from the URL, as shown in Example C-3. This tells us that this web page has some significance to the malware, but we won’t know what that is until we analyze the disassembly.

Example C-3. Netcat output when listening on port 80

C:\>nc -l -p 80

GET /cc.htm HTTP/1.1
User-Agent: Internet Explorer 7.5/pma
Host: www.practicalmalwareanalysis.com

Finally, we load the executable into IDA Pro. We begin our analysis with the main method since much of the other code is generated by the compiler. Looking at the disassembly for main, we notice that it calls the same method at 0x401000 that we saw in Lab 6-1 Solutions. However, two new calls (401040 and 40117F) in the main method were not in Lab 6-1 Solutions.

In the new call to 0x40117F, we notice that two parameters are pushed on the stack before the call. One parameter is the format string Success: Parsed command is %c, and the other is the byte returned from the previous call at 0x401148. Format characters such as %c and %d tell us that we’re looking at a format string. Therefore, we can deduce that printf is the subroutine located at 0x40117F, and we should rename it as such, so that it’s renamed everywhere it is referenced. The printf subroutine will print the string with the %c replaced by the other parameter pushed on the stack.

Next, we examine the new call to 0x401040. This function contains all of the WinINet API calls we discovered during the basic static analysis process. It first calls InternetOpen, which initializes the use of the WinINet library. Notice that Internet Explorer 7.5/pma is pushed on the stack, matching the User-Agent we noticed during dynamic analysis. The next call is to InternetOpenUrl, which opens the static web page pushed onto the stack as a parameter. This function caused the DNS request we saw during dynamic analysis.

Example C-4 shows the InternetOpenUrlA and the InternetReadFile calls.

Example C-4. InternetOpenUrlA and InternetReadFile calls

00401070     call    ds:InternetOpenUrlA
00401076     mov     [ebp+hFile], eax
00401079     cmp     [ebp+hFile], 0  
...
0040109D     lea     edx, [ebp+dwNumberOfBytesRead]
004010A0     push    edx  ; lpdwNumberOfBytesRead
004010A1     push    200h ;  dwNumberOfBytesToRead
004010A6     lea     eax, [ebp+Buffer ]
004010AC     push    eax           ; lpBuffer
004010AD     mov     ecx, [ebp+hFile]
004010B0     push    ecx           ; hFile
004010B1     call    ds:InternetReadFile
004010B7     mov     [ebp+var_4], eax
004010BA     cmp     [ebp+var_4], 0 
004010BE     jnz     short loc_4010E5

We can see that the return value from InternetOpenUrlA is moved into the local variable hFile and compared to 0 at . If it is 0, this function will be terminated; otherwise, the hFile variable will be passed to the next function, InternetReadFile. The hFile variable is a handle—a way to access something that has been opened. This handle is accessing a URL.

InternetReadFile is used to read the web page opened by InternetOpenUrlA. If we read the MSDN page on this API function, we can learn about the other parameters. The most important of these parameters is the second one, which IDA Pro has labels Buffer, as shown at . Buffer is an array of data, and in this case, we will be reading up to 0x200 bytes worth of data, as shown by the NumberOfBytesToRead parameter at . Since we know that this function is reading an HTML web page, we can think of Buffer as an array of characters.

Following the call to InternetReadFile, code at checks to see if the return value (EAX) is 0. If it is 0, the function closes the handles and terminates; if not, the code immediately following this line compares Buffer one character at a time, as shown in Example C-5. Notice that each time, the index into Buffer goes up by 1 before it is moved into a register, and then compared.

Example C-5. Buffer handling

004010E5     movsx   ecx, byte ptr [ebp+Buffer]
004010EC     cmp     ecx, 3Ch 
004010EF     jnz     short loc_40111D
004010F1     movsx   edx, byte ptr [ebp+Buffer+1] 
004010F8     cmp     edx, 21h
004010FB     jnz     short loc_40111D
004010FD     movsx   eax, byte ptr [ebp+Buffer+2]
00401104     cmp     eax, 2Dh
00401107     jnz     short loc_40111D
00401109     movsx   ecx, byte ptr [ebp+Buffer+3]
00401110     cmp     ecx, 2Dh
00401113     jnz     short loc_40111D
00401115     mov     al, [ebp+var_20C] 
0040111B     jmp     short loc_40112C

At , the cmp instruction checks to see if the first character is equal to 0x3C, which corresponds to the < symbol in ASCII. We can right-click on 3Ch, and IDA Pro will offer to change it to display <. In the same way, we can do this throughout the listing for 21h, 2Dh, and 2Dh. If we combine the characters, we will have the string <!--, which happens to be the start of a comment in HTML. (HTML comments are not displayed when viewing web pages in a browser, but you can see them by viewing the web page source.)

Notice at that Buffer+1 is moved into EDX before it is compared to 0x21 (! in ASCII). Therefore, we can assume that Buffer is an array of characters from the web page downloaded by InternetReadFile. Since Buffer points to the start of the web page, the four cmp instructions are used to check for an HTML comment immediately at the start of the web page. If all comparisons are successful, the web page starts with the embedded HTML comment, and the code at is executed. (Unfortunately, IDA Pro fails to realize that the local variable Buffer is of size 512 and has displayed a local variable named var_20C instead.)

We need to fix the stack of this function to display a 512-byte array in order for the Buffer array to be labeled properly throughout the function. We can do this by pressing CTRL-K anywhere within the function. For example, the left side of Figure C-19 shows the initial stack view. To fix the stack, we right-click on the first byte of Buffer and define an array 1 byte wide and 512 bytes large. The right side of the figure shows what the corrected stack should look like.

Manually adjusting the stack like this will cause the instruction numbered in Example C-5 to be displayed as [ebp+Buffer+4]. Therefore, if the first four characters (Buffer[0]-Buffer[3]) match <!--, the fifth character will be moved into AL and returned from this function.

Creating an array and fixing the stack

Figure C-19. Creating an array and fixing the stack

Returning to the main method, let’s analyze what happens after the 0x401040 function returns. If this function returns a nonzero value, the main method will print as “Success: Parsed command is X,” where X is the character parsed from the HTML comment, followed by a call to the Sleep function at 0x401173. Using MSDN, we learn that the Sleep function takes a single parameter containing the number of milliseconds to sleep. It pushes 0xEA60 on the stack, which corresponds to sleeping for one minute (60,000 milliseconds).

To summarize, this program checks for an active Internet connection, and then downloads a web page containing the string <!--, the start of a comment in HTML. An HTML comment will not be displayed in a web browser, but you can view it by looking at the HTML page source. This technique of hiding commands in HTML comments is used frequently by attackers to send commands to malware while having the malware appear as if it were going to a normal web page.