Learning Malware Analysis

Let's start by assigning a symbolic name to the address ebp-4; we will call it x. After replacing the memory address references with this name, we get the following code:

mov dword ptr [x], 1
cmp dword ptr [x], 0  ➊
jnz loc_40101C  ➋
mov eax, [x]  ➍
xor eax, 2
mov [x], eax
jmp loc_401025  ➌

loc_40101C:  
mov ecx, [x]  ➎
xor ecx, 3  
mov [x], ecx   ➏  

loc_401025:

In the preceding code, notice the cmp and jnz instructions at ➊ and ➋; together they form a conditional statement (jnz is the same as jne, jump if not equal). Now that we have identified the conditional statement, let's determine what type it is (if, if/else, if/else if/else, and so on); to do that, focus on the jumps. The conditional jump at ➋ goes to loc_40101C, and just before loc_40101C there is an unconditional jump at ➌ to loc_401025. From what we learned previously, this has the characteristics of an if/else statement. To be precise, the code from ➍ to ➌ is the if block and the code from ➎ to ➏ is the else block. Let's rename loc_40101C to else and loc_401025 to end for better readability:

mov dword ptr [x], 1  ➐
cmp dword ptr [x], 0  ➊
jnz else  ➋
mov eax, [x]  ➍
xor eax, 2
mov [x], eax  ➑
jmp end  ➌

else:
mov ecx, [x]  ➎
xor ecx, 3
mov [x], ecx  ➏
end:

In the preceding assembly code, x is assigned a value of 1 at ➐. The value of x is then compared with 0 (➊ and ➋): if it is equal to 0, x is XORed with 2 and the result is stored back in x (➍ to ➑); if it is not equal to 0, x is XORed with 3 and the result is stored back in x (➎ to ➏).
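To make the control flow traced above concrete, here is a minimal C sketch that mirrors the renamed listing one instruction at a time, using goto for the two jumps. The function name trace_listing, the parameter initial_x, and the label name else_branch are illustrative assumptions (else is a reserved word in C, and the listing itself hardcodes the value 1); the parameter is there only so that both paths can be exercised.

```c
/* Literal C rendering of the renamed assembly listing.
 * trace_listing and initial_x are hypothetical names; the listing stores 1. */
int trace_listing(int initial_x)
{
    int x, eax, ecx;

    x = initial_x;          /* mov dword ptr [x], 1 (parameterized here) */
    if (x != 0)             /* cmp dword ptr [x], 0 */
        goto else_branch;   /* jnz else             */
    eax = x;                /* mov eax, [x]         */
    eax ^= 2;               /* xor eax, 2           */
    x = eax;                /* mov [x], eax         */
    goto end;               /* jmp end              */

else_branch:
    ecx = x;                /* mov ecx, [x]         */
    ecx ^= 3;               /* xor ecx, 3           */
    x = ecx;                /* mov [x], ecx         */

end:
    return x;
}
```

With the value from the listing, trace_listing(1) takes the jump to the else path and returns 1 ^ 3, which is 2; the fall-through path is reached only when the initial value is 0.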

Reading assembly code is slightly tricky, so let's rewrite the preceding assembly code in a high-level language equivalent. We know that ➊ and ➋ form an if statement, which you can read as: the jump to else is taken if x is not equal to 0 (remember that jnz is an alias for jne).

If you recall how the C code was translated to assembly, the condition in the if statement was reversed in the assembly code. Since we are now going the other way, from assembly back to a high-level language, you need to reverse the condition again. To do that, ask yourself this question: at ➋, when will the jump not be taken? The jump will not be taken when x is equal to 0, so you can write the preceding code as pseudocode, as follows. Note that in the following code, the cmp and jnz instructions are translated to an if statement; also, note how the condition is reversed:

x = 1
if(x == 0)
{
  eax = x
  eax = eax ^ 2  ➒
  x = eax  ➒
} 
else {
  ecx = x
  ecx = ecx ^ 3  ➒
  x = ecx  ➒
}
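The condition reversal used above can be stated as a small C sketch. It assumes two hypothetical helper functions, jump_taken and if_condition, that exist only for illustration: the first models when the jnz at ➋ fires, and the second is the condition that belongs in the recovered if statement, which is simply its logical negation.

```c
/* Models "cmp dword ptr [x], 0" followed by "jnz else":
 * the jump to the else path fires when x is not zero. */
int jump_taken(int x)
{
    return x != 0;
}

/* The if block is the fall-through path (jump not taken),
 * so its condition is the negation of the jump condition. */
int if_condition(int x)
{
    return !jump_taken(x);  /* equivalent to x == 0 */
}
```

For any x, exactly one of the two functions returns 1, which is why the block that follows the jnz becomes the body of if(x == 0) and the jump target becomes the else body.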

Now that we have identified the conditional statement, let's replace each register on the right-hand side of the = operator (at ➒) with its corresponding value. After doing that, we get the following code:

x = 1
if(x == 0)
{
  eax = x  ➓
  eax = x ^ 2  ➓
  x = x ^ 2
} 
else {
  ecx = x  ➓
  ecx = x ^ 3  ➓
  x = x ^ 3
}

Removing all of the statements that have a register on the left-hand side of the = operator (at ➓), we get the following code:

x = 1;
if(x == 0)
{
  x = x ^ 2;
} 
else {
  x = x ^ 3;
}

If you are curious, the following is the original C program behind the disassembled output used in the disassembly challenge; compare it with the code we just derived (the only difference is the variable name, a instead of x). As you can see, we were able to reduce multiple lines of assembly code back to their high-level language equivalent, which is much easier to understand than the assembly code:

int a = 1;
if (a == 0) 
{
    a = a ^ 2;
}
else {
    a = a ^ 3;
}
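As a sanity check on the substitution and removal steps used earlier in this section, the following C sketch compares the pseudocode that still carries the register temporaries with the simplified form. The function names with_registers, simplified, and forms_agree are assumptions made for this illustration, and the input is a parameter rather than the fixed value 1 so that both branches are covered.

```c
/* Pseudocode with the register temporaries kept (hypothetical name). */
int with_registers(int x)
{
    int eax, ecx;

    if (x == 0) {
        eax = x;
        eax = eax ^ 2;
        x = eax;
    } else {
        ecx = x;
        ecx = ecx ^ 3;
        x = ecx;
    }
    return x;
}

/* Same logic after substituting the values and dropping the temporaries. */
int simplified(int x)
{
    if (x == 0) {
        x = x ^ 2;
    } else {
        x = x ^ 3;
    }
    return x;
}

/* Returns 1 if both forms produce the same result for every x in [lo, hi]. */
int forms_agree(int lo, int hi)
{
    for (int x = lo; x <= hi; x++) {
        if (with_registers(x) != simplified(x))
            return 0;
    }
    return 1;
}
```

A call such as forms_agree(-100, 100) returning 1 suggests that eliminating the registers did not change the behavior over that range; it is a spot check, not a proof.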

6.7 Disassembly Solution
