Encoding/decoding – XOR Deobfuscation

You will come across the XOR Boolean operation used either to zero a register, as in xor eax,eax, or as an elementary obfuscation device. The following simple C code lets you trace the XOR de-obfuscation of an ASCII string, first with a single static key and then with a dynamic key. It also includes a brute-forcing function with string matching to give you an idea of how XOR may be used by malware. The brute-forcer uses a static key in this sample; you can replace it or embellish it with the dynamic key using one line of code (try it). Use the Locals window in VC++ to check the variable values within the loop and function scopes:

#include "stdafx.h"
#include <conio.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <limits.h>   //UINT_MAX, used in bruteForcer

void dynaXor(char *p, int key){
  int l=strlen(p);
  for (int i =0; i< l; i++) {

    printf("%c",p[i]^key);
    key+=1;   //the key is incremented for every subsequent byte xor
  }
  printf("\n");

}

void xor(char *p, int key){
  int l=strlen(p);
  for (int i =0; i<l; i++) {

    printf("%c",p[i]^key);    //key is static
  }
  printf("\n");

}

void bruteForcer(char *p, char *matchString, int fourByteMode){

  int length = strlen(p);
  int matchLength = strlen(matchString);
  int exitFlag = 0;
  unsigned int xorLength = 256;   //default length of 1 byte xor

  if (fourByteMode == 1) { //increases the xor key range to (2^32-1)
    xorLength = UINT_MAX-1;
  }

  //i is unsigned so that it can cover the whole range in four byte mode
  for (unsigned int i = 0; i < xorLength; i++) {

    if (exitFlag == 1) {
      break;
    }
    int counter = 0;    //number of matchString characters matched so far
    int hitIndex = 0;   //offset in p where the current candidate match starts

/*
#pragma region conditional breakpoint emulation
//since we already know the sample key in code 0x22, which is stored in EAX (use the disassembly window and registers view in VC++ 2008 Express Edition as discussed in earlier chapters), you can set a conditional breakpoint using the int 3 assembly mnemonic. Uncomment for use and replace with key of your choice.

_asm{
  cmp eax,0x22
  jne normal
  int 3
  normal:
  nop

}
#pragma endregion
*/

    for (int j = 0; j < length; j++) {
      printf("%c", (int)(p[j]^i));

      //If the match string is empty, it just bruteforces all the values and
      //displays them in standard output.
      //Otherwise it looks for a continuous match of the match string, starting
      //at every first hit, and quits if a match is successfully found.
      //Note: this simple scan can miss a match that overlaps a failed partial match.

      if (matchLength > 0) {   //compare lengths, not pointers, to detect an empty string
        if ((int)(matchString[counter]) == (int)(p[j]^i)) {
          if (counter == 0) {
            hitIndex = j;
          }
          if (counter == (matchLength-1)) {   //the last character of matchString matched
            printf(" :  match is true at key 0x%x", i);

            exitFlag = 1; break;
          }
          counter++;

        } else {
          counter = 0;
        }
      }
    }
    printf("\n");

  }

}


int _tmain(int argc, _TCHAR* argv[])
{

  char * p1 = (char *)malloc(strlen("@MLHMWP")+1);   //+1 for the terminating NUL
  strcpy(p1,"@MLHMWP");         //pre-xored obfuscated string
    
  char * p2 = (char *)malloc(strlen("@LJOIRZ")+1);   //+1 for the terminating NUL
  strcpy(p2,"@LJOIRZ");
   
  printf("Xor de-obfuscation for %s with key 0x22: ",p1);  
  xor(p1,0x22);

  printf("Dynamic xor de-obfuscation for %s with key 0x22: ",p2);
  dynaXor(p2,0x22);

  bruteForcer(p1,"bonjour",0);  //use 1 as 3rd parameter for 4 byte xor


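  free(p1);   //release the buffers allocated above
  free(p2);

  getche();   //wait for a key press so the console window stays open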
  return 0;
}

Output:

[Figure: Encoding/decoding – XOR Deobfuscation]
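The listing above prints each decoded byte as it goes and is tied to VC++. As a complement, here is a minimal portable C sketch that decodes into a buffer and brute-forces either key mode. The names xor_decode, contains, and brute_force_key are hypothetical helpers introduced for illustration, the key is assumed to be a single byte, and a step of 1 is assumed to mirror the key+=1 of dynaXor:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XOR each of the len bytes of in with a one-byte key and store the result
 * in out, which must have room for len + 1 bytes.
 * step = 0 keeps the key static; step = 1 increments it after every byte. */
void xor_decode(const char *in, size_t len, unsigned char key,
                unsigned char step, char *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)((unsigned char)in[i] ^ key);
        key = (unsigned char)(key + step);   /* wraps around after 0xFF */
    }
    out[len] = '\0';
}

/* Return 1 if needle occurs anywhere in hay. memcmp is used instead of
 * strstr because a decoded buffer may contain embedded NUL bytes. */
int contains(const char *hay, size_t hay_len,
             const char *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return 0;
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return 1;
    }
    return 0;
}

/* Try all 256 starting keys and return the first one whose decoded text
 * contains needle, or -1 if no key produces it. */
int brute_force_key(const char *in, const char *needle, unsigned char step)
{
    char buf[256];
    size_t len = strlen(in);
    size_t needle_len = strlen(needle);

    assert(len < sizeof buf);
    for (int key = 0; key < 256; key++) {
        xor_decode(in, len, (unsigned char)key, step, buf);
        if (contains(buf, len, needle, needle_len))
            return key;
    }
    return -1;
}
```

With the sample strings from the listing, brute_force_key("@MLHMWP", "bonjour", 0) and brute_force_key("@LJOIRZ", "bonjour", 1) should both recover the key 0x22. Because XOR is its own inverse, the same xor_decode call also re-obfuscates the plain text.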

For malware research, XORSearch and XORStrings are two ready-made, open source command-line tools for XOR de-obfuscation of malware code and for detecting strings that are obfuscated in the static file image. They are available at http://blog.didierstevens.com/programs/xorsearch/.

Both tools also have additional modes for ROL, ROT, and SHIFT. You supply the mode type, the key, and the file to work on.
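To make two of these mode names concrete, the following C sketch shows a generic bit rotation of a byte (ROL) and a generic letter rotation (ROT). It illustrates the transformations only and is not taken from the tools' source; rol8 and rot_letter are hypothetical names, and the accepted ranges (1 to 7 bits, 1 to 25 letters) are assumptions:

```c
#include <assert.h>

/* Rotate an 8-bit value left by n bits; bits that fall off the top
 * re-enter at the bottom. n is assumed to be in the range 1..7. */
unsigned char rol8(unsigned char b, unsigned int n)
{
    assert(n >= 1 && n <= 7);
    return (unsigned char)((b << n) | (b >> (8 - n)));
}

/* Rotate an ASCII letter forward by n positions in the alphabet, wrapping
 * from 'z' to 'a' (and 'Z' to 'A'). Other bytes pass through unchanged.
 * n is assumed to be in the range 1..25. */
char rot_letter(char c, unsigned int n)
{
    assert(n >= 1 && n <= 25);
    if (c >= 'a' && c <= 'z')
        return (char)('a' + ((unsigned int)(c - 'a') + n) % 26);
    if (c >= 'A' && c <= 'Z')
        return (char)('A' + ((unsigned int)(c - 'A') + n) % 26);
    return c;
}
```

Like XOR, both transformations are reversible: rotating a byte left by n and then by 8 - n restores it, and rotating a letter by n and then by 26 - n does the same, which is why a brute-force search over the small key range is practical.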

To de-obfuscate the memory regions (code/data), you can:

  • Let the malware decrypt/deobfuscate itself inside a debugger, then halt the debugger and proceed with analysis thereafter. As mentioned earlier, the OllyDump plugin in OllyDbg or the Debugger | Take memory snapshot feature in IDA Pro will be very useful during dynamic analysis.
  • Utilize a scriptable disassembler/debugger to write customized scripts in its inbuilt language, such as IDC for IDA Pro and Python for Immunity Debugger. This method can be useful even when the malware cannot be executed, for example because it is corrupted or only partially unpacked.
  • Copy the regions from the debugger into a C character array and proceed with programmatic de-obfuscation of the regions by feeding the array into a loop with the decrypting logic implemented accordingly. In OllyDbg, you can use right-click Binary | Binary Copy to get the spaced textual representation of the hexadecimal codes/data.
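The last approach can be sketched in C as follows. This is a minimal illustration that assumes the copied region is a string of space-separated two-digit hexadecimal values and that the obfuscation is a single-byte static XOR; parse_hex_bytes and xor_region are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Parse space-separated hex pairs such as "40 4D 4C" into out.
 * Returns the number of bytes stored; parsing stops at the end of the
 * text, at the first token that is not hexadecimal, or when out is full. */
size_t parse_hex_bytes(const char *text, unsigned char *out, size_t cap)
{
    size_t n = 0;
    unsigned int value;
    int consumed;

    assert(text != NULL && out != NULL);
    /* %2x reads at most two hex digits (skipping leading whitespace);
     * %n records how many characters were consumed so far. */
    while (n < cap && sscanf(text, "%2x%n", &value, &consumed) == 1) {
        out[n++] = (unsigned char)value;
        text += consumed;
    }
    return n;
}

/* De-obfuscate a region in place with a static one-byte XOR key. */
void xor_region(unsigned char *buf, size_t len, unsigned char key)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key;
}
```

For example, parsing "40 4D 4C 48 4D 57 50" (the ASCII codes of the sample string @MLHMWP) and calling xor_region with the key 0x22 should leave the bytes of bonjour in the buffer. For a different scheme, only the loop body in xor_region needs to change.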