Several strings that look like error messages (Error sending Http post, Error sending Http get, Error reading response, and so on) tell us that this program will be using HTTP GET and POST requests. We also see URL paths (/srv.html, /put.html, and so on), which hint at the resources that this malware will attempt to request.
Several WS2_32 imports tell us that this program will be communicating over the network. An import of CreateProcess suggests that this program may launch another process.
The function called at 0x4036F0 does not take any parameters other than the string, but ECX contains the this pointer for the object. We know the object that contains the function is an exception object because that object is later used as a parameter to the CxxThrowException function. We can tell from the context that the function at 0x4036F0 initializes an exception object, which stores a string that describes what caused the exception.
The six entries of the switch table implement six different backdoor commands: NOOP, sleep, execute a program, download a file, upload a file, and survey the victim.
The program implements a backdoor that uses HTTP as the command channel and has the ability to launch programs, download or upload a file, and collect information about the victim machine.
When we look at the program’s strings, we see several that look like error messages, as shown in Example C-214.
Example C-214. Abbreviated listing of strings from Lab20-03.exe
Encoding Args Error
Beacon response Error
Caught exception during pollstatus: %s
Polling error
Arg parsing error
Error uploading file
Error downloading file
Error conducting machine survey
Create Process Failed
Failed to gather victim information
Config error
Caught exception in main: %s
Socket Connection Error
Host lookup failed.
Send Data Error
Error reading response
Error sending Http get
Error sending Http post
These error messages provide excellent insight into the program’s functionality. These messages tell us that the malware probably does the following:
Uses HTTP POST and GET commands
Sends a beacon to a remote machine
Polls a remote server for some reason (probably for commands to execute)
Uploads files
Downloads files
Creates additional processes
Conducts a machine survey
With just the information from these strings, we can guess that this program is a backdoor
that uses HTTP GET and POST
commands for command and control. It looks like the program supports uploading files, downloading
files, creating a new process, and surveying the victim’s computer.
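The string listing above is the kind of output a strings utility produces. As a rough sketch of the idea (not the tool used in this analysis), the following scans a byte buffer for runs of printable ASCII. The sample bytes are made up for illustration and only stand in for part of the binary.

```python
import re

def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]

# Hypothetical sample bytes; a real run would read the executable from disk.
sample = b"\x00\x01Config error\x00\xff\x10Error sending Http post\x00ab\x00"

for text in printable_strings(sample):
    print(text)
```

Short runs such as "ab" are dropped by the minimum length, which keeps most random byte sequences out of the listing.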
When we open the program in IDA Pro, we see that its main
method calls a function at 0x403BE0 and then returns. The function at 0x403BE0 contains the main
program flow, so we will call it main2. It starts by creating a
new object with the new operator and calling a function for the
new object with config.dat as an argument to the function, as
shown in Example C-215.
Example C-215. An object being created and used in main2
00403C03 push 30h
00403C05 mov [ebp+var_4], ebx
00403C08 ❶call ??2@YAPAXI@Z ; operator new(uint)
00403C0D ❷mov ecx, eax
00403C0F add esp, 4
00403C12 mov [ebp+var_14], ecx
00403C15 cmp ecx, ebx
00403C17 mov byte ptr [ebp+var_4], 1
00403C1B jz short loc_403C2B
00403C1D push offset FileName ; "config.dat"
00403C22 ❸call sub_401EE0
00403C27 mov esi, eax
IDA Pro labels the new operator at ❶, which returns a pointer to the new object in EAX. That pointer is moved into ECX at ❷, where it is used as the this pointer for the function call at ❸. This tells us that the function sub_401EE0 is a member function of the class of the object created at ❶. For now, we’ll call this object firstObject. Example C-216 shows how it’s used in sub_401EE0.
Example C-216. The first function being called on firstObject
00401EF7 ❶mov esi, ecx
00401EF9 push 194h
00401EFE ❷call ??2@YAPAXI@Z ; operator new(uint)
00401F03 add esp, 4
00401F06 mov [esp+14h+var_10], eax
00401F0A test eax, eax
00401F0C mov [esp+14h+var_4], 0
00401F14 jz short loc_401F24
00401F16 mov ecx, [esp+14h+arg_0]
00401F1A push ecx
00401F1B mov ecx, eax
00401F1D ❸call sub_403180
sub_401EE0 first stores the pointer to firstObject in ESI at ❶, and
then creates another new object at ❷, which we’ll
call secondObject. Then it calls a function of the secondObject at ❸. We need to
keep analyzing before we can determine the purpose of these objects, so we now look at sub_403180, as shown in Example C-217.
Example C-217. An exception being created and thrown
00403199 push offset FileName ; "config.dat"
0040319E mov dword ptr [esi], offset off_41015C
004031A4 mov byte ptr [esi+18Ch], 4Eh
004031AB ❶call ds:CreateFileA
004031B1 mov edi, eax
004031B3 cmp edi, 0FFFFFFFFh
004031B6 ❷jnz short loc_4031D5
004031B8 push offset aConfigError ; "Config error"
004031BD ❹lea ecx, [esp+0BCh+var_AC]
004031C1 ❸call sub_4036F0
004031C6 lea eax, [esp+0B8h+var_AC]
004031CA push offset unk_411560
004031CF ❺push eax
004031D0 call __CxxThrowException@8 ; _CxxThrowException(x,x)
Based on the call to CreateFileA with the
config.dat filename, we guess that this function reads the configuration file
from disk, and we rename it setupConfig. The code in Example C-217 tries to open the config.dat
file at ❶. If the file is opened successfully, a jump is
taken, and the remainder of the code in Example C-217 is
skipped, as shown at ❷. If the file is not opened
successfully, we see the string Config error passed as an
argument to the function at 0x4036F0 at ❸.
The function at 0x4036F0 takes the string as a parameter, but also uses ECX as the this pointer. A reference to the object used by the this pointer is stored on the stack at var_AC at ❹. We later see that object passed to the CxxThrowException function at ❺, which tells us that the function at 0x4036F0 is a member function of an exception object. Based on the context in which sub_4036F0 is called, we can assume that the function is initializing an exception with the string Config error.
It’s important to recognize this pattern, a function call that takes an error string as an argument followed by a call to CxxThrowException, because it appears throughout this program. Each time we see it, we can conclude that the function is initializing an exception, so we don’t need to waste time analyzing these functions.
If we continue analyzing the function at 0x403180, we realize that it reads data from the
configuration file config.dat and stores it in secondObject. We can now conclude that secondObject is
an object to store and read configuration information, and we rename it configObject.
Now we return to sub_401EE0 to see if we can better
determine how firstObject is used. After creating the configObject object, sub_401EE0 stores
a bunch of information in firstObject, as shown in Example C-218.
Example C-218. Data being stored in firstObject
00401F2A mov [esi], eax
00401F2C mov dword ptr [esi+10h], offset aIndex_html ; "/index.html"
00401F33 mov dword ptr [esi+14h], offset aInfo_html ; "/info.html"
00401F3A mov dword ptr [esi+18h], offset aResponse_html ; "/response.html"
00401F41 mov dword ptr [esi+1Ch], offset aGet_html ; "/get.html"
00401F48 mov dword ptr [esi+20h], offset aPut_html ; "/put.html"
00401F4F mov dword ptr [esi+24h], offset aSrv_html ; "/srv.html"
00401F56 mov dword ptr [esi+28h], 544F4349h
00401F5D mov dword ptr [esi+2Ch], 41534744h
00401F64 mov eax, esi
First, EAX, which holds a pointer to configObject, is stored at offset 0 of firstObject. Next, we see a series of hard-coded URL paths, then two hard-coded integers, and then the function returns a pointer to firstObject. We still can’t be completely sure what firstObject does, but it appears to store all of the program’s global data, so we’ll rename this object globalDataObject for now, until we can learn enough to give it a better name.
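When a listing contains hard-coded integers like the two stored at offsets 0x28 and 0x2C, it can be worth checking whether their bytes are printable. The sketch below packs each value in little-endian order, the way x86 stores it in memory, and decodes the result as ASCII. It only shows what the bytes look like; what the values mean is not established by this analysis.

```python
import struct

def dword_as_ascii(value: int) -> str:
    """Pack a 32-bit value little-endian and decode the four bytes as ASCII."""
    return struct.pack("<I", value).decode("ascii", errors="replace")

# The two immediates from Example C-218, keyed by their offset in the object.
for offset, value in ((0x28, 0x544F4349), (0x2C, 0x41534744)):
    print(hex(offset), dword_as_ascii(value))
```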
We have now finished analyzing the first function called by main2. We have determined that it loads the configuration information from a file and initializes an object that stores the global data for the program. We can now return to main2, the remainder of which is shown in Example C-219.
Example C-219. Beacon and poll commands in the main2 function
00403C2D ❶mov ecx, esi
00403C2F mov byte ptr [ebp+var_4], bl
00403C32 call sub_401F80
00403C37 mov edi, ds:Sleep
00403C3D loc_403C3D:
00403C3D mov eax, [esi]
00403C3F mov eax, [eax+190h]
00403C45 lea eax, [eax+eax*4]
00403C48 lea eax, [eax+eax*4]
00403C4B lea ecx, [eax+eax*4]
00403C4E shl ecx, 2
00403C51 push ecx ; dwMilliseconds
00403C52 call edi ; Sleep
00403C54 ❷mov ecx, esi
00403C56 call loc_402410
00403C5B inc ebx
00403C5C jmp short loc_403C3D
We see that this function calls sub_401F80 outside the loop, and then it calls the Sleep function and sub_402410 inside an infinite loop. From what we know about the program from the strings, we could guess that sub_401F80 sends a beacon to the remote machine and that sub_402410 polls the remote server. We’ll rename those functions maybe_beacon and maybe_poll. We see that maybe_beacon and maybe_poll are both passed our globalDataObject in ECX as the this pointer (at ❶ and ❷), which tells us that they are member functions of what we’ve called globalDataObject. Based on this realization, we’ll rename our object mainObject.
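The arithmetic that precedes the Sleep call in Example C-219 is a compiler idiom for multiplication. As a sketch, the following mirrors the three lea instructions and the shift, using 32-bit wraparound, to show that the value read from offset 0x190 of the configuration object is multiplied by 500 before being passed as dwMilliseconds. The input values are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit register wraparound

def sleep_argument(config_value: int) -> int:
    """Mirror the instructions between 0x403C45 and 0x403C4E."""
    eax = config_value & MASK32
    eax = (eax + eax * 4) & MASK32  # lea eax, [eax+eax*4] -> x5
    eax = (eax + eax * 4) & MASK32  # lea eax, [eax+eax*4] -> x25
    ecx = (eax + eax * 4) & MASK32  # lea ecx, [eax+eax*4] -> x125
    ecx = (ecx << 2) & MASK32       # shl ecx, 2           -> x500
    return ecx

# Hypothetical configuration values, not taken from config.dat.
print(sleep_argument(1))
print(sleep_argument(60))
```

Under this reading, the configured value is a polling interval in units of half a second.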
First, we’ll analyze maybe_beacon. We see that it
creates another new object and calls sub_403D50, as shown in
Example C-220.
Example C-220. First function call in the maybe_beacon function
00401FC8 mov ❶eax, [esi]
00401FCA mov ❷edx, [eax+144h]
00401FD0 add ❸eax, 104h
00401FD5 push edx ; hostshort
00401FD6 push eax ; char *
00401FD7 call sub_403D50
We see that IDA Pro has labeled some of the arguments to sub_403D50 because it knows they will be used as parameters to imported functions later.
The most telling of these is hostshort, which tells us that it
will be used as a parameter to the networking function htons. The
values for these parameters are retrieved from our mainObject,
which was stored in ESI.
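The hostshort label matters because htons converts a 16-bit port number from host byte order to network (big-endian) byte order. As a small illustration, using port 80 as a hypothetical value, the sketch below shows the two byte layouts with the struct module rather than calling htons itself, so the result does not depend on the machine running it.

```python
import struct

def to_network_order(port: int) -> bytes:
    """Return the two bytes of a port number in network (big-endian) order."""
    return struct.pack("!H", port)

def to_x86_order(port: int) -> bytes:
    """Return the two bytes of a port number as x86 stores them (little-endian)."""
    return struct.pack("<H", port)

port = 80  # hypothetical port, not taken from the configuration file
print(to_x86_order(port).hex(), "->", to_network_order(port).hex())
```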
We see that ESI is dereferenced at ❶ to obtain a pointer to configObject, which is stored at offset 0 in the mainObject. Next, the hostshort is retrieved from offset 0x144 of configObject at ❷, and 0x104 is added to the configObject pointer at ❸, so the char * argument points to data stored within configObject at offset 0x104. This level of indirection is common in C++ programs. In a C program, these values would be stored as global data with offsets that are labeled and tracked by IDA Pro, but in C++ they are stored as offsets into objects that are harder to track.
In order to determine the data that will be pushed onto the stack, we would need to go back to the function that initializes configObject to see what is stored at offsets 0x144 and 0x104. In practice, it’s often easier to use dynamic analysis to determine those values, but without access to the command-and-control server, you may need to go back to configObject.
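To make the indirection concrete, the sketch below replays the three instructions from Example C-220 against a small simulated memory. The addresses, the port value, and the hostname are all made up for illustration, and treating the char * as a hostname is an assumption, not something the listing establishes.

```python
import struct

CONFIG_BASE = 0x1000  # hypothetical address of configObject
MAIN_BASE = 0x2000    # hypothetical address of mainObject
memory = bytearray(0x3000)

def write_dword(addr: int, value: int) -> None:
    struct.pack_into("<I", memory, addr, value)

def read_dword(addr: int) -> int:
    return struct.unpack_from("<I", memory, addr)[0]

def read_cstring(addr: int) -> str:
    end = memory.index(b"\x00", addr)
    return memory[addr:end].decode("ascii")

# Made-up configuration contents at the two offsets used by the listing.
host = b"www.example.com\x00"
memory[CONFIG_BASE + 0x104:CONFIG_BASE + 0x104 + len(host)] = host
write_dword(CONFIG_BASE + 0x144, 80)
write_dword(MAIN_BASE, CONFIG_BASE)  # mainObject offset 0 -> configObject

def beacon_arguments(esi: int):
    eax = read_dword(esi)          # mov eax, [esi]      (pointer to configObject)
    edx = read_dword(eax + 0x144)  # mov edx, [eax+144h] (hostshort)
    eax = eax + 0x104              # add eax, 104h       (char * into configObject)
    return eax, edx

ptr, hostshort = beacon_arguments(MAIN_BASE)
print(hex(ptr), hostshort, read_cstring(ptr))
```

Note that the add at ❸ adjusts the pointer rather than loading a value, which is why the char * argument is an address inside configObject instead of a field read from it.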
Looking at sub_403D50, we see that it calls htons, socket, and connect to establish a connection to a remote socket. maybe_beacon then calls sub_402FF0,
which contains the code shown in Example C-221.
Example C-221. Beginning of the victim survey function
0040301C call ds:GetComputerNameA
00403022 test eax, eax
00403024 jnz short loc_403043
00403026 push offset aErrorConductin ; "Error conducting machine survey"
0040302B lea ecx, [esp+40h+var_1C]
0040302F call sub_403910
00403034 lea eax, [esp+3Ch+var_1C]
00403038 push offset unk_411150
0040303D push eax
0040303E call __CxxThrowException@8 ; _CxxThrowException(x,x)
We see from this code that the function is trying to obtain the computer’s hostname. If it fails to do so, it throws an exception with the error message “Error conducting machine survey.” This tells us that this function is conducting a survey of the victim’s machine.
The remainder of sub_402FF0 shows the malware gathering
additional victim information. We can now rename sub_402FF0 to
surveyVictim and move on.
Next, maybe_beacon calls sub_404ED0. From the error message, we can see that sub_404ED0 does an HTTP POST to the remote server. maybe_beacon then calls sub_404B10, which from the error messages we can see is checking the beacon response. Without going into too much detail, we can tell that maybe_beacon is, in fact, the beacon function and that it expects a specific beacon response in order for the program to continue running.
We return to main2 to check the maybe_poll (0x402410) function. We see that its first call is to sub_403D50, which we analyzed earlier and know initializes a connection to the
command-and-control server. The maybe_poll function then calls
sub_404CF0, which sends an HTTP GET in order to retrieve information from the remote server. It then calls sub_404B10, which retrieves the server’s response to the HTTP
GET request. We then see two blocks of code that raise an
exception if the response doesn’t meet certain formatting criteria.
Next, we come across a switch statement with six options,
as shown in Example C-222.
Example C-222. switch statement inside the maybe_poll function
0040251F mov al, [esi+4]
00402522 add eax, -61h ; switch 6 cases
00402525 cmp eax, 5
00402528 ja short loc_40257D ; default
0040252A jmp ds:off_4025C8[eax*4] ; switch jump
The value used for the switch decision is stored in [esi+4]. That value is loaded into AL (the low byte of EAX), and 0x61 is subtracted from it. If the result is greater than 5, the jump to the default case is taken and the switch table is not used. This ensures that the original value is between 0x61 and 0x66 (which represent the ASCII characters a through f). The result (the value minus 0x61) is then used as an index into the switch table.
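The range check in Example C-222 can be sketched as follows. The function mirrors the add, cmp, and ja instructions with 32-bit unsigned arithmetic, and assumes the upper bytes of EAX are zero when the command byte is loaded into AL. It returns None where the assembly would jump to the default case.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit register wraparound

def switch_index(command_byte: int):
    """Return the switch table index for a command byte, or None for default."""
    eax = (command_byte - 0x61) & MASK32  # add eax, -61h
    if eax > 5:                           # cmp eax, 5 / ja (unsigned compare)
        return None
    return eax                            # index used by jmp off_4025C8[eax*4]

for char in "afgA":
    print(repr(char), switch_index(ord(char)))
```

Because ja is an unsigned comparison, a byte below 0x61 wraps around to a very large value after the subtraction, so a single check rejects values on both sides of the valid range.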
IDA Pro has recognized and labeled the switch table.
We click off_4025C8, which takes us to the six
possible locations that we need to analyze. We’ll label these case_1 through case_6 and analyze them one at a
time:
case_1 calls the delete operator and then immediately
returns without actually doing anything. We’ll rename this case_doNothing.
case_2 calls atoi to parse a string into a number, and then calls the Sleep function before returning. We’ll rename it case_sleep.
case_3 does some string parsing, and then calls CreateProcess. We’ll rename it case_ExecuteCommand.
case_4 calls CreateFile
and writes the HTTP response received from the command-and-control server to disk. We’ll
rename it case_downloadFile.
case_5 also calls CreateFile, but it uploads the data from the file to the remote server using an HTTP
POST command. We’ll rename it case_uploadFile.
case_6 calls GetComputerName, GetUserName, GetVersionEx, and GetDefaultLCID, which together
perform a survey of the victim’s machine and send the results back to the command-and-control
server.
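Putting the switch table and the six cases together gives a simple command map. The sketch below assumes that case_1 through case_6 appear in switch table order, so that the characters a through f select them in sequence. The handler names are the labels chosen during this analysis, and case_6 keeps its placeholder label with a description added.

```python
# Labels from the analysis, listed in assumed switch table order.
HANDLERS = [
    "case_doNothing",
    "case_sleep",
    "case_ExecuteCommand",
    "case_downloadFile",
    "case_uploadFile",
    "case_6 (victim survey)",
]

def handler_for(command: str):
    """Return the label for a one-character command, or None if out of range."""
    index = ord(command) - 0x61  # same offset the switch subtracts
    if 0 <= index < len(HANDLERS):
        return HANDLERS[index]
    return None

for command in "abcdef":
    print(command, "->", handler_for(command))
```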
Overall, we have a backdoor program that reads a configuration file that determines the command-and-control server, sends a beacon to the command-and-control server, and implements several different functions based on the response from the command-and-control server.