Chapter 7. Walkthrough Four

The Protocol Problem

It’s not uncommon to be presented with an executable where the protocol is either partially unknown or completely unknown. As a reverse engineer, it’s your job to either figure out the protocol for compatibility or to check a program for any hidden features that may cause security problems. In this chapter we’ll cover tracking a protocol through a binary and recovering its message structure.

Protocol Structure

Most protocols are streams of discrete messages meant to be interpreted individually. There are exceptions to this rule. HTTP, for instance, dumps off a mostly unstructured request and then gives an unstructured reply. FTP uses a text-based control channel and establishes a separate TCP session for each file transferred. These are in the minority of protocols you’ll have to reverse.

If a reverse engineer doesn’t have access to either of the executables, the protocol can be reversed from only the raw bytes on the wire. If a reverse engineer only has access to the binary without the ability to run it, the protocol can eventually be extracted from the executables. In most cases, the engineer will have access to both the binaries and a working implementation. A hybrid approach is the fastest way to solve the problem. The bytes on the wire give the reverse engineer a quick picture of how the protocol is structured, but, in the end, any features not exercised by the client or server will have to be extracted from the binary. If the reverse engineer has the time, the binary always gives the most accurate view.

Framing and Reassembly

Every protocol needs to know where one message ends and the following message begins. This is commonly referred to as framing. Most protocols can survive getting out of synchronization with the other end of the connection. If the sender thought a message was 30 bytes long and the receiver interpreted the message as 20 bytes long, the receiver might try to interpret the remaining 10 bytes as a new message, resulting in a corrupt message and the connection being dropped.

TCP/IP doesn’t guarantee a message will remain together as it makes its way across the Internet. The message may be broken up into smaller pieces along the way. The mechanics are unimportant to the programmer. What is important is that a single call to a read( ) function may return a whole message or some piece of the message. It might also include more than one message. It’s up to the application to make sure it has read a full message into the buffer before continuing. I’d like to be able to say that most programs do this well. Unfortunately, most programs you’ll encounter in the wild do a very poor job of reassembling and then parsing the messages.

Most small or quickly written programs will assume that messages are transmitted whole across the Internet. Their basic block diagram looks something like Figure 7.1.

Basic Block Diagram of a Small Program

Figure 7.1. Basic Block Diagram of a Small Program

If a message is broken up en route, the program will either crash while parsing the message or reject the message as incomplete, depending on how the function was implemented. Programs written this way still work surprisingly well. When a message is fragmented, the message is returned to the program broken across two reads from the socket. Both the messages are discarded as invalid, but the next message (if it wasn’t also broken up) will be received and processed correctly by the system.

A popular instant messenger client is among the programs that implement this kind of loop. Spotting these programs is easy. Simply bring up a proxy between the client and the server and break the messages up in small pieces. If the messages are ignored, you have a program that assumes the messages will be delivered as a unit.

Larger commercial programs often buffer reads into a read buffer and then shift the read buffer down as full messages are received. Their basic block diagram looks like Figure 7.2.

Basic Block Diagram of a Large Commercial Program

Figure 7.2. Basic Block Diagram of a Large Commercial Program

This is a perfectly correct loop and easy to trace through the system. The receive buffer tends to be allocated as a global or on the heap, and if it’s properly implemented works in all cases.

Self Similarity

Protocols are self-similar, meaning that since operations are repeated over the life of the session, the bytes transferred over the wire also repeat. Nearly every protocol has a common protocol header. That header will appear in every message on the wire. At a minimum, the header will contain a length and a message type. The parser needs to know how big the current message is and how to parse it. The following list shows some of the fields often found in the base protocol header.

  • Magic Number

  • Sequence Number

  • Timestamp

  • Data or Section Lengths

  • Session ID

  • Number of Submessages

  • Error Code

  • Random Nonce

When starting to make sense of a protocol, it helps to have examples of the various packets in front of you for reference. I always print out the hex dumps so I can annotate them as I go. Below are the first few packets from the example used in this section. I’ve highlighted the packets going from the client to server in bold and the packets going from the server to the client in italic.

DE AD BE EF 00 18 01 00 76 B7 0B 5A 42 DD 54 B9 6B E8 1E 47 44 D9 67 C3

DE AD BE EF 00 18 02 00 C1 06 45 18 90 51 5D 71 44 46 D7 21 B6 4C 01 73

DE AD BE EF 00 18 01 00 45 EB 2E 08 42 68 22 72 60 E9 1B 32 16 FA 45 EB

DE AD BE EF 00 18 02 00 2E 08 42 68 22 72 60 E9 1B 32 16 FA 40 AB 55 AD

DE AD BE EF 00 18 01 00 64 D4 7F 88 7E E1 5A AA 21 46 49 3D E3 22 7E 1E

DE AD BE EF 00 18 02 00 79 18 D2 6C D7 3D C9 61 60 7B 02 00 DC 4F 40 59

DE AD BE EF 02 08 03 01 AF 59 3E 31 ED 45 FD 02 E3 26 A1 1B 08 12 19 05 16 82 59 26 3B 90 77

DE AD BE EF 02 08 03 00 77 0D 33 A4 03 19 4D F1 62 F5 1F B2 20 CB 37 82 25 87 46 0E 6E 8A 56

DE AD BE EF 02 08 04 00 77 0D 33 A4 03 19 4D F1 62 F5 1F B2 20 CB 37 82 25 87 46 0E 6E 8A 56

DE AD BE EF 02 08 04 00 AF 59 3E 31 ED 45 FD 02 E3 26 A1 1B 08 12 19 05 16 82 59 26 3B 90 77

DE AD BE EF 00 08 01 02 00 00 00 00

At this point, we have no idea how many different types of messages are in this dump and we have no idea how long the base header is. Looking at it column by column, we can start making some guesses. The first four bytes have the same value in every packet. Since they spell DEADBEEF, we can safely assume that they’re a magic number at the beginning of the protocol. The shortest packet in this dump is 12 bytes long. It’s safe to say that the base header is smaller than the smallest message observed. That leaves us with the start of a structure that looks like this:

struct base_header{
       int MagicNumber;   /*Always 0xDEADBEEF*/
       char unknown1;     /*00,02*/
       char unknown2;     /*08, 18*/
       char unknown3;     /*01, 02, 03, 04*/
       char unknown4;     /*00,01,02*/
       char unknown5;     /*Lots of possibilities*/
       char unknown6;     /*Lots of possibilities*/
       char unknown7;     /*Lots of possibilities*/
       char unknown8;     /*Lots of possibilities*/
}

The packets have to be read off the network and into the program. There are a limited number of API calls that accomplish the task. The program can directly make the system calls itself, but that’s only encountered in malware. The place(s) where the packet is read off the wire is always a good place to start your analysis. The following is a list of common API calls capable of reading traffic off a network socket.

  • read/write

  • recv/send

  • recvfrom,/sendto

  • WSARecv/WSASend

  • WSARecvFrom/WSASendTo

  • ioctl

  • ioctlsocket

  • WSARecvDisconnect/WSASendDisconnect

  • WSARecvEx/WSASendEx

  • recvmsg/sendmsg

  • WSARecvMsg/WSASendMsg

The import table of the executable only has a reference to WSARecv so that serves as in ideal starting point for the analysis of the protocol (see Figure 7.3). The WSARecv call reads the data into a buffer on the stack. It can read up to 0x4000 bytes in one read as shown in Figure 7.4.

Import Table of Executable Showing Reference to WSARecv

Figure 7.3. Import Table of Executable Showing Reference to WSARecv

WSARecv Call

Figure 7.4. WSARecv Call

Right after reading into a buffer on the stack, the program immediately checks to see if the number of bytes read is less than eight. If it’s fewer than eight bytes, it jumps to the exit function, as shown in Figure 7.5.

Exit Function

Figure 7.5. Exit Function

It then checks the first four bytes to see if they’re DEADBEEF. That matches the magic number from the dump (the first four bytes) nicely. The bytes read off the wire are passed through ntohl( ) before they’re compared. This means the wire protocol is big-endian. It will also help to determine the size of fields. The program will need to byte-swap all two- and four-byte fields before processing them, as shown in Figure 7.6.

Byte-swapping

Figure 7.6. Byte-swapping

If it passes the DEADBEEF test, it takes the next two bytes and checks to see if they’re less than eight. Packets with a small number are discarded. We now know that the two bytes following the magic number are a unit and they always have to be greater than eight. With the information obtained from these two basic blocks, the structure diagram can be updated. The number corresponds with the length of the packet, so for now I’ll label the two-byte field as the length field.

struct base_header{
       int MagicNumber;        /*Always 0xDEADBEEF*/
       unsigned short Len;     /*0018,0208*/ /*Greater than 8*/
       char unknown3;          /*01, 02, 03, 04*/
       char unknown4;          /*00,01,02*/
       char unknown5;          /*Lots of possibilities*/
       char unknown6;          /*Lots of possibilities*/
       char unknown7;          /*Lots of possibilities*/
       char unknown8;          /*Lots of possibilities*/
}

After the length check, the processing function pulls a single byte at offset 7 into a register and uses the and instruction against the result. Whenever a field pulled out of the protocol is passed to bitwise and or or operators, and the operand is a constant, it’s a good bet that the target is a bitfield of some sort. In this case, if the third bit is set, the process stops processing the packet. It’s safe to say that byte 7 is a bitfield and the constant 0x04 is invalid, as shown in Figure 7.7.

Use of and Operator

Figure 7.7. Use of and Operator

The updated struct looks something like:

struct  base_header{
        int MagicNumber;       /*Always 0xDEADBEEF*/
        unsigned short Len;    /*0018,0208*/ /*Greater than 8*/
        char unknown3;         /*01, 02, 03, 04*/
#define FLAG_NONE              0x00
#define FLAG_INVALID           0x04
        char Flags;            /*00,01,02*/
        char unknown5;         /*Lots of possibilities*/
        char unknown6;         /*Lots of possibilities*/
        char unknown7;         /*Lots of possibilities*/
        char unknown8;         /*Lots of possibilities*/
}

From the packet dump, we saw that two other flags were set. In most flags an empty bitfield means “do nothing,” so as a placeholder, zero is defined as FLAG_NONE. That leaves at least 0x01 and 0x02 as valid flags in the protocol.

Next up, the binary takes byte 6 and stores it into a stack variable. It compares the stack variable to three possible values: 0x01, 0x03, and 0x5C. In the dump, we’ve seen 0x01 through 0x04, but only 0x01 and 0x03 are packets destined for the server. It would seem that 0x02 and 0x04 are client→server constants while 0x01, 0x03, and 0x5C are server→client constants. All other values aren’t processed, as shown in Figure 7.8.

Other Values Are Not Processed

Figure 7.8. Other Values Are Not Processed

Depending on the value of unknown3, the binary will call one of three functions. The prototype for each of the called functions is the same. The first parameter is a pointer to the raw packet we sent on the wire. The second is the number of bytes read from the network, and the third is the socket handle the packet was read from, as shown in Figure 7.9.

Prototype for Each of Three Called Functions

Figure 7.9. Prototype for Each of Three Called Functions

Later in the tutorial, we’ll reverse each of the three functions. For now, we’ll finish out the logic of the current function. The final two comparisons are both against the flag field. The binary uses a bitwise and to check for the 0x01 bit. If the bit is set, the thread goes to sleep for 1000 ms, as shown in Figure 7.10.

If Bit Is Set, Thread Goes to Sleep for 1000 ms

Figure 7.10. If Bit Is Set, Thread Goes to Sleep for 1000 ms

The last check before the end of the processing loop is another bitwise and checking for the 0x02 bit in the bitfield. If the bit is set, the processing loop exits, as shown in Figure 7.11.

If Bit Is Set, Processing Loop Exits

Figure 7.11. If Bit Is Set, Processing Loop Exits

That completes the main parsing loop of the example server. Since we didn’t do anything with the other bytes in the struct, it can be safely assumed that they aren’t part of the base header in the protocol and those bytes belong to some deeper part of the protocol. The final structure appears in the code below.

struct  base_header{
        int MagicNumber;        /*Always 0xDEADBEEF*/
        unsigned short Len;     /*0018,0208*/ /*Greater than 8*/
#define PACKET_SERVER_01        0x01
#define PACKET_CLIENT_02        0x02
#define PACKET_SERVER_03        0x03
#define PACKET_CLIENT_04        0x04
#define PACKET_SERVER_5C        0x5C
        char PacketType;        /*01, 02, 03, 04*/
#define FLAG_NONE               0x00
#define FLAG_SLEEP              0x01
#define FLAG_PROCESS_AND_EXIT   0x02
#define FLAG_INVALID            0x04
        char Flags;             /*00,01,02*/
}

For this simple protocol, all we’re left with is to figure out what each of the three message types are for. The server processes messages of type 0x01, 0x03, and 0x5C. The 0x01 message is as good a place to start as any other, since we’re trying to reverse the entire protocol.

The function at 0x00401000 handles messages of type 0x01. Three parameters are passed to it: a pointer to the stack buffer containing the full message, the number of bytes read, and the socket it was read from, as shown in Figure 7.12.

Three Parameters Are Passed to the Function

Figure 7.12. Three Parameters Are Passed to the Function

The first thing you should notice is that the function never touches the buffer or the length fields passed into it. It jumps straight to constructing another message in a local stack buffer. It first zeros the buffer with memset, then byte-swaps the magic number (DEADBEEF) and stores it in the buffer. Then it byte-swaps and stores a 0x18 in the length field, a 0x02 in the type field, and a zero in the flags field. Then it fills the next 0x10 bytes with random values, as shown in Figure 7.13.

Fills Next 0x10 Bytes with Random Values

Figure 7.13. Fills Next 0x10 Bytes with Random Values

Finally it writes it out to the socket, as shown in Figure 7.14.

Writing It Out to Socket

Figure 7.14. Writing It Out to Socket

Basically, this generates a predetermined packet and returns it to the caller. The contents of the message received aren’t taken into account. We now know the bytes after the header in the calling packet are ignored and the bytes following the 0x02 message are just random values.

struct  base_header{
        int MagicNumber;         /*Always 0xDEADBEEF*/
        unsigned short Len;      /*0018,0208*/ /*Greater than 8*/
#define PACKET_GETRANDOM         0x01
#define PACKET_RANDOM            0x02
#define PACKET_SERVER_03         0x03
#define PACKET_CLIENT_04         0x04
#define PACKET_SERVER_5C         0x5C
        char PacketType;         /*01, 02, 03, 04*/
#define FLAG_NONE                0x00
#define FLAG_SLEEP               0x01
#define FLAG_PROCESS_AND_EXIT    0x02
#define FLAG_INVALID             0x04
        char Flags;              /*00,01,02*/
}
struct  packet_get_random{
        struct base_header BaseHeader;
        char Ignored[16];
}
struct  packet_random{
        struct base_header BaseHeader;
        char RandomValues[16];
}

The next message is processed by the function at 0x004010B0. It’s passed the same three parameters as the packet above: a pointer to the message buffer, the message buffer length, and the socket the message was read from.

The first thing it does is to make sure the message length is at least 520 bytes, as shown in Figure 7.15.

Ensure Message Length at Least 520 Bytes

Figure 7.15. Ensure Message Length at Least 520 Bytes

The entire processing of this packet is done in a single basic block, as shown in Figure 7.16.

Single Basic Block Processes Packet

Figure 7.16. Single Basic Block Processes Packet

The basic block zeros a stack buffer with memset. Then it byte-swaps the magic number (DEADBEEF) and stores it in the first four bytes of the buffer. It sets the byte-swapped length to 0x208, the type to 4 and the flags to zero. Then it copies 512 bytes from the incoming packet and puts it in the outgoing packet.

struct  base_header{
        int MagicNumber;           /*Always 0xDEADBEEF*/
        unsigned short Len;        /*0018,0208*/ /*Greater than 8*/
#define PACKET_GETRANDOM           0x01
#define PACKET_RANDOM              0x02
#define PACKET_ECHO                0x03
#define PACKET_ECHOREPLY           0x04
#define PACKET_SERVER_5C           0x5C
        char PacketType;           /*01, 02, 03, 04, 0x5C*/
#define FLAG_NONE                  0x00
#define FLAG_SLEEP                 0x01
#define FLAG_PROCESS_AND_EXIT      0x02
#define FLAG_INVALID               0x04
        char Flags;                /*00,01,02*/
}
/*must  be 520 bytes long or longer*/
struct  packet_get_echo{
        struct base_header BaseHeader;
        char RandomData[512];
}
struct  packet_echoreply{
        struct base_header BaseHeader;
        char EchoData[512];
}

The last message (type 0x5C) is handled by the function at 0x00401190. It’s passed the same three parameters as the other message handlers: a pointer to the message, the message length, and the socket the message came in on. The message is handled in a single basic block, as shown in Figure 7.17.

Single Basic Block That Handles Last Message

Figure 7.17. Single Basic Block That Handles Last Message

It looks like this message spawns a calculator. The final protocol description is given below.

struct  base_header{
        int MagicNumber;         /*Always 0xDEADBEEF*/
        unsigned short Len;      /*0018,0208*/ /*Greater than 8*/
#define PACKET_GETRANDOM         0x01
#define PACKET_RANDOM            0x02
#define PACKET_ECHO              0x03
#define PACKET_ECHOREPLY         0x04
#define PACKET_CALC              0x5C
        char PacketType;         /*01, 02, 03, 04, 0x5C*/
#define FLAG_NONE                0x00
#define FLAG_SLEEP               0x01
#define FLAG_PROCESS_AND_EXIT    0x02
#define FLAG_INVALID             0x04
        char Flags;              /*00,01,02*/
}
struct  packet_get_random{
        struct base_header BaseHeader;
        char Ignored[16];
}
struct  packet_random{
        struct base_header BaseHeader;
        char RandomValues[16];
}
/*must  be 520 bytes long or longer*/
struct  packet_get_echo{
        struct base_header BaseHeader;
        char RandomData[512];
}
struct  packet_echoreply{
        struct base_header BaseHeader;
        char EchoData[512];
}
struct  packet_calc{
        struct base_header BaseHeader;
}

The protocol description above should be enough to implement a client. We’re now also sure that this is all of the functionality of the server—there are no hidden uses the server can be put to.

Hit Marking

The example used in the first part of this section is a simple one. There are only a few functions and all of them directly deal with the protocol. Reverse engineering the entire program is feasible for the example, but it’s rare that a reverse engineer will have the luxury of reversing an entire program. It’s easy to find the functions to tear apart when a program’s call tree looks like Figure 7.18.

Easily Identified Functions in a Program Call Tree

Figure 7.18. Easily Identified Functions in a Program Call Tree

It’s much more difficult when the call tree looks like Figure 7.19.

More Difficult to Identify Functions in a Program Call Tree

Figure 7.19. More Difficult to Identify Functions in a Program Call Tree

The first question the reverse engineer must answer is which of the functions in this monstrous graph are used in processing messages and which ones are ignored. The most common method of picking which functions to reverse engineer is referred to as hit marking.

The basic concept behind hit marking is to record the path through the decision tree that each message takes. A breakpoint is set at the beginning of each function or basic block. When a breakpoint is reached, the program notes that the function was accessed and lets the program continue on to the next function. After the message has been fully processed, the list of breakpoints that were hit is sorted and duplicates are removed. This list is referred to as a hit list. By analyzing just the functions on the hit list, the reverse engineer should have an understanding of the structures that make up that particular message.

Figure 7.20 shows a short trace from IDA before duplicates are removed.

Short Trace from IDA

Figure 7.20. Short Trace from IDA

Unfortunately, IDA doesn’t have a good mechanism for building hit lists. A number of third-party plug-ins and applications can be purchased to make the process more useful and less painful, but we’re going to stick to pure IDA. The rest of this chapter will walk through developing a hit list.

The first thing you’ll need to build a hit list is a list of all the functions in the program. If you happen to have a way of removing functions that you know you’re not interested in, then by all means remove them. To obtain a list of functions, open your binary in IDA and open the function window (shift-f3). Right click and select copy (or hit ctrl-ins). IDA will copy the list of functions to the clipboard. Paste the list of functions into your favorite editor. You’ll get a list something like:

_IID_ISAXErrorHandler       .text 01001308 00000010 R . . . . . .
_IID_IXMLDOMDocument2       .text 01001318 00000019 R . . . . T .
_IID_ISchemaElement         .text 01001338 00000009 R . . . . . .
_IID_ISchemaAttribute       .text 01001348 00000009 R . . . . . .
_IID_ISchemaModelGroup      .text 01001358 0000000D R . . . . . .
_IID_ISchemaComplexType     .text 01001368 0000000D R . . . . . .
_IID_ISchemaType            .text 01001378 0000000D R . . . . . .
_IID_ISchemaItem            .text 01001388 00000009 R . . . . . .
_IID_ISAXAttributes         .text 010013A8 00000010 R . . . . . .

Use your text edit to remove everything but the function addresses. It should look something like:

01001308
01001318
01001338
01001348
01001358
01001368
01001378
01001388
010013A8

The IDA debugger has a built-in mechanism for tracing. It can be accessed from the Debug|tracing window or you can right-click on any instruction and add an execution trace. Theoretically, you could manually run through your entire function list and manually add a trace to each breakpoint. Four days later, your boss will probably fire you.

To efficiently add tracepoints to each function, we need to write a quick IDA plug-in. IDA plug-ins can be written in C, ruby, or python. Programming language holy wars aren’t a good thing, so I’m going to sidestep the issue and just write the plug-in in python because I wanted to (not because python is better). You can get the python plug-in from www.d-dome.net/idapython/. Just follow the instructions to install it.

What we want the plug-in to do is to automate setting all those hundreds of tracepoints. As a simple example, you can set a breakpoint at 0x004011E8 with just a few lines of python. Create a file with the following content:

#Set a breakpoint at 0x004011E8
from idautils import *
ea=ScreenEA()
ea=0x004011E8
add_bpt(ea, 1, 4)

Now run the file by hitting alt-9 and selecting it from the dialog box. (The file has to end in .py if you’re using the python bindings.) A breakpoint should now be set at 0x004011E8. You can check to see if it worked by looking at the breakpoints window in Figure 7.21. The hotkey for the breakpoints window is ctrl-alt-B.

Breakpoints Window

Figure 7.21. Breakpoints Window

The new breakpoint is set to break. What we want is for it to create a log entry every time the breakpoint is hit. In IDA terminology, that’s a trace instead of a break. Actually, tracing and breaking aren’t mutually exclusive. You can set it to stop execution (break) when it hits the breakpoint and log the event (trace) at the same time. Since we’re just looking for which functions are executed when processing a particular message, we want all the breakpoints set to trace. We have to remove the break flag and add the trace flag from the structure. The following code adds a breakpoint at 0x004011E8 and sets it to trace.

from idautils import *
ea=ScreenEA()
ea=0x004011E8
add_bpt(ea, 1, 4)
bp=bpt_t()
get_bpt(ea, bp)
bp.flags^=BPT_BRK
bp.flags|=BPT_TRACE

Now that we can set a tracepoint anywhere in IDA, with the list of function addresses from before, it’s a simple task to add a loop and set a tracepoint at the beginning of each function.

from idautils import *
funclist=[0x004011F2, 0x00401200, 0x00401206]   /*add all the other
addresses here*/
ea=ScreenEA()
for i in funclist:
      ea=i
      add_bpt(ea, 1, 4)
      bp=bpt_t()
      get_bpt(ea, bp)
      bp.flags^=BPT_BRK
      bp.flags|=BPT_TRACE

For those cases where it’s necessary to mark absolutely every function as a tracepoint, the IDA plug-in can be automated to find every function reference instead of doing it manually. In practice, you’ll probably have a number of functions you want to eliminate off the bat, so the example above will be used more than the fully automated one. Just for completeness, here’s a python snippet that automates setting a tracepoint at every function.

from idautils import *
# Loop through all the functions and add a breakpoint
for i in range(get_func_qty()):
    f=getn_func(i)
    print "Function %s at 0x%x" % (GetFunctionName(f.startEA), f.startEA)
    add_bpt(f.startEA, 1, 4)
#change all the breakpoints to trace-only
for i in range(get_bpt_qty()):
    b=bpt_t()
    getn_bpt(i, b)
    b.flags^=BPT_BRK
    b.flags|=BPT_TRACE
    update_bpt(b)

Now that there’s a tracepoint set at the beginning of each function, simply run the client program to generate a hit list.

Example Hitlist

Small examples don’t really get the point across, so a larger program is in order. I’m going to use Pidgin as an example (www.pidgin.im). Pidgin is a popular open-source chat program that supports most of the popular protocols. Since it’s open source, you can compare the disassembly against the binary if you get lost.

In this example, I’m going to generate a hit list against Pidgin v2.1.1 for Windows. Pidgin implements each protocol as a plug-in. I just picked the Yahoo! Instant Messenger plug-in at random. The protocol logic is implemented in libyahoo.dll. The call graph of the DLL is large and complex. Graphing it in IDA gives a blob that can’t be interpreted by humans, like the graph in Figure 7.22. (Or at least by humans that ever leave the house.)

“Blob” Graph from IDA

Figure 7.22. “Blob” Graph from IDA

In this example, I’m going to mark every function as a tracepoint with the script in the previous section. There are 549 functions in the DLL. For the first pass, I’m only interested in the functions used during initialization and login. The hit list should narrow the scope down considerably from the 549 functions to something more manageable.

First, create an account on Yahoo! and log onto it with Pidgin to make sure everything is up and working. Next open libyahoo.dll in IDA and set the Debugger→Process Options to run Pidgin when the debugger is launched. Finally, invoke the script from the previous section with alt-9 to mark every function as a tracepoint. Tell IDA to run and go get a cup of coffee. Running the executable under IDA’s debugger takes considerably longer than running the executable without the debugger. When the bar at the bottom of the window says Available, Pidgin has finished logging in, as shown in Figure 7.23.

Logging In Complete

Figure 7.23. Logging In Complete

Now we can take a look at the trace window and determine which functions were involved in connecting to Yahoo!. At most we’ll have to reverse engineer this set of functions to figure out the login portion of the protocol so it gives us a good upper bound on the amount of work ahead of us.

The trace window (Debugger|Tracing|Trace Window) lists the functions in the order they were called. The breakpoint list (Debugger|Breakpoints|Breakpoint List) is closer to a true hit list. It lists each breakpoint and a count of the number of times that breakpoint was invoked, as shown in Figure 7.24.

Invoked Breakpoints

Figure 7.24. Invoked Breakpoints

If we eliminate all the functions that didn’t get called at least once, we’re down to 133 functions. That cuts our search space down to 24% of the original. Many of the functions in the list are wrapper functions that simply call another function in the list. I like to start with the functions that are called often, but not hundreds of times. Just glancing at each of the functions, you should fairly quickly come to the function shown in Figure 7.25.

Frequently Called Functions

Figure 7.25. Frequently Called Functions

It’s always good to have dumps of the protocol in front of you for comparison. This is the first message my client sent to the server in hex.

59 4D 53 47 00 0F 00 00 13 00 57 00 00 00 00 00 00 00 00 31 C0 80 69 64 61 70 6C
75 67 69 6E 31 32 33 34 35 C0 80

The call to wpurple_read makes this function a good candidate to start reverse engineering. The function copies the data into another buffer, makes sure it’s longer than four bytes, and then drops into the main processing blocks. The first protocol processing block is shown in Figure 7.26.

First Protocol Processing Block

Figure 7.26. First Protocol Processing Block

It compares the first four bytes of the read buffer to the string “YMSG” and exits if it doesn’t match. The YMSG must be used as a magic number, as shown in Figure 7.27.

YMSG Magic Number

Figure 7.27. YMSG Magic Number

The next check looks like a length field. It pulls bytes 8 and 9 out of the packet, shifts byte 8 to the left, and adds them together. So we can guess that the length field is a big-endian short. That leaves our structure something like:

struct login_packet{
       char Magic[4];         /*Always YMSG*/
       char unknown1;
       char unknown2;
       char unknown3;
       char unknown4;
       unsigned short Len;    /*Big Endian*/
}

The process and number of function calls involved is greater in the Yahoo! Instant Messenger protocol than in the example protocol used earlier in the chapter, but extracting the protocol from the binary is the same. Input comes from a read and then is processed through a set of function calls. With a little work, you should be able to reverse the rest of the protocol.