ELF program headers describe the segments within a binary and are necessary for program loading. The kernel reads the segments at load time; they describe the layout of the executable on disk and how that layout should translate to memory. The program header table is located at the file offset stored in the e_phoff (program header table offset) member of the initial ELF header, as shown in the ElfN_Ehdr structure in display 1.7.
There are five common program header types that we will discuss here. Each program header describes one segment of an executable file (shared libraries included) and gives its type (that is, what kind of data or code the segment is reserved for). First, let's take a look at the Elf32_Phdr structure that makes up a program header entry in the program header table of a 32-bit ELF executable.
Here's the Elf32_Phdr struct:
typedef struct {
    uint32_t   p_type;   /* segment type */
    Elf32_Off  p_offset; /* segment offset in the file */
    Elf32_Addr p_vaddr;  /* segment virtual address */
    Elf32_Addr p_paddr;  /* segment physical address */
    uint32_t   p_filesz; /* size of segment in the file */
    uint32_t   p_memsz;  /* size of segment in memory */
    uint32_t   p_flags;  /* segment flags, i.e. execute|read|write */
    uint32_t   p_align;  /* segment alignment in memory */
} Elf32_Phdr;
An executable will always have at least one PT_LOAD type segment. This type of program header describes a loadable segment, which means that the segment is going to be loaded or mapped into memory.
For instance, an ELF executable with dynamic linking will generally contain two loadable segments (of type PT_LOAD): a text segment for program code and a data segment for global variables and dynamic linking information.
The preceding two segments are going to be mapped into memory and aligned in memory by the value stored in p_align. I recommend reading the ELF man pages in Linux to understand all of the members in a Phdr structure as they describe the layout of both the segments in the file as well as in memory.
Program headers are primarily there to describe the layout of a program for when it is executing and in memory. We will be utilizing Phdrs later in this chapter to demonstrate what they are and how to use them in reverse engineering software.
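To make the relationship between e_phoff and the Phdr table concrete, here is a minimal sketch that walks the program header table of a 32-bit ELF image already read into memory and counts its PT_LOAD entries. The function name count_load_segments is hypothetical, and the sketch assumes the file's byte order matches the host's and that the Linux <elf.h> header is available.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from any library): count the PT_LOAD entries in a
 * 32-bit ELF image held in memory. Assumes host byte order == file byte order.
 * Returns -1 if the buffer is not a 32-bit ELF file or the table is out of bounds. */
int count_load_segments(const unsigned char *image, size_t size)
{
    Elf32_Ehdr ehdr;

    if (size < sizeof(ehdr))
        return -1;
    memcpy(&ehdr, image, sizeof(ehdr));

    /* Check the magic bytes and the 32-bit class marker. */
    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;
    if (ehdr.e_phentsize != sizeof(Elf32_Phdr))
        return -1;

    /* The table starts at e_phoff and holds e_phnum entries. */
    size_t table_size = (size_t)ehdr.e_phnum * sizeof(Elf32_Phdr);
    if (ehdr.e_phoff > size || table_size > size - ehdr.e_phoff)
        return -1;

    int count = 0;
    for (size_t i = 0; i < ehdr.e_phnum; i++) {
        Elf32_Phdr phdr;
        /* memcpy avoids unaligned access into the raw buffer. */
        memcpy(&phdr, image + ehdr.e_phoff + i * sizeof(phdr), sizeof(phdr));
        if (phdr.p_type == PT_LOAD)
            count++;
    }
    return count;
}
```

A caller would typically mmap or read the file into a buffer first and pass the buffer and its length to this function.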
The text segment (also known as the code segment) will generally have segment permissions set as PF_X | PF_R (READ+EXECUTE).
The data segment will generally have segment permissions set to PF_W | PF_R (READ+WRITE).
A file infected with a polymorphic virus might have changed these permissions in some way such as modifying the text segment to be writable by adding the PF_W flag into the program header's segment flags (p_flags).
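The permission check described above can be sketched as a simple scan over the program headers. The helper name has_writable_code_segment is hypothetical, and this is only a heuristic: a loadable segment that is both writable and executable is suspicious, but it is not proof of infection on its own.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if any PT_LOAD segment is both executable
 * and writable (PF_X and PF_W set together in p_flags), otherwise 0. */
int has_writable_code_segment(const Elf32_Phdr *phdrs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (phdrs[i].p_type != PT_LOAD)
            continue; /* only loadable segments are mapped with these flags */
        if ((phdrs[i].p_flags & PF_X) && (phdrs[i].p_flags & PF_W))
            return 1;
    }
    return 0;
}
```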
The dynamic segment is specific to executables that are dynamically linked and contains information necessary for the dynamic linker. This segment contains tagged values and pointers, including but not limited to the following:
Following is a list of the tag names:
| Tag name | Description |
|---|---|
| DT_HASH | Address of symbol hash table |
| DT_STRTAB | Address of string table |
| DT_SYMTAB | Address of symbol table |
| DT_RELA | Address of Rela relocs table |
| DT_RELASZ | Size in bytes of Rela table |
| DT_RELAENT | Size in bytes of a Rela table entry |
| DT_STRSZ | Size in bytes of string table |
| DT_SYMENT | Size in bytes of a symbol table entry |
| DT_INIT | Address of the initialization function |
| DT_FINI | Address of the termination function |
| DT_SONAME | String table offset to name of shared object |
| DT_RPATH | String table offset to library search path |
| DT_SYMBOLIC | Alert linker to search this shared object before the executable for symbols |
| DT_REL | Address of Rel relocs table |
| DT_RELSZ | Size in bytes of Rel table |
| DT_RELENT | Size in bytes of a Rel table entry |
| DT_PLTREL | Type of reloc the PLT refers to (Rela or Rel) |
| DT_DEBUG | Undefined use for debugging |
| DT_TEXTREL | Absence of this indicates that no relocs should apply to a nonwritable segment |
| DT_JMPREL | Address of reloc entries solely for the PLT |
| DT_BIND_NOW | Instructs the dynamic linker to process all relocs before transferring control to the executable |
| DT_RUNPATH | String table offset to library search path |
The dynamic segment contains a series of structures that hold relevant dynamic linking information. The d_tag member controls the interpretation of d_un.
The 32-bit ELF dynamic struct:
typedef struct {
    Elf32_Sword d_tag;
    union {
        Elf32_Word d_val;
        Elf32_Addr d_ptr;
    } d_un;
} Elf32_Dyn;
extern Elf32_Dyn _DYNAMIC[];
We will explore more about dynamic linking later in this chapter.
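As an illustration of how d_tag drives the interpretation of d_un, the following sketch searches an array of Elf32_Dyn entries for a given tag. The helper name find_dyn_entry is hypothetical; the sketch relies on the convention from the ELF specification that the dynamic array ends with an entry whose tag is DT_NULL.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the first dynamic entry with the given tag,
 * or NULL if the DT_NULL terminator is reached first. The caller decides
 * whether to read d_un.d_val or d_un.d_ptr based on the tag. */
const Elf32_Dyn *find_dyn_entry(const Elf32_Dyn *dyn, Elf32_Sword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag)
            return dyn;
    }
    return NULL;
}
```

For example, an entry found with DT_STRSZ carries a size in d_un.d_val, whereas an entry found with DT_STRTAB carries an address in d_un.d_ptr.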
A segment of type PT_NOTE may contain auxiliary information that is pertinent to a specific vendor or system. Following is a definition of PT_NOTE from the formal ELF specification:
Sometimes a vendor or system builder needs to mark an object file with special information that other programs will check for conformance, compatibility, and so on. Sections of type SHT_NOTE and program header elements of type PT_NOTE can be used for this purpose. The note information in sections and program header elements holds any number of entries, each of which is an array of 4-byte words in the format of the target processor. Labels appear below to help explain note information organization, but they are not part of the specification.
A point of interest: because this segment is only used for OS specification information and is not actually necessary for an executable to run (the system will simply assume the executable is native either way), it becomes an interesting place for virus infection, although size constraints make it not necessarily the most practical approach. Some information on NOTE segment infections can be found at http://vxheavens.com/lib/vhe06.html.
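To show what "an array of 4-byte words" means in practice, here is a rough sketch that steps through the entries of a note segment. It assumes the common layout exposed by the Linux <elf.h> header: an Elf32_Nhdr (name size, descriptor size, type) followed by the name and descriptor, each padded to a 4-byte boundary. The names align4 and count_notes are hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Round n up to the next multiple of 4, as note fields are 4-byte aligned. */
static size_t align4(size_t n)
{
    return (n + 3) & ~(size_t)3;
}

/* Hypothetical helper: count the well-formed note entries in a buffer that
 * holds the contents of a PT_NOTE segment. Stops at the first entry that
 * would run past the end of the buffer. */
size_t count_notes(const unsigned char *seg, size_t size)
{
    size_t off = 0, count = 0;

    while (size - off >= sizeof(Elf32_Nhdr)) {
        Elf32_Nhdr nh;
        memcpy(&nh, seg + off, sizeof(nh));

        size_t remaining = size - off - sizeof(nh);
        if (nh.n_namesz > remaining || nh.n_descsz > remaining)
            break; /* sizes cannot fit in what is left */

        size_t body = align4(nh.n_namesz) + align4(nh.n_descsz);
        if (body > remaining)
            break;

        off += sizeof(nh) + body;
        count++;
    }
    return count;
}
```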
The PT_INTERP segment is small and contains only the location and size of a null-terminated string describing where the program interpreter is; for instance, /lib/ld-linux.so.2 is generally the location of the dynamic linker, which is also the program interpreter.
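Reading the interpreter path is a matter of following p_offset and p_filesz of the PT_INTERP program header into the file. The sketch below does that for a file image held in memory; the helper name interp_path is hypothetical, and it refuses to return a string that is out of bounds or not null terminated within the segment.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the interpreter path stored in a
 * PT_INTERP segment, or NULL if the header is not PT_INTERP, the segment lies
 * outside the image, or no terminating null byte is found within p_filesz. */
const char *interp_path(const unsigned char *image, size_t size,
                        const Elf32_Phdr *phdr)
{
    if (phdr->p_type != PT_INTERP || phdr->p_filesz == 0)
        return NULL;
    if (phdr->p_offset > size || phdr->p_filesz > size - phdr->p_offset)
        return NULL;

    const char *path = (const char *)image + phdr->p_offset;
    if (memchr(path, '\0', phdr->p_filesz) == NULL)
        return NULL;
    return path;
}
```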
The PT_PHDR segment contains the location and size of the program header table itself. The Phdr table contains all of the Phdrs describing the segments of the file (and of the memory image).
Consult the ELF(5) man pages or the ELF specification paper to see all possible Phdr types. We have covered the most commonly seen ones that are vital to program execution or that we will be seeing most commonly in our reverse engineering endeavors.
We can use the readelf -l <filename> command to view a file's Phdr table:
Elf file type is EXEC (Executable file)
Entry point 0x8049a30
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x1622c 0x1622c R E 0x1000
LOAD 0x016ef8 0x0805fef8 0x0805fef8 0x003c8 0x00fe8 RW 0x1000
DYNAMIC 0x016f0c 0x0805ff0c 0x0805ff0c 0x000e0 0x000e0 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
GNU_EH_FRAME 0x016104 0x0805e104 0x0805e104 0x0002c 0x0002c R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
GNU_RELRO 0x016ef8 0x0805fef8 0x0805fef8 0x00108 0x00108 R 0x1
We can see the entry point of the executable as well as some of the different segment types we just finished discussing. Notice the permission flags and alignment values, shown to the right of the offsets and sizes, for the first two PT_LOAD segments.
The text segment is READ+EXECUTE and the data segment is READ+WRITE. Both segments have an alignment of 0x1000 (4,096 bytes), which is the page size on 32-bit x86 Linux; the loader uses this value to align the segments during program loading.
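The alignment rule can be checked by hand. For a loadable segment, the ELF specification requires p_vaddr and p_offset to be congruent modulo p_align. The sketch below expresses that check, along with the number of bytes that exist only in memory when p_memsz exceeds p_filesz (these bytes are zero filled at load time). Both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: 1 if p_offset and p_vaddr leave the same remainder
 * when divided by p_align. Values of 0 and 1 mean no alignment is required. */
int offset_congruent(uint32_t p_offset, uint32_t p_vaddr, uint32_t p_align)
{
    if (p_align <= 1)
        return 1;
    return (p_offset % p_align) == (p_vaddr % p_align);
}

/* Hypothetical helper: bytes present in memory but not in the file. */
uint32_t zero_fill_bytes(uint32_t p_filesz, uint32_t p_memsz)
{
    return p_memsz > p_filesz ? p_memsz - p_filesz : 0;
}
```

Using the data segment from the readelf output (offset 0x016ef8, virtual address 0x0805fef8, alignment 0x1000), both the offset and the address leave a remainder of 0xef8, so the check passes. With FileSiz 0x003c8 and MemSiz 0x00fe8, the segment occupies 0xc20 more bytes in memory than in the file.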