┌─────────────────────────────────────────────────────┬─────────────────────┐
│ DELUKS                                              │ PROGRAM_HEADERS.TXT │
└─────────────────────────────────────────────────────┴─────────────────────┘

INTRO:

As beginners on our little journey of understanding the ELF file format, it
is important to understand the program headers, as they are essential for how
the OS loads and executes a program. Without understanding these concepts it
would be far more challenging to analyze, modify, or exploit ELF binaries
effectively.

─ SEGMENT I: WHAT IS A PROGRAM HEADER? ──────────────────────────────────────

A program or segment header tells our beloved operating system how to create
a process image during program execution. It describes how to map a segment
into memory and what permissions to set for that region. Each entry in the
program header table contains the following info:

    - segment type
    - location in the file and memory
    - permissions
    - segment size

So by now you may be wondering:

> That's cool and all, but what even is this thing called a "segment"???

Well allow me to clarify:
A segment in an ELF is just a part of the program that the operating system
loads into memory for execution. Each segment contains code, data, or some
other information required for the program to run.

─ SEGMENT II: WHERE DO I FIND THE PROGRAM HEADERS? ──────────────────────────

At this point, you may be wondering where you can find these program headers.
Do not worry, I'm here to help: if we take another look at our ELF header,
we can see 3 important members: ────────────┬─┬─┐
                                            │ │ │
typedef struct {                            │ │ │
    unsigned char e_ident[EI_NIDENT];       │ │ │
    uint16_t      e_type;                   │ │ │
    uint16_t      e_machine;                │ │ │
    uint32_t      e_version;                │ │ │
    Elf64_Addr    e_entry;                  │ │ │
    Elf64_Off     e_phoff; <────────────────┘ │ │
    Elf64_Off     e_shoff;                    │ │
    uint32_t      e_flags;                    │ │
    uint16_t      e_ehsize;                   │ │
    uint16_t      e_phentsize; <──────────────┘ │
    uint16_t      e_phnum; <────────────────────┘
    uint16_t      e_shentsize;
    uint16_t      e_shnum;
    uint16_t      e_shstrndx;
} Elf64_Ehdr;

The three little members stand for the following:

    e_phoff     - file offset of the program header table
    e_phentsize - size of one program header entry
    e_phnum     - the number of program headers

With those entries we can find the first program header at e_phoff and then
loop through the program headers with an index i going from 0 to e_phnum - 1,
calculating the file offset of each program header as
e_phoff + i * e_phentsize.

─ SEGMENT III: WHAT DOES THE PROGRAM HEADER CONSIST OF? ─────────────────────

Moving on, let us take a closer look at the definition of the program header:

typedef struct {
    uint32_t   p_type;
    uint32_t   p_flags;
    Elf64_Off  p_offset;
    Elf64_Addr p_vaddr;
    Elf64_Addr p_paddr;
    uint64_t   p_filesz;
    uint64_t   p_memsz;
    uint64_t   p_align;
} Elf64_Phdr;

The first member describes the segment type. There are, of course, many
segment types, including architecture-specific types.
To get started, we only need to know the 7 most common ones:

    PT_NULL    - 0 - PLACEHOLDER (UNUSED ENTRY)
    PT_LOAD    - 1 - LOADABLE SEGMENT
    PT_DYNAMIC - 2 - INFO FOR DYNAMIC LINKING
    PT_INTERP  - 3 - LOCATION OF INTERPRETER
    PT_NOTE    - 4 - METADATA
    PT_PHDR    - 6 - LOCATION OF PROGRAM HEADER TABLE IN MEMORY
    PT_TLS     - 7 - THREAD LOCAL STORAGE INFORMATION

The p_flags member defines the permissions for the region in a bit field like
so:

    ┌──────┬───────┬──────┐
    │ READ │ WRITE │ EXEC │
    ├──────┼───────┼──────┤
    │  4   │   2   │  1   │
    └──────┴───────┴──────┘

The values are added together, so a p_flags of 5 (4 + 1) means the segment is
readable and executable.

Neeext up, p_offset, which defines the offset of the segment from the start
of the file.

Right after that we have p_vaddr and p_paddr. The p_vaddr gives us the
virtual address at which the first byte of the segment is located in memory
and the p_paddr defines the same thing just for the physical address,
although the p_paddr is rarely used.

Finally, we have p_filesz and p_memsz. As you could probably assume, p_filesz
and p_memsz define the size of the segment in the binary and in memory
respectively. Note that sometimes p_memsz may be larger than p_filesz; the
extra bytes are filled with zeros, which is how uninitialized data (.bss)
gets its space. A huge difference between the two could also mean the segment
is packed (commonly seen in malware).

Now all that is left is the p_align member. It makes sure all our segments
are properly aligned in memory and on disk. To ensure this it has 3 rules:

    1. For loadable segments, p_vaddr and p_offset must be congruent modulo
       the system's page size.
    2. If p_align is 0 or 1, no alignment is needed.
    3. If p_align is larger than 1, it must be a power of 2, and p_vaddr and
       p_offset must leave the same remainder when divided by it
       (p_vaddr % p_align == p_offset % p_align).

So to summarize, it makes sure every segment is loaded on the correct
boundaries.

─ SEGMENT IV: CONCLUSION ───────────────────────────────────────────────────

Well that is all for this time, I hope you enjoyed the read and hope that the
article has helped you in understanding the ELF program headers in some way.
Now go out there and have fun with the newly learned knowledge!
I'm also a very big fan of putting what I learn to use right away, so as a
suggestion, you can try to parse the program headers now!

~ DeLuks

─ EXTRA SEGMENT V: VISUALISATION OF PT_NOTE TO PT_LOAD INFECTION ───────────

Here's a little extra part for ya. If you ever wondered what changes in the
program header during the PT_NOTE to PT_LOAD infection:

    BEFORE:                            AFTER:
    Elf64_Phdr.p_type   = 0x4          Elf64_Phdr.p_type   = 0x1
    Elf64_Phdr.p_flags  = 0x4          Elf64_Phdr.p_flags  = 0x5
    Elf64_Phdr.p_vaddr  = 0x0000       Elf64_Phdr.p_vaddr  = 0xc000000 + F_sz
    Elf64_Phdr.p_filesz = 0x0000       Elf64_Phdr.p_filesz = V-Body
    Elf64_Phdr.p_memsz  = 0x0000       Elf64_Phdr.p_memsz  = V-Body
    Elf64_Phdr.p_offset = 0x0000       Elf64_Phdr.p_offset = F_sz

    F_sz = file size, V-Body = size of the virus body,
    0x0000 = some unknown value
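
If you want to take up that parsing suggestion, here is a minimal sketch in C
of the three bits of arithmetic from this article: finding the n-th program
header, decoding p_flags, and checking rules 2 and 3 of p_align. The function
names (phdr_file_offset, phdr_flags_str, phdr_align_ok) are made up for
illustration and are not part of any library. The functions take plain
integers instead of the Elf64_* structs, which is only an assumption to keep
the block compilable without <elf.h>.

```c
#include <assert.h>
#include <stdint.h>

/* File offset of program header number `index` (0-based):
 * e_phoff + index * e_phentsize. */
uint64_t phdr_file_offset(uint64_t e_phoff, uint16_t e_phentsize,
                          uint16_t index)
{
    return e_phoff + (uint64_t)index * e_phentsize;
}

/* Render p_flags as an "rwx"-style string (READ = 4, WRITE = 2, EXEC = 1).
 * `out` must have room for 4 bytes including the terminating NUL. */
void phdr_flags_str(uint32_t p_flags, char out[4])
{
    assert(out != 0);
    out[0] = (p_flags & 4) ? 'r' : '-';
    out[1] = (p_flags & 2) ? 'w' : '-';
    out[2] = (p_flags & 1) ? 'x' : '-';
    out[3] = '\0';
}

/* Returns 1 if the p_align rules hold, 0 otherwise:
 *   - p_align of 0 or 1 means no alignment is required
 *   - otherwise p_align must be a power of 2, and p_vaddr and p_offset
 *     must leave the same remainder when divided by it */
int phdr_align_ok(uint64_t p_offset, uint64_t p_vaddr, uint64_t p_align)
{
    if (p_align <= 1)
        return 1;
    if ((p_align & (p_align - 1)) != 0)
        return 0; /* not a power of 2 */
    return (p_vaddr % p_align) == (p_offset % p_align);
}
```

Assuming e_phoff = 64 and e_phentsize = 56 (common values for 64-bit
binaries), phdr_file_offset(64, 56, 2) gives 176, the file offset of the
third entry. Passing the AFTER flags value 0x5 from the infection table to
phdr_flags_str yields "r-x".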