Carnegie Mellon

Introduction to Computer Systems
15-213, fall 2009
xxth Lecture, Oct. xth

Instructors: Majd Sakr and Khaled Harras


Today
  - Linking


Example C Program

  main.c:

    int buf[2] = {1, 2};

    int main()
    {
      swap();
      return 0;
    }

  swap.c:

    extern int buf[];

    static int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
      int temp;

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }


Static Linking
  - Programs are translated and linked using a compiler driver:

      unix> gcc -O2 -g -o p main.c swap.c
      unix> ./p

  - Build pipeline:

      main.c, swap.c    Source files
          |  Translators (cpp, cc1, as), run separately on each source file
      main.o, swap.o    Separately compiled relocatable object files
          |  Linker (ld)
      p                 Fully linked executable object file
                        (contains code and data for all functions
                        defined in main.c and swap.c)


Why Linkers? Modularity
  - Program can be written as a collection of smaller source files, rather than one monolithic mass.
  - Can build libraries of common functions (more on this later)
      - e.g., math library, standard C library


Why Linkers? Efficiency
  - Time: separate compilation
      - Change one source file, compile, and then relink.
      - No need to recompile other source files.
  - Space: libraries
      - Common functions can be aggregated into a single file...
      - Yet executable files and running memory images contain only code for the functions they actually use.


What Do Linkers Do?
  - Step 1: Symbol resolution
      - Programs define and reference symbols (variables and functions):

          void swap() {...}    /* define symbol swap */
          swap();              /* reference symbol swap */
          int *xp = &x;        /* define xp, reference x */

      - Symbol definitions are stored (by the compiler) in a symbol table.
          - The symbol table is an array of structs.
          - Each entry includes the name, type, size, and location of the symbol.
      - The linker associates each symbol reference with exactly one symbol definition.


What Do Linkers Do? (cont.)
  - Step 2: Relocation
      - Merges separate code and data sections into single sections.
      - Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable.
      - Updates all references to these symbols to reflect their new positions.


Three Kinds of Object Files (Modules)
  - Relocatable object file (.o file)
      - Contains code and data in a form that can be combined with other relocatable object files to form an executable object file.
      - Each .o file is produced from exactly one source (.c) file.
  - Executable object file
      - Contains code and data in a form that can be copied directly into memory and then executed.
  - Shared object file (.so file)
      - Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time.
      - Called Dynamic Link Libraries (DLLs) by Windows.


Executable and Linkable Format (ELF)
  - Standard binary format for object files
  - Originally proposed by AT&T System V Unix
      - Later adopted by BSD Unix variants and Linux
  - One unified format for
      - Relocatable object files (.o)
      - Executable object files
      - Shared object files (.so)
  - Generic name: ELF binaries


ELF Object File Format
  - ELF header
      - Word size, byte ordering, file type (.o, exec, .so), machine type, etc.
  - Segment header table
      - Page size, virtual addresses of memory segments (sections), segment sizes
  - .text section
      - Code
  - .rodata section
      - Read-only data: jump tables, ...
  - .data section
      - Initialized global variables
  - .bss section
      - Uninitialized global variables
      - "Block Started by Symbol"
      - "Better Save Space"
      - Has a section header but occupies no space

  File layout, starting at offset 0:

      ELF header
      Segment header table (required for executables)
      .text section
      .rodata section
      .data section
      .bss section
      .symtab section
      .rel.text section
      .rel.data section
      .debug section
      Section header table


ELF Object File Format (cont.)
  - .symtab section
      - Symbol table
      - Procedure and static variable names
      - Section names and locations
  - .rel.text section
      - Relocation info for the .text section
      - Addresses of instructions that will need to be modified in the executable
      - Instructions for modifying
  - .rel.data section
      - Relocation info for the .data section
      - Addresses of pointer data that will need to be modified in the merged executable
  - .debug section
      - Info for symbolic debugging (gcc -g)
  - Section header table
      - Offsets and sizes of each section


Linker Symbols
  - Global symbols
      - Symbols defined by module m that can be referenced by other modules.
      - E.g., non-static C functions and non-static global variables.
  - External symbols
      - Global symbols that are referenced by module m but defined by some other module.
  - Local symbols
      - Symbols that are defined and referenced exclusively by module m.
      - E.g., C functions and variables defined with the static attribute.
      - Local linker symbols are not local program variables.


Resolving Symbols

  main.c:

    int buf[2] = {1, 2};            /* buf: Global */

    int main()
    {
      swap();                       /* swap: External */
      return 0;
    }

  swap.c:

    extern int buf[];               /* buf: External */

    static int *bufp0 = &buf[0];    /* bufp0: Local */
    static int *bufp1;              /* bufp1: Local */

    void swap()                     /* swap: Global */
    {
      int temp;                     /* Linker knows nothing of temp */

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }


Relocating Code and Data

  Relocatable Object Files:

      (system)   System code                   .text
                 System data                   .data
      main.o     main()                        .text
                 int buf[2] = {1, 2}           .data
      swap.o     swap()                        .text
                 int *bufp0 = &buf[0]          .data
                 int *bufp1                    .bss

  Executable Object File (starting at address 0):

      Headers
      .text      System code
                 main()
                 swap()
                 More system code
      .data      System data
                 int buf[2] = {1, 2}
                 int *bufp0 = &buf[0]
      .bss       Uninitialized data
      .symtab
      .debug


Relocation Info (main)

  main.c:

    int buf[2] = {1, 2};

    int main()
    {
      swap();
      return 0;
    }

  main.o:

    0000000 <main>:
       0:  55                    push   %ebp
       1:  89 e5                 mov    %esp,%ebp
       3:  83 ec 08              sub    $0x8,%esp
       6:  e8 fc ff ff ff        call   7 <main+0x7>
                                 7: R_386_PC32 swap
       b:  31 c0                 xor    %eax,%eax
       d:  89 ec                 mov    %ebp,%esp
       f:  5d                    pop    %ebp
      10:  c3                    ret

    Disassembly of section .data:

    00000000 <buf>:
       0:  01 00 00 00 02 00 00 00

  Source: objdump


Relocation Info (swap, .text)

  swap.c:

    extern int buf[];

    static int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
      int temp;

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }

  swap.o:

    Disassembly of section .text:

    00000000 <swap>:
       0:  55                    push   %ebp
       1:  8b 15 00 00 00 00     mov    0x0,%edx
                                 3: R_386_32 bufp0
       7:  a1 04 00 00 00        mov    0x4,%eax
                                 8: R_386_32 buf
       c:  89 e5                 mov    %esp,%ebp
       e:  c7 05 00 00 00 00 04  movl   $0x4,0x0
      15:  00 00 00
                                 10: R_386_32 bufp1
                                 14: R_386_32 buf
      18:  89 ec                 mov    %ebp,%esp
      1a:  8b 0a                 mov    (%edx),%ecx
      1c:  89 02                 mov    %eax,(%edx)
      1e:  a1 00 00 00 00        mov    0x0,%eax
                                 1f: R_386_32 bufp1
      23:  89 08                 mov    %ecx,(%eax)
      25:  5d                    pop    %ebp
      26:  c3                    ret


Relocation Info (swap, .data)

  swap.c:

    extern int buf[];

    static int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
      int temp;

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }

  swap.o:

    Disassembly of section .data:

    00000000 <bufp0>:
       0:  00 00 00 00
                                 0: R