Carnegie Mellon

Linking
15-213: Introduction to Computer Systems
11th Lecture, Sept. 30, 2010
Instructors: Randy Bryant and Dave O'Hallaron

Today
- Linking
- Case study: Library interpositioning

Example C Program

main.c:

    int buf[2] = {1, 2};

    int main()
    {
      swap();
      return 0;
    }

swap.c:

    extern int buf[];

    int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
      int temp;

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }

Static Linking
- Programs are translated and linked using a compiler driver:

    unix> gcc -O2 -g -o p main.c swap.c
    unix> ./p

- The source files main.c and swap.c each pass through the translators (cpp, cc1, as), producing the separately compiled relocatable object files main.o and swap.o.
- The linker (ld) combines main.o and swap.o into p, a fully linked executable object file that contains code and data for all functions defined in main.c and swap.c.

Why Linkers?
- Reason 1: Modularity
  - Program can be written as a collection of smaller source files, rather than one monolithic mass.
  - Can build libraries of common functions (more on this later), e.g., math library, standard C library.

Why Linkers? (cont.)
- Reason 2: Efficiency
  - Time: Separate compilation
    - Change one source file, compile, and then relink.
    - No need to recompile other source files.
  - Space: Libraries
    - Common functions can be aggregated into a single file...
    - Yet executable files and running memory images contain only code for the functions they actually use.

What Do Linkers Do?
- Step 1: Symbol resolution
  - Programs define and reference symbols (variables and functions):

      void swap() {...}    /* define symbol swap */
      swap();              /* reference symbol swap */
      int *xp = &x;        /* define symbol xp, reference x */

  - Symbol definitions are stored (by the compiler) in a symbol table.
    - The symbol table is an array of structs.
    - Each entry includes the name, size, and location of a symbol.
  - The linker associates each symbol reference with exactly one symbol definition.

What Do Linkers Do? (cont.)
- Step 2: Relocation
  - Merges separate code and data sections into single sections.
  - Relocates symbols from their relative locations in the .o files to their final absolute memory locations in the executable.
  - Updates all references to these symbols to reflect their new positions.

Three Kinds of Object Files (Modules)
- Relocatable object file (.o file)
  - Contains code and data in a form that can be combined with other relocatable object files to form an executable object file.
  - Each .o file is produced from exactly one source (.c) file.
- Executable object file (a.out file)
  - Contains code and data in a form that can be copied directly into memory and then executed.
- Shared object file (.so file)
  - Special type of relocatable object file that can be loaded into memory and linked dynamically, at either load time or run time.
  - Called Dynamic Link Libraries (DLLs) by Windows.

Executable and Linkable Format (ELF)
- Standard binary format for object files
- Originally proposed by AT&T System V Unix
  - Later adopted by BSD Unix variants and Linux
- One unified format for
  - Relocatable object files (.o)
  - Executable object files (a.out)
  - Shared object files (.so)
- Generic name: ELF binaries

ELF Object File Format
- ELF header
  - Word size, byte ordering, file type (.o, exec, .so), machine type, etc.
- Segment header table
  - Page size, virtual addresses, memory segments (sections), segment sizes.
- .text section
  - Code
- .rodata section
  - Read-only data: jump tables, ...
- .data section
  - Initialized global variables
- .bss section
  - Uninitialized global variables
  - "Block Started by Symbol"
  - "Better Save Space"
  - Has a section header but occupies no space

File layout, starting at offset 0:

    ELF header
    Segment header table (required for executables)
    .text section
    .rodata section
    .data section
    .bss section
    .symtab section
    .rel.text section
    .rel.data section
    .debug section
    Section header table

ELF Object File Format (cont.)
- .symtab section
  - Symbol table
  - Procedure and static variable names
  - Section names and locations
- .rel.text section
  - Relocation info for the .text section
  - Addresses of instructions that will need to be modified in the executable
  - Instructions for modifying
- .rel.data section
  - Relocation info for the .data section
  - Addresses of pointer data that will need to be modified in the merged executable
- .debug section
  - Info for symbolic debugging (gcc -g)
- Section header table
  - Offsets and sizes of each section

Linker Symbols
- Global symbols
  - Symbols defined by module m that can be referenced by other modules.
  - E.g., non-static C functions and non-static global variables.
- External symbols
  - Global symbols that are referenced by module m but defined by some other module.
- Local symbols
  - Symbols that are defined and referenced exclusively by module m.
  - E.g., C functions and variables defined with the static attribute.
  - Local linker symbols are not local program variables.

Resolving Symbols

main.c:

    int buf[2] = {1, 2};     /* Global */

    int main()
    {
      swap();                /* External */
      return 0;
    }

swap.c:

    extern int buf[];        /* External */

    int *bufp0 = &buf[0];    /* Global */
    static int *bufp1;       /* Local */

    void swap()              /* Global */
    {
      int temp;              /* Linker knows nothing of temp */

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }

Relocating Code and Data
- Relocatable object files (input):
  - System code (.text) and system data (.data)
  - main.o: main() in .text; int buf[2]={1,2} in .data
  - swap.o: swap() in .text; int *bufp0=&buf[0] in .data; static int *bufp1 in .bss
- Executable object file (output), starting at offset 0:
  - Headers
  - .text: system code, main(), swap()
  - More system code
  - .data: system data, int buf[2]={1,2}, int *bufp0=&buf[0]
  - .bss: int *bufp1
  - .symtab
  - .debug
- Even though bufp1 is private to swap, it requires allocation in .bss.

Relocation Info (main)

main.c:

    int buf[2] = {1, 2};

    int main()
    {
      swap();
      return 0;
    }

main.o (source: objdump -r -d):

    0000000 <main>:
       0: 8d 4c 24 04       lea    0x4(%esp),%ecx
       4: 83 e4 f0          and    $0xfffffff0,%esp
       7: ff 71 fc          pushl  0xfffffffc(%ecx)
       a: 55                push   %ebp
       b: 89 e5             mov    %esp,%ebp
       d: 51                push   %ecx
       e: 83 ec 04          sub    $0x4,%esp
      11: e8 fc ff ff ff    call   12 <main+0x12>
                12: R_386_PC32 swap
      16: 83 c4 04          add    $0x4,%esp
      19: 31 c0             xor    %eax,%eax
      1b: 59                pop    %ecx
      1c: 5d                pop    %ebp
      1d: 8d 61 fc          lea    0xfffffffc(%ecx),%esp
      20: c3                ret

    Disassembly of section .data:

    00000000 <buf>:
       0: 01 00 00 00 02 00 00 00

Relocation Info (swap, .text)

swap.c:

    extern int buf[];

    int *bufp0 = &buf[0];
    static int *bufp1;

    void swap()
    {
      int temp;

      bufp1 = &buf[1];
      temp = *bufp0;
      *bufp0 = *bufp1;
      *bufp1 = temp;
    }

swap.o:

    Disassembly of section .text:

    00000000 <swap>:
       0: 8b 15 00 00 00 00       mov    0x0,%edx
                 2: R_386_32 buf
       6: a1 04 00 00 00          mov    0x4,%eax
                 7: R_386_32 buf
       b: 55                      push   %ebp
       c: 89 e5                   mov    %esp,%ebp
       e: c7 05 00 00 00 00 04    movl   $0x4,0x0
      15: 00 00 00
                10: R_386_32
                14: R_386_32
      18: 8b 08                   mov    (%eax),%ecx
      1a: 89 10                   mov    %edx,(%eax)
      1c: 5d                      pop    %ebp
      1d: 89 0d 04 00 00 00       mov    %ecx,0x4
                1f: R_386_32
      23: c3                      ret
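The relocation entries listed for main.o and swap.o come in two types, R_386_PC32 and R_386_32. As a minimal sketch of how a linker might patch each kind of field, assuming the standard i386 rules (absolute: symbol address plus addend; PC-relative: symbol address plus addend minus the address of the field being patched), the arithmetic looks roughly like this. The function names are hypothetical and not from the lecture.

```c
#include <stdint.h>

/*
 * Hypothetical helpers illustrating i386 relocation arithmetic.
 * sym_addr   : final run-time address of the referenced symbol
 * addend     : 32-bit value already stored in the field in the .o file
 * field_addr : final run-time address of the 4-byte field being patched
 * All arithmetic is modulo 2^32, which uint32_t provides.
 */

/* R_386_32: absolute reference, new field value = symbol + addend. */
uint32_t reloc_386_32(uint32_t sym_addr, uint32_t addend)
{
    return sym_addr + addend;
}

/* R_386_PC32: PC-relative reference, new value = symbol + addend - field. */
uint32_t reloc_386_pc32(uint32_t sym_addr, uint32_t addend,
                        uint32_t field_addr)
{
    return sym_addr + addend - field_addr;
}
```

As a usage example with made-up addresses: suppose the linker places main at 0x8048380 and swap at 0x80483b0. The call's 4-byte field sits at main+0x12 = 0x8048392 and holds the addend 0xfffffffc (that is, -4, the bytes fc ff ff ff in main.o). Then reloc_386_pc32(0x80483b0, 0xfffffffc, 0x8048392) gives 0x1a, and the instruction after the call (at 0x8048396) plus 0x1a lands on swap. The -4 addend is there because the processor adds the displacement to the address of the next instruction, which is 4 bytes past the start of the field.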
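Returning to the symbol-resolution step: the lecture describes the symbol table as an array of structs whose entries record a symbol's name, size, and location, and says the linker associates each reference with exactly one definition. The sketch below illustrates that idea with a deliberately simplified, hypothetical entry type (real ELF files use a different layout). It also assumes any duplicate definition is an error, which is stricter than what real linkers do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol-table entry (not the real ELF layout). */
struct symbol {
    const char *name;   /* symbol name, e.g. "swap" */
    size_t size;        /* size of the object or function in bytes */
    size_t offset;      /* location within its section */
    int defined;        /* 1 = definition, 0 = reference only */
};

/*
 * Find the unique definition of `name` among n entries.
 * Returns NULL when there is no definition or more than one
 * (an assumption of this sketch; real linkers apply extra rules).
 */
const struct symbol *resolve(const struct symbol *tab, size_t n,
                             const char *name)
{
    const struct symbol *found = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tab[i].defined && strcmp(tab[i].name, name) == 0) {
            if (found != NULL)
                return NULL;    /* duplicate definition */
            found = &tab[i];
        }
    }
    return found;
}
```

With entries for buf (defined), a reference to swap, and a definition of swap, resolve(tab, 3, "swap") would return the defining entry, while a lookup of a name with no definition returns NULL.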