Stanford CS 140 - Today's Big Adventure

Outline:
Today’s Big Adventure
Linking as our first naming system
Perspectives on memory contents
How is a process specified?
How is a program executed?
What does a process look like? (Unix)
Who builds what?
Example
Linkers (Linkage editors)
Simple linker: two passes needed
Where to put emitted objects?
Where to put emitted objects
Where is everything?
Linker: Where is everything
Example: 2 modules and C lib
Initial object files
Pass 1: Linker reorganization
Pass 2: Relocation
What gets written out
Examining programs with nm
Examining programs with objdump
Types of relocation
Name mangling
Initialization and destruction
Other information in executables
Variation 0: Dynamic linking
Variation 1: Static shared libraries
Static shared libraries
Variation 2: Dynamic shared libs
Position-independent code
Lazy dynamic linking
Code = data, data = code
How?
Linking and security
Linking Summary

Today’s Big Adventure
- How to name and refer to things that don’t exist yet
- How to merge separate name spaces into a cohesive whole
• Readings
- a.out & elf man pages, ELF standard
- Run “nm” or “objdump” on a few .o and a.out files.

Linking as our first naming system
• Naming is a very deep theme that comes up everywhere
• Naming system: maps names to values
• Examples:
- Linking: Where is printf? How to refer to it? How to deal with synonyms? What if it doesn’t exist?
- Virtual memory address (name) resolved to physical address (value) using page table
- File systems: translating file and directory names to disk locations, organizing names so you can navigate, ...
- www.stanford.edu resolved to 171.67.216.17 using DNS
- IP addresses resolved to Ethernet addresses with ARP
- Street names: translating (elk, pine, ...) vs (1st, 2nd, ...) to actual location

Perspectives on memory contents
• Programming language view: x += 1; add $1, %eax
- Instructions: Specify operations to perform
- Variables: Operands that can change over time
- Constants: Operands that never change
• Hardware view:
- executable: code, usually read-only
- read only: constants (maybe one copy for all processes)
- read/write: variables (each process needs own copy)
• Need addresses to use data:
- Addresses locate things. Must update them when you move
- Examples: linkers, garbage collectors, changing apartment
• Binding time: When is a value determined/computed?
- Early to late: Compile time, Link time, Load time, Runtime

How is a process specified?
• Executable file: the linker/OS interface.
- What is code? What is data?
- Where should they live?
• Linker builds executables from object files:

How is a program executed?
• On Unix systems, read by “loader”
- Reads all code/data segs into buffer cache; maps code (read only) and initialized data (r/w) into addr space
- Or... fakes process state to look like paged out
• Lots of optimizations happen in practice:
- Zero-initialized data does not need to be read in.
- Demand load: wait until code is used before getting it from disk
- Copies of same program running? Share code
- Multiple programs use same routines: share code (harder)

What does a process look like? (Unix)
• Process address space divided into “segments”
- text (code), data, heap (dynamic data), and stack
[Figure: address-space layout. Labels: Stack, Code, Read-only data, Initialized data, Uninitialized data, Heap, Kernel, mmapped regions]
- Why? (1) different allocation patterns; (2) separate code/data

Who builds what?
• Heap: allocated and laid out at runtime by malloc
- Compiler, linker not involved other than saying where it can start
- Namespace constructed dynamically and managed by programmer (names stored in pointers, and organized using data structures)
• Stack: alloc at runtime (proc call), layout by compiler
- Names are relative off of stack (or frame) pointer
- Managed by compiler (alloc on proc entry, free on exit)
- Linker not involved because name space entirely local: compiler has enough information to build it.
• Global data/code: alloc by compiler, layout by linker
- Compiler emits them and names with symbolic references
- Linker lays them out and translates references

Example
• Simple program has “printf ("hello world\n");”
• Compile with: cc -m32 -fno-builtin -S hello.c
- -S says don’t run assembler (-m32 is 32-bit x86 code)
• Output in hello.s has symbolic reference to printf
        .section .rodata
    .LC0: .string "hello world\n"
        .text
        .globl main
    main: ...
        subl $4, %esp
        movl $.LC0, (%esp)
        call printf
• Disassemble with objdump -d:
    18: e8 fc ff ff ff    call 19 <main+0x19>
- Jumps to PC - 4 = address of address within instruction

Linkers (Linkage editors)
• Unix: ld
- Usually hidden behind compiler
- Run gcc -v hello.c to see ld invoked (may see collect2)
• Three functions:
- Collect together all pieces of a program
- Coalesce like segments
- Fix addresses of code and data so the program can run
• Result: runnable program stored in new object file
• Why can’t compiler do this?
- Limited world view: sees one file, rather than all files
• Usually linkers don’t rearrange segments, but can
- E.g., re-order instructions for fewer cache misses; remove routines that are never called from a.out

Simple linker: two passes needed
• Pass 1:
- Coalesce like segments; arrange in non-overlapping mem.
- Read file’s symbol table, construct global symbol table with entry for every symbol used or defined
- Compute virtual address of each segment (at start+offset)
• Pass 2:
- Patch references using file and global symbol table
- Emit result
• Symbol table: information about program kept while linker running
- Segments: name, size, old location, new location
- Symbols: name, input segment, offset within segment

Where to put emitted objects?
• Assembler:
- Doesn’t know where data/code should be placed in the process’s address space
- Assumes everything starts at zero
- Emits symbol table that holds the name and offset of each created object
- Routines/variables exported by file are recorded as global definitions
• Simpler perspective:
- Code is in a big char array
- Data is in another big char array
- Assembler creates (object name, index) tuple for each interesting thing
- Linker then merges all of these arrays
     0 foo: call printf
            ret
    40 bar: ...
            ret
    foo: 0: T
    bar: 40: t

Where to put emitted objects
• At link time, linker
- Determines the size of each segment and the resulting address to place each object at
- Stores all global definitions in a global symbol table that maps the definition to its final virtual address

Where is everything?
• How to call procedures or reference variables?
- E.g., call to printf needs a target
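The two-pass scheme from "Simple linker: two passes needed" can be sketched in C. This is a minimal illustration under loud assumptions, not how ld is implemented: every name here (`struct module`, `struct def`, `struct ref`, `link_modules`, `MAX_SYMS`) is hypothetical, each module has a single code segment assembled as if it started at zero, references are patched with 32-bit absolute little-endian addresses, and the global table records only definitions (the slide's table also has entries for symbols that are merely used).

```c
#include <stdint.h>
#include <string.h>

#define MAX_SYMS 16  /* hypothetical fixed capacity, enough for a toy example */

/* One exported definition: a name and its offset inside the module's code. */
struct def { const char *name; uint32_t offset; };
/* One reference: where the 4-byte slot lives and which name it needs. */
struct ref { uint32_t offset; const char *name; };

/* A heavily simplified object file with a single code segment. */
struct module {
    uint8_t *code; uint32_t size;       /* assembled as if loaded at 0 */
    const struct def *defs; int ndefs;
    const struct ref *refs; int nrefs;
    uint32_t base;                      /* final address, set by pass 1 */
};

struct global_sym { const char *name; uint32_t addr; };
static struct global_sym symtab[MAX_SYMS];
static int nsyms;

/* Look a name up in the global symbol table; returns 0 on success. */
static int lookup(const char *name, uint32_t *addr) {
    for (int i = 0; i < nsyms; i++)
        if (strcmp(symtab[i].name, name) == 0) { *addr = symtab[i].addr; return 0; }
    return -1;
}

/* Pass 1: place modules back to back from `start` and record the final
 * address (base + offset) of every definition in the global table. */
static int pass1(struct module *mods, int n, uint32_t start) {
    uint32_t next = start;
    nsyms = 0;
    for (int m = 0; m < n; m++) {
        mods[m].base = next;
        next += mods[m].size;
        for (int d = 0; d < mods[m].ndefs; d++) {
            uint32_t unused;
            if (nsyms == MAX_SYMS || lookup(mods[m].defs[d].name, &unused) == 0)
                return -1;              /* table full or duplicate definition */
            symtab[nsyms].name = mods[m].defs[d].name;
            symtab[nsyms].addr = mods[m].base + mods[m].defs[d].offset;
            nsyms++;
        }
    }
    return 0;
}

/* Pass 2: write each referenced symbol's final address into its slot. */
static int pass2(struct module *mods, int n) {
    for (int m = 0; m < n; m++) {
        for (int r = 0; r < mods[m].nrefs; r++) {
            uint32_t off = mods[m].refs[r].offset, addr;
            if (mods[m].size < 4 || off > mods[m].size - 4)
                return -1;              /* slot would fall outside the segment */
            if (lookup(mods[m].refs[r].name, &addr) != 0)
                return -1;              /* undefined symbol */
            for (int b = 0; b < 4; b++) /* little-endian store */
                mods[m].code[off + b] = (uint8_t)(addr >> (8 * b));
        }
    }
    return 0;
}

/* Run both passes; returns 0 on success, -1 on any link error. */
int link_modules(struct module *mods, int n, uint32_t start) {
    if (pass1(mods, n, start) != 0)
        return -1;
    return pass2(mods, n);
}
```

Two passes are needed because a module may refer to a symbol defined in a later module: no slot can be patched until every module has been placed and every definition has a final address.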

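The `e8 fc ff ff ff` bytes in the Example slide show a call whose 4-byte operand still holds -4, which is why objdump reports the target as 0x19, the address of the operand itself. A minimal sketch of how a linker could fill in such a PC-relative field follows, assuming the rule that 32-bit x86 ELF calls R_386_PC32 (new value = S + A - P, where S is the target address, A is the addend already stored in the field, and P is the final address of the field). The helper names (`load_le32`, `store_le32`, `patch_pc32`) and the addresses in the usage note are hypothetical.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t load_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit little-endian value into a byte buffer. */
static void store_le32(uint8_t *p, uint32_t v) {
    for (int b = 0; b < 4; b++)
        p[b] = (uint8_t)(v >> (8 * b));
}

/* Patch the PC-relative field at code[field_off].
 *   code_base: final address at which code[0] will live
 *   target:    final address of the called routine (S)
 * The addend A is whatever the assembler left in the field (-4 for the
 * call in the slide); P is the field's own final address. Unsigned
 * arithmetic wraps modulo 2^32, which is what a 32-bit field needs. */
void patch_pc32(uint8_t *code, uint32_t field_off,
                uint32_t code_base, uint32_t target) {
    uint32_t addend = load_le32(code + field_off);  /* 0xfffffffc == -4 */
    uint32_t place  = code_base + field_off;
    store_le32(code + field_off, target + addend - place);
}
```

For instance, with the call opcode at offset 0x18, a hypothetical code base of 0x08048000 and a hypothetical printf address of 0x08048100, the field at offset 0x19 becomes 0x000000e3. The CPU adds that displacement to the address of the next instruction (0x0804801d) and arrives at 0x08048100.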
