15-213
"The course that gives CMU its Zip"

Machine-Level Programming I: Introduction
Sept. 12, 2000

Topics
    Assembly Programmer's Execution Model
    Accessing Information
        Registers
        Memory
    Arithmetic operations

class05.ppt


IA32 Processors

Totally Dominate Computer Market

Evolutionary Design
    Starting in 1978 with 8086
    Added more features as time goes on
    Still support old features, although obsolete

Complex Instruction Set Computer (CISC)
    Many different instructions with many different formats
        But only small subset encountered with Linux programs
    Hard to match performance of Reduced Instruction Set Computers (RISC)
    But Intel has done just that

CS 213 F'00


X86 Evolution: Programmer's View

    Name       Date    Transistors
    8086       1978    29K
        16-bit processor. Basis for IBM PC and DOS
        Limited to 1MB address space. DOS only gives you 640K
    80286      1982    134K
        Added elaborate, but not very useful, addressing scheme
        Basis for IBM PC-AT and Windows
    386        1985    275K
        Extended to 32 bits. Added "flat addressing"
        Capable of running Unix
        Linux/gcc uses no instructions introduced in later models
    486        1989    1.9M
    Pentium    1993    3.1M


X86 Evolution: Programmer's View

    Name           Date    Transistors
    Pentium/MMX    1997    4.5M
        Added special collection of instructions for operating on 64-bit vectors of 1, 2, or 4 byte integer data
    Pentium II     1997    7M
        Added conditional move instructions
        Big change in underlying microarchitecture
    Pentium III    1999    8.2M
        Added "streaming SIMD" instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data
    Pentium 4      2001    42M
        Added 8-byte formats and 144 new instructions for streaming SIMD mode


New Species: IA64

    Name       Date    Transistors
    Itanium    2000    10M
        Extends to IA64, a 64-bit architecture
        Radically new instruction set designed for high performance
        Will be able to run existing IA32 programs
            On-board "x86 engine"


Assembly Programmer's View

    [Diagram: the CPU (EIP, Registers, Condition Codes) sends Addresses to Memory and exchanges Data and Instructions with it; Memory includes the Stack]
    [Diagram labels for Memory contents: Object Code, Program Data, OS Data]

    Programmer-Visible State
        EIP: Program Counter
            Address of next instruction
        Register File
            Heavily used program data
        Condition Codes
            Store status information about most recent arithmetic operation
            Used for conditional branching
        Memory
            Byte addressable array
            Code, user data, (some) OS data
            Includes stack used to support procedures


Turning C into Object Code

    Code in files p1.c p2.c
    Compile with command: gcc -O p1.c p2.c -o p
        Use optimizations (-O)
        Put resulting binary in file p

    C program (p1.c p2.c)             text
        Compiler (gcc -S)
    Asm program (p1.s p2.s)           text
        Assembler (gcc or as)
    Object program (p1.o p2.o)        binary
        Linker (gcc or ld), together with static libraries (.a)
    Executable program (p)            binary


Compiling Into Assembly

    C Code

        int sum(int x, int y)
        {
          int t = x+y;
          return t;
        }

    Generated Assembly

        sum:
            pushl %ebp
            movl %esp,%ebp
            movl 12(%ebp),%eax
            addl 8(%ebp),%eax
            movl %ebp,%esp
            popl %ebp
            ret

    Obtain with command: gcc -O -S code.c
    Produces file code.s


Assembly Characteristics

    Minimal Data Types
        "Integer" data of 1, 2, or 4 bytes
            Data values
            Addresses (untyped pointers)
        Floating point data of 4, 8, or 10 bytes
        No aggregate types such as arrays or structures
            Just contiguously allocated bytes in memory

    Primitive Operations
        Perform arithmetic function on register or memory data
        Transfer data between memory and register
            Load data from memory into register
            Store register data into memory
        Transfer control
            Unconditional jumps to/from procedures
            Conditional branches


Object Code

    Code for sum

        0x401040 <sum>:
            0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

        Total of 13 bytes
        Each instruction 1, 2, or 3 bytes
        Starts at address 0x401040

    Assembler
        Translates .s into .o
        Binary encoding of each instruction
        Nearly complete image of executable code
        Missing linkages between code in different files

    Linker
        Resolves references between files
        Combines with static run-time libraries
            E.g., code for malloc, printf
        Some libraries are
        dynamically linked
            Linking occurs when program begins execution


Machine Instruction Example

    C Code
        int t = x+y;
        Add two signed integers

    Assembly
        addl 8(%ebp),%eax
        Add 2 4-byte integers
            "Long" words in GCC parlance
            Same instruction whether signed or unsigned
        Operands:
            x: Register %eax
            y: Memory M[%ebp+8]
            t: Register %eax
                Return function value in %eax
        Similar to expression x += y

    Object Code
        0x401046: 03 45 08
        3-byte instruction
        Stored at address 0x401046


Disassembling Object Code

    Disassembled

        00401040 <sum>:
           0:  55          push  %ebp
           1:  89 e5       mov   %esp,%ebp
           3:  8b 45 0c    mov   0xc(%ebp),%eax
           6:  03 45 08    add   0x8(%ebp),%eax
           9:  89 ec       mov   %ebp,%esp
           b:  5d          pop   %ebp
           c:  c3          ret
           d:  8d 76 00    lea   0x0(%esi),%esi

    Disassembler
        objdump -d p
        Useful tool for examining object code
        Analyzes bit pattern of series of instructions
        Produces approximate rendition of assembly code
        Can be run on either a.out (complete executable) or .o file


Alternate Disassembly

    Object

        0x401040:
            0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

    Disassembled

        0x401040 <sum>:      push  %ebp
        0x401041 <sum+1>:    mov   %esp,%ebp
        0x401043 <sum+3>:    mov   0xc(%ebp),%eax
        0x401046 <sum+6>:    add   0x8(%ebp),%eax
        0x401049 <sum+9>:    mov   %ebp,%esp
        0x40104b <sum+11>:   pop   %ebp
        0x40104c <sum+12>:   ret
        0x40104d <sum+13>:   lea   0x0(%esi),%esi

    Within gdb Debugger
        gdb p
        disassemble sum
            Disassemble procedure
        x/13b sum
            Examine the 13 bytes starting at sum


What Can be Disassembled?

        % objdump -d WINWORD.EXE

        WINWORD.EXE:     file format pei-i386

        No symbols in "WINWORD.EXE".
        Disassembly of section .text:

        30001000 <.text>:
        30001000:  55                push  %ebp
        30001001:  8b ec             mov   %esp,%ebp
        30001003:  6a ff             push  $0xffffffff
        30001005:  68 90 10 00 30    push  $0x30001090
        3000100a:  68 91 dc 4c 30    push  $0x304cdc91

    Anything that can be interpreted as executable code
    Disassembler examines bytes and reconstructs assembly source


Moving Data

    Moving Data
        movl Source,Dest
        Move 4-byte ("long") word
        Accounts for 31% of all instructions in sample

    Operand Types
        Immediate: Constant integer data
            Like C constant, but prefixed with '$'
            E.g., $0x400, $-533
            Encoded with 1, 2, or 4 bytes
        Register: One of 8 integer registers
            But %esp and %ebp reserved for special use
            Others have special uses for particular instructions
        Memory: 4 consecutive bytes of memory
            Various "address modes"

    Integer registers: %eax %edx %ecx %ebx %esi %edi %esp %ebp


movl Operand Combinations

    Source    Destination    Instruction          C Analog
    Imm       Reg            movl $0x4,%eax       temp = 0x4;
    Reg       Reg            movl %eax,%edx       temp2 = temp1;
    Reg       Mem            movl %eax,(%edx)     *p = temp;
    Mem       Reg            movl (%eax),%edx     temp = *p;
    Imm       Mem            movl
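
The C analogs in the movl table can be tried directly. The sketch below is only an illustration under stated assumptions, not code from the slides beyond sum itself: sum is copied from the "Compiling Into Assembly" slide, while movl_analogs is a hypothetical helper introduced here to put the table's C analogs in one place. The instruction named in each comment is the form the table pairs with that statement; an optimizing compiler may keep values in registers or fold them away, so the assembly gcc actually emits can differ.

```c
/* sum: the function shown on the "Compiling Into Assembly" slide. */
int sum(int x, int y)
{
    int t = x + y;
    return t;
}

/*
 * movl_analogs: hypothetical helper (not part of the slides) that
 * strings together the C analogs from the operand-combination table.
 * The pointer p stands in for an address held in a register.
 */
int movl_analogs(int *p)
{
    int temp, temp1, temp2;

    temp1 = 0x4;    /* Imm -> Reg, table form: movl $0x4,%eax   */
    temp2 = temp1;  /* Reg -> Reg, table form: movl %eax,%edx   */
    *p = temp2;     /* Reg -> Mem, table form: movl %eax,(%edx) */
    temp = *p;      /* Mem -> Reg, table form: movl (%eax),%edx */

    return temp;
}
```

Saving this as code.c and running gcc -O -S code.c, as on the earlier slide, produces code.s, which can be compared against the table by hand.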