15-213
Machine-Level Programming I: Introduction
Feb. 1, 2000
class05.ppt, CS 213 S'00

Topics
  Assembly Programmer's Execution Model
  Accessing Information
    Registers
    Memory
  Arithmetic operations

IA32 Processors
  Totally Dominate Computer Market
  Evolutionary Design
    Starting in 1978 with 8086
    Added more features as time goes on
    Still support old features, although obsolete
  Complex Instruction Set Computer (CISC)
    Many different instructions with many different formats
      But only small subset encountered with Linux programs
    Hard to match performance of Reduced Instruction Set Computers (RISC)
    But Intel has done just that!

X86 Evolution: Programmer's View
  Name          Date   Transistors
  8086          1978   29K
    16-bit processor. Basis for IBM PC & DOS
    Limited to 1MB address space. DOS only gives you 640K
  80286         1982   134K
    Added elaborate, but not very useful, addressing scheme
    Basis for IBM PC-AT and Windows
  386           1985   275K
    Extended to 32 bits. Added "flat addressing"
    Capable of running Unix
    Linux/gcc uses no instructions introduced in later models
  486           1989   1.9M
  Pentium       1993   3.1M

X86 Evolution: Programmer's View
  Name          Date   Transistors
  Pentium/MMX   1997   4.5M
    Added special collection of instructions for operating on 64-bit
    vectors of 1, 2, or 4 byte integer data
  Pentium II    1997   7M
    Added conditional move instructions
    Big change in underlying microarchitecture
  Pentium III   1999   8.2M
    Added "streaming SIMD" instructions for operating on 128-bit vectors
    of 1, 2, or 4 byte integer or floating point data
  Merced        2000   10M
    Extends to IA64, a 64-bit architecture
    Radically new instruction set designed for high performance
    Will be able to run existing IA32 programs
      On-board "x86 engine"

Assembly Programmer's View
  [Diagram: CPU (EIP, Registers, Condition Codes) connected to Memory
   (Object Code, Program Data, OS Data, Stack) by Addresses, Data, and
   Instructions]
  Programmer-Visible State
    EIP: Program Counter
      Address of next instruction
    Register File
      Heavily used program data
    Condition Codes
      Store status information about most recent arithmetic operation
      Used for conditional branching
    Memory
      Byte addressable array
      Code, user data, (some) OS data
      Includes stack used to support procedures

Turning C into Object Code
  Code in files p1.c p2.c
  Compile with command: gcc -O p1.c p2.c -o p
    Use optimizations (-O)
    Put resulting binary in file p

  text     C program (p1.c p2.c)
             | Compiler (gcc -S)
  text     Asm program (p1.s p2.s)
             | Assembler (gcc or as)
  binary   Object program (p1.o p2.o)     Static libraries (.a)
             | Linker (gcc or ld)
  binary   Executable program (p)

Compiling Into Assembly
  C Code
    int sum(int x, int y)
    {
      int t = x+y;
      return t;
    }
  Generated Assembly
    sum:
      pushl %ebp
      movl %esp,%ebp
      movl 12(%ebp),%eax
      addl 8(%ebp),%eax
      movl %ebp,%esp
      popl %ebp
      ret
  Obtain with command: gcc -O -S code.c
  Produces file code.s

Assembly Characteristics
  Minimal Data Types
    "Integer" data of 1, 2, or 4 bytes
      Data values
      Addresses (untyped pointers)
    Floating point data of 4 or 8 bytes
    No aggregate types such as arrays or structures
      Just contiguously allocated bytes in memory
  Primitive Operations
    Perform arithmetic function on register or memory data
    Transfer data between memory and register
      Load data from memory into register
      Store register data into memory
    Transfer control
      Unconditional jumps to/from procedures
      Conditional branches

Object Code
  Code for sum
    0x401040 <sum>:
      0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
    Total of 13 bytes
    Each instruction 1, 2, or 3 bytes
    Starts at address 0x401040
  Assembler
    Translates .s into .o
    Binary encoding of each instruction
    Nearly complete image of executable code
    Missing linkages between code in different files
  Linker
    Resolves references between files
    Combines with static run-time libraries
      E.g., code for malloc, printf
    Some libraries are dynamically linked
      Linking occurs when program begins execution

Machine Instruction Example
  C Code
    int t = x+y;
    Add two signed integers
  Assembly
    addl 8(%ebp),%eax
    Similar to expression x += y
    Add 2 4-byte integers
      "Long" words in GCC parlance
      Same instruction whether signed or unsigned
    Operands:
      x: Register %eax
      y: Memory M[%ebp+8]
      t: Register %eax
        Return function value in %eax
  Object Code
    0x401046: 03 45 08
    3-byte instruction
    Stored at address 0x401046

Disassembling Object Code
  Disassembled
    00401040 <sum>:
       0:  55          push   %ebp
       1:  89 e5       mov    %esp,%ebp
       3:  8b 45 0c    mov    0xc(%ebp),%eax
       6:  03 45 08    add    0x8(%ebp),%eax
       9:  89 ec       mov    %ebp,%esp
       b:  5d          pop    %ebp
       c:  c3          ret
       d:  8d 76 00    lea    0x0(%esi),%esi
  Disassembler
    objdump -d p
    Useful tool for examining object code
    Analyzes bit pattern of series of instructions
    Produces approximate rendition of assembly code
    Can be run on either a.out (complete executable) or .o file

Alternate Disassembly
  Object
    0x401040:
      0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
  Disassembled
    0x401040 <sum>:     push   %ebp
    0x401041 <sum+1>:   mov    %esp,%ebp
    0x401043 <sum+3>:   mov    0xc(%ebp),%eax
    0x401046 <sum+6>:   add    0x8(%ebp),%eax
    0x401049 <sum+9>:   mov    %ebp,%esp
    0x40104b <sum+11>:  pop    %ebp
    0x40104c <sum+12>:  ret
    0x40104d <sum+13>:  lea    0x0(%esi),%esi
  Within gdb Debugger
    gdb p
    disassemble sum
      Disassemble procedure
    x/13b sum
      Examine the 13 bytes starting at sum

What Can be Disassembled?
    objdump -d WINWORD.EXE

    WINWORD.EXE:     file format pei-i386

    No symbols in "WINWORD.EXE".
    Disassembly of section .text:

    30001000 <.text>:
    30001000:  55                push   %ebp
    30001001:  8b ec             mov    %esp,%ebp
    30001003:  6a ff             push   $0xffffffff
    30001005:  68 90 10 00 30    push   $0x30001090
    3000100a:  68 91 dc 4c 30    push   $0x304cdc91
  Anything that can be interpreted as executable code
  Disassembler examines bytes and reconstructs assembly source

Moving Data
  Registers: %eax %edx %ecx %ebx %esi %edi %esp %ebp
  Moving Data
    movl Source,Dest
    Move 4-byte ("long") word
    Accounts for 31% of all instructions in sample
  Operand Types
    Immediate: Constant integer data
      Like C constant, but prefixed with '$'
      E.g., $0x400, $-533
      Encoded with 1, 2, or 4 bytes
    Register: One of 8 integer registers
      But %esp and %ebp reserved for special use
      Others have special uses for particular instructions
    Memory: 4 consecutive bytes of memory
      Various "address modes"

movl Operand Combinations
  Source  Destination  Instruction          C Analog
  Imm     Reg          movl $0x4,%eax       temp = 0x4;
  Imm     Mem          movl $-147,(%eax)    *p = -147;
  Reg     Reg          movl %eax,%edx       temp2 = temp1;
  Reg     Mem          movl %eax,(%edx)     *p = temp;
  Mem     Reg          movl (%eax),%edx     temp = *p;

Simple Addressing Modes
  Normal        (R)     Mem[Reg[R]]
    Register R specifies memory address
    movl (%ecx),%eax
  Displacement  D(R)    Mem[Reg[R]+D]
    Register R specifies start of memory
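
The movl operand combinations and the two addressing modes listed above can be tried out in a small C model. This is only a sketch under stated assumptions: the `struct machine` type, the 64-byte memory, and the helper names `effective_address`, `movl_imm`, `movl_load`, and `movl_store` are hypothetical and come neither from the slides nor from any real tool. The model treats a register as a 32-bit value and a memory operand as 4 consecutive bytes at address Reg[R]+D, with the Normal mode (R) being the case D = 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model, not real hardware: eight 32-bit integer
   registers and a small byte-addressable memory. */
enum reg { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NUM_REGS };

#define MEM_SIZE 64u

struct machine {
    uint32_t reg[NUM_REGS];
    uint8_t  mem[MEM_SIZE];
};

/* Address named by D(R): Reg[R] + D. Normal mode (R) is the case D == 0. */
static uint32_t effective_address(const struct machine *m, enum reg r, int32_t d)
{
    return m->reg[r] + (uint32_t)d;
}

/* Imm -> Reg, in the spirit of: movl $imm,%dst */
static void movl_imm(struct machine *m, uint32_t imm, enum reg dst)
{
    m->reg[dst] = imm;
}

/* Mem -> Reg, in the spirit of: movl D(R),%dst
   Loads 4 consecutive bytes starting at Reg[R] + D. */
static void movl_load(struct machine *m, int32_t d, enum reg r, enum reg dst)
{
    uint32_t addr = effective_address(m, r, d);
    assert(addr <= MEM_SIZE - 4);   /* stay inside the toy memory */
    memcpy(&m->reg[dst], &m->mem[addr], 4);
}

/* Reg -> Mem, in the spirit of: movl %src,D(R)
   Stores 4 consecutive bytes starting at Reg[R] + D. */
static void movl_store(struct machine *m, enum reg src, int32_t d, enum reg r)
{
    uint32_t addr = effective_address(m, r, d);
    assert(addr <= MEM_SIZE - 4);   /* stay inside the toy memory */
    memcpy(&m->mem[addr], &m->reg[src], 4);
}
```

For instance, after `movl_imm(&m, 16, EBP)` and `movl_imm(&m, 0x4, EAX)`, calling `movl_store(&m, EAX, 8, EBP)` and then `movl_load(&m, 8, EBP, EDX)` leaves 0x4 in the model's EDX, which mirrors the C analogs `*p = temp;` followed by `temp = *p;`. The model has no memory-to-memory helper because the operand table above lists no such combination.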
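
Returning to the 13-byte encoding of sum from the Object Code and Alternate Disassembly slides, the following C sketch checks the arithmetic behind the listed addresses. The per-instruction lengths are copied by hand from the disassembly (this is a lookup table, not a decoder), and the names `sum_code`, `sum_lengths`, `instr_offset`, `instr_address`, and `addl_model` are hypothetical. The last helper illustrates the slide's remark that the same add instruction serves signed and unsigned operands: on 32-bit values the addition produces the same bit pattern either way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 13 bytes of sum, as listed on the Object Code slide. */
static const uint8_t sum_code[] = {
    0x55,             /* push %ebp           */
    0x89, 0xe5,       /* mov  %esp,%ebp      */
    0x8b, 0x45, 0x0c, /* mov  0xc(%ebp),%eax */
    0x03, 0x45, 0x08, /* add  0x8(%ebp),%eax */
    0x89, 0xec,       /* mov  %ebp,%esp      */
    0x5d,             /* pop  %ebp           */
    0xc3              /* ret                 */
};

/* Instruction lengths read off the disassembly (hand-written table). */
static const size_t sum_lengths[] = { 1, 2, 3, 3, 2, 1, 1 };

#define SUM_INSTRS (sizeof sum_lengths / sizeof sum_lengths[0])
#define SUM_BASE   0x401040u   /* start address given on the slide */

/* Byte offset of instruction `index` from the start of sum.
   Passing SUM_INSTRS yields the total code length. */
static size_t instr_offset(size_t index)
{
    size_t off = 0;
    assert(index <= SUM_INSTRS);
    for (size_t i = 0; i < index; i++)
        off += sum_lengths[i];
    return off;
}

/* Address of instruction `index`, i.e. base address plus offset. */
static uint32_t instr_address(size_t index)
{
    return SUM_BASE + (uint32_t)instr_offset(index);
}

/* 32-bit wraparound add: one bit-level operation for signed and
   unsigned operands alike. */
static uint32_t addl_model(uint32_t dst, uint32_t src)
{
    return dst + src;
}
```

With this table, `instr_address(3)` gives 0x401046, the address the slides show for the add instruction, and `instr_offset(SUM_INSTRS)` gives 13, matching both "Total of 13 bytes" and the `x/13b sum` command.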