Machine-Level Programming I: Introduction
Feb. 1, 2000
15-213 / CS 213 S’00, class05.ppt

Slide titles
• IA32 Processors
• X86 Evolution: Programmer’s View
• X86 Evolution: Programmer’s View (continued)
• Assembly Programmer’s View
• Turning C into Object Code
• Compiling Into Assembly
• Assembly Characteristics
• Object Code
• Machine Instruction Example
• Disassembling Object Code
• Alternate Disassembly
• What Can be Disassembled?
• Moving Data
• movl Operand Combinations
• Simple Addressing Modes
• Using Simple Addressing Modes
• Understanding Swap
• Indexed Addressing Modes
• Address Computation Instruction
• Some Arithmetic Operations
• Using leal for Arithmetic Expressions
• Understanding arith
• Another Example
• CISC Properties
• Summary: Abstract Machines
• Pentium Pro (P6)
• PentiumPro Block Diagram
• PentiumPro Operation
• Whose Assembler?

Topics
• Assembly Programmer’s Execution Model
• Accessing Information
  – Registers
  – Memory
• Arithmetic operations

IA32 Processors

Totally Dominate Computer Market

Evolutionary Design
• Starting in 1978 with 8086
• Added more features as time goes on
• Still support old features, although obsolete

Complex Instruction Set Computer (CISC)
• Many different instructions with many different formats
  – But only a small subset is encountered in Linux programs
• Hard to match performance of Reduced Instruction Set Computers (RISC)
• But Intel has done just that!

X86 Evolution: Programmer’s View

Name         Date   Transistors

8086         1978   29K
• 16-bit processor. Basis for IBM PC & DOS
• Limited to 1MB address space. DOS only gives you 640K

80286        1982   134K
• Added elaborate, but not very useful, addressing scheme
• Basis for IBM PC-AT and Windows

386          1985   275K
• Extended to 32 bits. Added “flat addressing”
• Capable of running Unix
• Linux/gcc uses no instructions introduced in later models

486          1989   1.9M

Pentium      1993   3.1M

X86 Evolution: Programmer’s View (continued)

Name         Date   Transistors

Pentium/MMX  1997   4.5M
• Added special collection of instructions for operating on 64-bit vectors of 1, 2, or 4 byte integer data

Pentium II   1997   7M
• Added conditional move instructions
• Big change in underlying microarchitecture

Pentium III  1999   8.2M
• Added “streaming SIMD” instructions for operating on 128-bit vectors of 1, 2, or 4 byte integer or floating point data

Merced       2000?  10M
• Extends to IA64, a 64-bit architecture
• Radically new instruction set designed for high performance
• Will be able to run existing IA32 programs
  – On-board “x86 engine”

Assembly Programmer’s View

[Figure: CPU (EIP, Registers, Condition Codes) and Memory (Object Code, Program Data, OS Data, Stack), linked by Addresses, Data, and Instructions.]

Programmer-Visible State
• EIP Program Counter
  – Address of next instruction
• Register File
  – Heavily used program data
• Condition Codes
  – Store status information about most recent arithmetic operation
  – Used for conditional branching
• Memory
  – Byte addressable array
  – Code, user data, (some) OS data
  – Includes stack used to support procedures

Turning C into Object Code

• Code in files p1.c p2.c
• Compile with command: gcc -O p1.c p2.c -o p
  – Use optimizations (-O)
  – Put resulting binary in file p

[Figure: compilation pipeline]
  text     C program (p1.c p2.c)
             Compiler (gcc -S)
  text     Asm program (p1.s p2.s)
             Assembler (gcc or as)
  binary   Object program (p1.o p2.o)     Static libraries (.a)
             Linker (gcc or ld)
  binary   Executable program (p)

Compiling Into Assembly

C Code
    int sum(int x, int y)
    {
      int t = x+y;
      return t;
    }

Generated Assembly
    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret

Obtain with command
    gcc -O -S code.c
Produces file code.s

Assembly Characteristics

Minimal Data Types
• “Integer” data of 1, 2, or 4 bytes
  – Data values
  – Addresses (untyped pointers)
• Floating point data of 4 or 8 bytes
• No aggregate types such as arrays or structures
  – Just contiguously allocated bytes in memory

Primitive Operations
• Perform arithmetic function on register or memory data
• Transfer data between memory and register
  – Load data from memory into register
  – Store register data into memory
• Transfer control
  – Unconditional jumps to/from procedures
  – Conditional branches

Object Code

Code for sum
    0x401040 <sum>:
        0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3
• Total of 13 bytes
• Each instruction 1, 2, or 3 bytes
• Starts at address 0x401040

Assembler
• Translates .s into .o
• Binary encoding of each instruction
• Nearly-complete image of executable code
• Missing linkages between code in different files

Linker
• Resolves references between files
• Combines with static run-time libraries
  – E.g., code for malloc, printf
• Some libraries are dynamically linked
  – Linking occurs when program begins execution

Machine Instruction Example

C Code
    int t = x+y;
• Add two signed integers

Assembly
    addl 8(%ebp),%eax
  Similar to expression x += y
• Add two 4-byte integers
  – “Long” words in GCC parlance
  – Same instruction whether signed or unsigned
• Operands:
    x: Register %eax
    y: Memory M[%ebp+8]
    t: Register %eax
      » Return function value in %eax

Object Code
    0x401046: 03 45 08
• 3-byte instruction
• Stored at address 0x401046

Disassembling Object Code

Disassembled
    00401040 <_sum>:
       0: 55          push %ebp
       1: 89 e5       mov  %esp,%ebp
       3: 8b 45 0c    mov  0xc(%ebp),%eax
       6: 03 45 08    add  0x8(%ebp),%eax
       9: 89 ec       mov  %ebp,%esp
       b: 5d          pop  %ebp
       c: c3          ret
       d: 8d 76 00    lea  0x0(%esi),%esi

Disassembler
    objdump -d p
• Useful tool for examining object code
• Analyzes bit pattern of series of instructions
• Produces approximate rendition of assembly code
• Can be run on either a.out (complete executable) or .o file

Alternate Disassembly

Object
    0x401040:
        0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x89 0xec 0x5d 0xc3

Disassembled
    0x401040 <sum>:     push %ebp
    0x401041 <sum+1>:   mov  %esp,%ebp
    0x401043 <sum+3>:   mov  0xc(%ebp),%eax
    0x401046 <sum+6>:   add  0x8(%ebp),%eax
    0x401049 <sum+9>:   mov  %ebp,%esp
    0x40104b <sum+11>:  pop  %ebp
    0x40104c <sum+12>:  ret
    0x40104d <sum+13>:  lea  0x0(%esi),%esi

Within gdb Debugger
    gdb p
    disassemble sum
• Disassemble procedure
    x/13b sum
• Examine the 13 bytes starting at sum

What Can be Disassembled?

• Anything that can be interpreted as executable code
• Disassembler
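The Machine Instruction Example slide notes that addl is the "same instruction whether signed or unsigned". A minimal C sketch of why that can work is given here: on a two's-complement machine, adding two 32-bit values yields the same bit pattern whether the operands are read as signed or unsigned. The function sum is the one from the slides; add_bits, same_bits, and check_examples are hypothetical helpers introduced only for this illustration, and the sketch assumes the signed inputs do not overflow (signed overflow is undefined in C).

```c
#include <assert.h>
#include <stdint.h>

/* The function from the "Compiling Into Assembly" slide, unchanged. */
int sum(int x, int y)
{
    int t = x + y;
    return t;
}

/* Hypothetical helper: add two 32-bit patterns as unsigned values.
   Unsigned arithmetic wraps modulo 2^32, which mirrors what a
   32-bit add instruction leaves in the destination register. */
uint32_t add_bits(uint32_t x, uint32_t y)
{
    return x + y;
}

/* Hypothetical helper: returns 1 when the signed sum and the
   unsigned sum of the same operands have identical bit patterns.
   Assumption: x + y does not overflow the signed 32-bit range. */
int same_bits(int32_t x, int32_t y)
{
    int32_t  s = x + y;                           /* signed view   */
    uint32_t u = add_bits((uint32_t)x, (uint32_t)y); /* unsigned view */
    return (uint32_t)s == u;
}

/* Hypothetical self-check that exercises the helpers above. */
void check_examples(void)
{
    assert(sum(3, 4) == 7);
    assert(same_bits(-5, 3));   /* -2 and 0xFFFFFFFE share one pattern */
    assert(same_bits(100, 200));
}
```

Calling check_examples() from a main function should pass silently. The point of the sketch is only that the hardware needs no separate signed and unsigned add; the interpretation of the result is left to the program.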