Machine-Level Programming I: Introduction
Sept. 7, 2007

  15-213 “The course that gives CMU its Zip!”
  class04.ppt
  15-213, F’07

  Topics
    Assembly Programmer’s Execution Model
    Accessing Information
      Registers
      Memory
    Arithmetic operations

x86 Processors

  Dominate the Desktop, Laptop, and Server Markets
  Evolutionary Design
    Starting in 1978 with 8086
    Added more features as time goes on
    Still support old features, although obsolete
  “Complex Instruction Set Computer” (CISC)
    Many different instructions with many different formats
    But, only small subset encountered with Linux programs
  “Reduced Instruction Set Computers” (RISC) enjoyed a performance advantage during the late ’80s, early ’90s
    Until a CMU alumnus (Bob Colwell) changed that with Pentium Pro
    Since Pentium Pro, x86 has been a performance leader

x86 Evolution: Programmer’s View (Abbreviated)

  Name    Date    Transistors
  8086    1978    29K
    16-bit processor. Basis for IBM PC & DOS
    Limited to 1MB address space. DOS only gives you 640K
  386     1985    275K
    Extended to 32 bits. Added “flat addressing”
    Capable of running Unix
    Referred to as “IA32”
    32-bit Linux/gcc uses no instructions introduced in later models

x86 Evolution: Programmer’s View

  Machine Evolution
    486            1989    1.9M
    Pentium        1993    3.1M
    Pentium/MMX    1997    4.5M
    PentiumPro     1995    6.5M
    Pentium III    1999    8.2M
    Pentium 4      2001    42M
  Added Features
    Instructions to support multimedia operations
      Parallel operations on 1, 2, and 4-byte data, both integer & FP
    Instructions to enable more efficient conditional operations
  Linux/GCC Evolution
    None!
  Watershed design

Itanium: a 64-bit architecture

  Name                   Date    Transistors
  Itanium                2001    10M
    Extends to IA64, a 64-bit architecture
    Radically new instruction set designed for high performance
    Can run existing IA32 programs
      On-board “x86 engine”
    Joint project with Hewlett-Packard
  Itanium 2              2002    221M
    Big performance boost
  Itanium 2 Dual-Core    2006    1.7B
  Itanium has not taken off in marketplace as Intel had originally hoped

x86 Clones

  Advanced Micro Devices (AMD)
    Historically
      AMD has followed just behind Intel
      A little bit slower, a lot cheaper
    Starting in roughly 2001
      Recruited top circuit designers from Digital Equipment Corp. and other downward trending companies
      Exploited fact that Intel was distracted by Itanium
      Started making very competitive products, especially at the high end
      Developed x86-64, its own extension to 64 bits
      Started eating into Intel’s high-end server market

Intel’s Response to AMD’s x86-64

  2004: Intel Announces EM64T extension to IA32
    Extended Memory 64-bit Technology
    Very similar to x86-64
    Our Saltwater fish machines

The Rate of Single-Thread Performance Improvement has Decreased

  [Figure]
  (Figure courtesy of Hennessy & Patterson, “Computer Architecture, A Quantitative Approach”, V4.)

Impact of Power Density on the Microprocessor Industry

  [Figure]
  The future is not higher clock rates, but multiple cores per die.
  Pat Gelsinger, ISSCC 2001

x86 Evolution: Recent History

  Name           Year    Transistors    Clock (GHz)    Power (W)
  Pentium 4      2000    42M            1.7-3.4        65-89
  Pentium M      2003    140M           1.4-2.1        21
  Core Duo       2006    151M           2.3-2.5
  Core 2 Duo     2006    291M           2.6-2.9
  Core 2 Quad    2006    2x291M         2.6-2.9

  (To learn more about parallel processing, take 15-418 in Spring ’08.)

  [Image: Intel Core 2 Duo (Conroe). Copyright © Intel]

Our Coverage

  IA32
    The traditional x86
  x86-64
    The emerging standard
  Presentation
    Book has IA32
    Handout has x86-64
    Lecture will cover both
  Labs
    Lab #2 x86-64
    Lab #3 IA32

Assembly Programmer’s View

  Programmer-Visible State
    PC: Program Counter
      Address of next instruction
      Called “EIP” (IA32) or “RIP” (x86-64)
    Register File
      Heavily used program data
    Condition Codes
      Store status information about most recent arithmetic operation
      Used for conditional branching
  Memory
    Byte addressable array
    Code, user data, (some) OS data
    Includes stack used to support procedures

  [Figure: CPU (PC, Registers, Condition Codes) and Memory (Object Code, Program Data, OS Data, Stack), connected by Addresses, Data, and Instructions]

Turning C into Object Code

  Code in files p1.c p2.c
  Compile with command: gcc -O p1.c p2.c -o p
    Use optimizations (-O)
    Put resulting binary in file p

  text      C program (p1.c p2.c)
              Compiler (gcc -S)
  text      Asm program (p1.s p2.s)
              Assembler (gcc or as)
  binary    Object program (p1.o p2.o), Static libraries (.a)
              Linker (gcc or ld)
  binary    Executable program (p)

Compiling Into Assembly

  C Code
    int sum(int x, int y)
    {
      int t = x+y;
      return t;
    }

  Generated IA32 Assembly
    _sum:
        pushl %ebp
        movl %esp,%ebp
        movl 12(%ebp),%eax
        addl 8(%ebp),%eax
        movl %ebp,%esp
        popl %ebp
        ret

  Obtain with command
    gcc -O -S code.c
  Produces file code.s
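
As a minimal sketch of the workflow described in "Turning C into Object Code" and "Compiling Into Assembly", the file below (named code.c to match the slide) holds the same sum function, with the individual gcc stages listed in comments. The -m32 option is an assumption about a present-day 64-bit setup and is not mentioned in the slides; the assembly a current gcc produces will likely differ in detail from the IA32 listing in the slide.

```c
/* code.c: a sketch for reproducing the slide's experiment.
 * The function is the one from the slide. The commands in the comment
 * below use standard gcc options, but the exact assembly depends on the
 * compiler version and target, so expect differences from the slide. */

/* Same function as on the "Compiling Into Assembly" slide. */
int sum(int x, int y)
{
    int t = x + y;
    return t;
}

/*
 * Stages from "Turning C into Object Code", run one at a time:
 *
 *   gcc -O -S code.c    compiler only: writes text assembly to code.s
 *   gcc -O -c code.c    compile and assemble: writes binary object code.o
 *
 * Assumption (not from the slides): on a 64-bit machine gcc targets
 * x86-64 by default. Adding -m32 requests IA32 code instead, which may
 * require 32-bit support to be installed:
 *
 *   gcc -O -m32 -S code.c
 */
```

Linking into an executable, as in gcc -O p1.c p2.c -o p, additionally needs one source file that defines main; code.c on its own is enough for the -S and -c stages.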