U of I CS 433 - Computer System Organization

This preview shows page 1-2-3-20-21-40-41-42 out of 42 pages.


CS 433U: Computer System Organization
Major Performance Devices We Will Study
Static versus Dynamic Mechanisms
Dimensions of Performance
Latency and Throughput at the Application Level
How to Know When Throughput is Maximized
Maximizing Throughput
How to Know When Latency is Minimized
Determining Minimum Latency
Instruction Set Architecture (ISA)
Classifying Instruction Sets
Classifying Instruction Sets
Stack Operations
Accumulator – Register Operations
2 Operand Reg – Reg Operations
3 Operand Reg – Reg Operations
3 Operand Reg – Reg Operations with Multiple Register Banks
2 Reg Banks With Destination Operand Routed to Either
Operand Fields
Encoding Immediate Data
Trading 3-Operand Form Away In Favor of Immediate Bits
2 vs 3 Operand Instructions
Load (Register Direct)
Store (Register Direct)
Load Register + Offset
Store Reg + Offset
Load with Post Increment
Store with Post Increment
Load with Predecrement
Load Indexed
Load Indexed w/ PostIncrement
Load Circular
Non Load/Store with Memory Operand (Reg + Offset)
Operand Mode Selection
Addressing Mode Selection for Load Instructions
Integrated Memory Operands
Encoding Vector Operations: Register Pairs (Tuples)
Encoding Vector Operations: Dual Register Files
CA:AQA on Operand Modes and Other Architectural Complexities
Design of Conditional Test / Branch Mechanisms
Holding Conditional Test Outcomes
Conditional Branching

CS 433U: Computer System Organization
Luddy Harrison
Lecture 2
• Dimensions of Performance
• Instruction Set Encoding
• Types for Signal and Media Processing
• Conditional Test / Branch Architecture
8/29/2005 CS 433 2005 Luddy Harrison

Major Performance Devices We Will Study
• Pipelining (App A)
    • Breaking work into stages
    • Can be a whole instruction, or a memory reference, or an arithmetic operation
• Instruction Level Parallelism (ILP)
    • Doing more than one instruction at the same time
    • Static (Ch 4)
        • Compiler or human arranges instructions into lines
    • Dynamic (Ch 3)
        • Hardware control logic finds parallelism
• Vector and Media Processing (not well covered by book)
• Memory Hierarchy (Ch 5)
    • Caching
    • Banked memory / wide memory
• Multiprocessing / multithreading
    • Multiple processors doing independent work simultaneously
    • Multiple contexts (threads) for latency hiding / throughput maximization

Static versus Dynamic Mechanisms

                              Dynamic / Hardware            Static / Software
ILP                           Superscalar                   VLIW
Memory Access Optimization    Caching                       DMA, Wide Memory Accesses
Multithreading                Simultaneous Multithreading   Explicit Multithreading

Dimensions of Performance
• Latency
    • The time to get one thing done
    • Measured in units of time
• Throughput
    • The number of things that can be done per unit time
    • Measured in frequency (N / time)
• These are the two fundamental measures of performance
• Formula 1 Car vs. Freight Train

Latency and Throughput at the Application Level
• Web Browsing
• Packet Processing
• Voice over IP
• Factory Control
• Airline Reservations
• Computer Games
• MP3 Compression (Music Distribution)
• MP3 Decompression (Music Playback)
• Audio Recording
• Economic Modeling

How to Know When Throughput is Maximized
• One architectural feature / unit is saturated with useful work
    • E.g., the multiply pipeline is producing a required result every cycle
    • E.g., the memory is performing one load and one store per cycle of needed data, and that is its peak bandwidth
• This rule is remarkably easy to apply even to complex processor architectures.

Maximizing Throughput

    REPEAT 128 TIMES:
        A0 += R1 * R2 || R1 = LOAD(P1++) || R2 = LOAD(P2++)

If the machine
• Can issue / retire one such instruction line per cycle
• Has only one multiplier OR can load at most two words per cycle
Then throughput is maximized (assuming that this computation is useful).

This instruction line causes the machine to simultaneously
• Multiply R1 by R2 and add it to A0
• Load R1 and R2 from addresses pointed to by P1 and P2
• Update P1 and P2 by incrementing them

How to Know When Latency is Minimized
• Isolate critical path(s)
    • This is a function of the dependence structure
• Determine min end-to-end time for each critical path
• The worst among these is the minimum latency
    • And this assumes all critical paths can be performed simultaneously
• This is inherently more difficult than determining if throughput is maximized because it is a function of all the resources required by every critical path

Determining Minimum Latency

Instruction Set Architecture (ISA)
• An ISA becomes an important interface to
    • Compilers
    • Assemblers
    • Linkers
• … and it will live on in the form of
    • Libraries
    • Firmware
    • OS code
    • Device drivers
• For this reason, ISAs outlive individual processor models.
• A successful ISA will have numerous embodiments
    • MIPS (R3000, R4000, R4300, R10000, etc.)
    • PowerPC (601, 603, …) and Power (Power3, Power4, Power5, Power6, …)
    • X86 (8086, 80386, 80486, P5, P6, AMD!, …) Without a doubt the most successful ISA of all time by this measure
    • 68000, 68010, 68020, 68030, 68040, …
• The ISA will be
    • Modified (original behavior altered in later embodiments)
    • Extended (unused encodings will become used)
• ISA design and processor design are related but distinct activities!

Classifying Instruction Sets
• Stack
    • 0 operands
• Accumulator
    • 1 operand
• Register-Memory
    • 2 operands: one register, one memory
• Register-Register / Load-Store
    • 2 operands in registers
    • For LOAD, one operand is address, other is dest data
    • For STORE, one operand is address, other is source data

Classifying Instruction Sets
• The reality of modern processors does not fit this taxonomy very well, to say the least
    • The Register-Register / Load-Store instruction set is real (e.g., MIPS, Alpha, Sparc)
    • The others do not occur in pure form in modern processor architectures
• Think of these as categories of instructions or operations rather than categories of machines
    • A machine with Register-Register instructions may also have Stack instructions and Accumulator instructions
    • A machine may have instructions with multiple operations; the operations may be encoded separately within the insn and may use different formats or styles.

Stack Operations
Instruction
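As a rough illustration of the Maximizing Throughput slide, the instruction line A0 += R1 * R2 || R1 = LOAD(P1++) || R2 = LOAD(P2++) can be simulated in Python. This is only a sketch under stated assumptions: the function name run_mac_lines and the sample data are hypothetical, pointers are modeled as list indices, registers start at zero, and all three operations in a line are assumed to read their inputs before any of them writes a result (so the multiply sees the values loaded by the previous line).

```python
def run_mac_lines(x, y, repeats):
    """Simulate `A0 += R1 * R2 || R1 = LOAD(P1++) || R2 = LOAD(P2++)`.

    Assumption (not stated in the slides): every operation in a line reads
    its inputs before any operation in that line writes its result.
    """
    a0 = 0
    r1 = r2 = 0            # assumed initial register contents
    p1 = p2 = 0            # pointers modeled as list indices
    cycles = multiplies = loads = 0
    for _ in range(repeats):
        # Read phase: all three operations read their inputs.
        product = r1 * r2
        loaded1, loaded2 = x[p1], y[p2]
        # Write phase: results are committed together.
        a0 += product
        r1, r2 = loaded1, loaded2
        p1 += 1
        p2 += 1
        # One instruction line issues per cycle in this model.
        cycles += 1
        multiplies += 1
        loads += 2
    return {"A0": a0, "R1": r1, "R2": r2,
            "cycles": cycles, "multiplies": multiplies, "loads": loads}


x = [1, 2, 3, 4]
y = [5, 6, 7, 8]
state = run_mac_lines(x, y, repeats=len(x))

# Utilization of a single multiplier and of a two-word-per-cycle load path.
multiplier_utilization = state["multiplies"] / state["cycles"]
load_utilization = state["loads"] / (2 * state["cycles"])

# The last pair loaded has not been multiplied yet; one more MAC drains it.
dot = state["A0"] + state["R1"] * state["R2"]
print(multiplier_utilization, load_utilization, dot)
```

In this model both the multiplier and the load path do useful work on every cycle, which is the saturation condition the slide describes. The first line multiplies the assumed zero-initialized registers, so a real loop would need a prologue and epilogue around the repeated line.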

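The procedure on the How to Know When Latency is Minimized slide (isolate the critical paths through the dependence structure and take the worst end-to-end time) can be sketched as a longest-path computation over a dependence graph. The operation names, latencies, and the helper min_latency below are all hypothetical, and, as the slide notes, the result assumes that every path can proceed simultaneously with no resource conflicts.

```python
from functools import lru_cache


def min_latency(latency, deps):
    """Return the length of the longest dependence chain.

    latency: dict mapping operation name -> cycles it takes (hypothetical).
    deps:    dict mapping operation name -> list of operations it waits for.
    Assumes unlimited resources, so independent paths run in parallel.
    """
    @lru_cache(maxsize=None)
    def finish(op):
        # An operation starts when its slowest predecessor has finished.
        start = max((finish(d) for d in deps.get(op, ())), default=0)
        return start + latency[op]

    return max(finish(op) for op in latency)


# Hypothetical dependence structure for computing c = a * b + 1 and storing it.
latency = {"load_a": 2, "load_b": 2, "mul": 3, "add": 1, "store": 1}
deps = {"mul": ["load_a", "load_b"], "add": ["mul"], "store": ["add"]}

print(min_latency(latency, deps))  # 2 + 3 + 1 + 1 = 7 cycles
```

The two loads overlap, so only one of them contributes to the critical path. If the machine could issue only one load at a time, the true minimum latency would be higher, which is why the slide calls this analysis harder than the throughput check.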

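To make the categories on the Classifying Instruction Sets slides concrete, the following sketch evaluates a = b + c in three styles: stack (the add names 0 operands), accumulator (1 operand), and register-register / load-store (2 register operands). The tiny interpreters, opcode names, and register names are hypothetical and are not taken from the lecture; note that even in the stack style the push and pop operations still carry a memory address.

```python
def run_stack(program, memory):
    """Stack style: arithmetic names no operands and works on the stack top."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "pop":
            memory[args[0]] = stack.pop()
    return memory


def run_accumulator(program, memory):
    """Accumulator style: each instruction names one memory operand."""
    acc = 0
    for op, addr in program:
        if op == "load":
            acc = memory[addr]
        elif op == "add":
            acc += memory[addr]
        elif op == "store":
            memory[addr] = acc
    return memory


def run_load_store(program, memory):
    """Load-store style: arithmetic touches registers only (2-operand form)."""
    regs = {}
    for op, first, second in program:
        if op == "load":        # first = destination register, second = address
            regs[first] = memory[second]
        elif op == "add":       # first = first + second
            regs[first] = regs[first] + regs[second]
        elif op == "store":     # first = source register, second = address
            memory[second] = regs[first]
    return memory


stack_result = run_stack(
    [("push", "b"), ("push", "c"), ("add",), ("pop", "a")],
    {"a": 0, "b": 2, "c": 3})
acc_result = run_accumulator(
    [("load", "b"), ("add", "c"), ("store", "a")],
    {"a": 0, "b": 2, "c": 3})
ls_result = run_load_store(
    [("load", "r1", "b"), ("load", "r2", "c"),
     ("add", "r1", "r2"), ("store", "r1", "a")],
    {"a": 0, "b": 2, "c": 3})
print(stack_result["a"], acc_result["a"], ls_result["a"])
```

All three programs compute the same value; they differ in how many operands each instruction must encode and in how many instructions are needed.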