
EE392C Lecture 1: Introduction to EE392C, Technology Background
Christos Kozyrakis
christos@ee.stanford.edu
http://www.stanford.edu/class/ee392c
EE392c, Spring 2003, C. Kozyrakis

Today's Menu
- Introduction
  - Course focus
  - Basic information
  - Assignments, projects
- Processor technology background
  - General-purpose processors: not a solved problem
  - The vision: a CMP processor with coarse-grain reconfiguration capabilities
  - Contributing technologies

EE392C: Adv. Topics in Computer Architecture
- Focus: next-generation general-purpose processors
  - Single-chip multiprocessors (CMPs) with coarse-grain reconfiguration capabilities
  - aka Polymorphic Processors
  - Topics: architecture, programming language, compilers, operating systems, applications, fault tolerance
- Goals: initiate research in this area
  - Review previous and current work
  - Identify open issues and key opportunities
  - Propose initial approaches and demonstrate their potential
  - If we work hard enough: publish some ideas and results

Basic Information
- Lectures: Tue/Thu 1:15-2:30pm, Room 61A
  - Bring your favorite coffee beverage but not your lunch
- Web page: http://www.stanford.edu/class/ee392c
  - Latest schedule, papers, notes, info, etc.
  - Check regularly
- Communications channels
  - Newsgroup su.class.ee392c for questions, discussion
  - Mailing list ee392c-spr0203-all@lists.stanford.edu for announcements only
  - Email the instructor for more specific issues
- Read the information sheet for details

The EE392C Instructors Team
- Christos Kozyrakis
  - Assistant professor of EE & CS
  - christos@ee.stanford.edu
- Teaching Assistant: Metha Jeeradit
  - Maintains web page, some of the project tools
  - metha@stanford.edu
- Administrative support: Chris Lilly
  - clilly@cs.stanford.edu
- All of you

Your Participation
- Every class meeting
  - Read papers before class meeting
  - Actively participate in the class discussion
- Lead one class meeting
  - 10-minute introduction to the topic
  - Guide the open discussion
- Keep notes of one discussion
- Mini presentation on emerging applications
- Project in groups of 3-4 students
  - Original research on an open issue
  - Includes proposal, presentation, final paper
- Review one final paper from another group

Who Should Take EE392C
- Grad students interested in systems research
  - Architecture, compilers, operating systems
  - Or students in application areas interested in system implications: networking, databases, graphics
  - Diversity of interests and experiences is good
- Prerequisites: officially none
  - Unofficially, one of EE282, CS243, or CS240
  - Talk to instructor for specific questions
- Enrollment limited to 30
  - To allow for interesting round-table discussions

... and now for something completely different

Aren't We Done with Processors Yet?
- Performance improving at 55% per year since 1982
  - Similar improvements in cost
  - Using the same sequential instruction set (x86)
- Contributing technologies
  - Faster transistors (20% per year): Moore's Law
  - Fewer gates per pipeline stage (12% per year): deeper pipelines, better circuit design
  - More instructions per cycle, IPC (23% per year): wide instruction issue, multiple ALUs, caches, speculation

Historical Performance Trend
[Figure: performance (axis from 1 to 1000) by year, 1985 to 2001, for Intel 386, 486, Pentium, Pentium 2, Pentium 3, Pentium 4, and Itanium; Alpha 21064, 21164, and 21264; Sparc, SuperSparc, and Sparc64; Mips; HP PA; PowerPC; AMD K6 and K7] [Horowitz00]

Trouble in Paradise
- Moore's Law good for at least another decade
  - But long wires will slow down compared to transistors
- Pipelining problems
  - Very deep pipelining hurts IPC
  - Already at 8-16 inverters per stage, difficult to go lower
- IPC problems
  - Inherently limited instruction-level parallelism (ILP)
  - Large HW structures for ILP are difficult to clock fast
- Other problems: power consumption, complexity
  - Both translate to higher cost

Deep Pipelining Implications
[Figure: Pentium3 vs. Pentium4] [Grochowski, Intel 1997]
- High clock frequency but modest performance gains
  - Due to memory latency and branch delays
- Power consumption increases dangerously

Current State of IPC
[Figure: SpecInt95/MHz (axis from 0.00 to 0.05), Jan 81 to Jan 96, for 80386, 80486, Pentium, Pentium II, Pentium III, Pentium 4; label "230x"] [Horowitz00]
- IPC 1.5x per generation up to Pentium 3
- Pentium 4 relies on clock frequency, not IPC
- Still, a lot of hardware and power goes to IPC

A Look at the Future
[Agarwal00]
- Superscalar processor scaling with technology
- Goal: maximum clock frequency without hurting IPC
  - Option 1: keep ILP hardware the same but use deeper pipelines
  - Option 2: use smaller ILP hardware to keep pipeline depth the same

Rely Only on IPC?
- Instruction-level parallelism is limited
  - 16-way issue, 512-entry window, optimistic prediction and memory system [Patt99]
  - Branches, data dependencies, memory latency
  - Even optimistic studies predict ILP 10
- Instruction-level parallelism is expensive to exploit

Power Consumption
[Figure: Power Density] [Borkar, Intel 99]
- Despite reducing power supply and transistor size
- Power problems
  - Portable systems: battery life
  - All systems: cost (power distribution, packaging, cooling)

Design Efficiency & Complexity
[Figure: Pentium 3]
- 20% of area for registers, ALUs
  - What the ISA promises to software
- 80% of area: ILP overhead
  - Caches, predictors
  - Critical for performance, but
    - They take up area and burn power
    - Software has little control over them
- Many time-critical global wires
- No design modularity
  - Large design effort
  - Even larger verification effort

Who Cares about Performance?
- Embedded/portable systems
  - Need 10s of GOPs at a few mWatts and few
  - Examples: speech/visual recognition, video processing, graphics, wireless communications
- Server systems
  - Need 100s of GOPS at a few Watts and a few
  - Examples: OLTP, data mining, web servers, video servers, network routing, bio-computing, climate modeling
- Desktop systems are fine
  - But watch out for Mr. Paperclip

It Has to Be a Chip Multiprocessor Design
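
The annual improvement rates quoted earlier in this lecture (55 per year overall, with 20, 12, and 23 per year for the three contributing technologies, read here as percentages) can be compounded with a few lines of Python. This is only an illustrative sketch and not part of the original slides: the function names are hypothetical, the 1982 to 2001 span is an assumption taken from the chart axis, and treating the three contributors as independent multiplicative factors is likewise an assumption.

```python
# Annual improvement rates as quoted in the lecture (read as percentages).
OVERALL_RATE = 0.55  # overall performance growth per year since 1982
CONTRIBUTORS = {
    "faster transistors": 0.20,
    "fewer gates per pipeline stage": 0.12,
    "more instructions per cycle (IPC)": 0.23,
}


def compound(rate: float, years: int) -> float:
    """Return the cumulative improvement factor after `years` at `rate` per year."""
    return (1.0 + rate) ** years


def combined_rate(rates) -> float:
    """Combine annual rates into one rate, assuming they multiply independently."""
    factor = 1.0
    for r in rates:
        factor *= 1.0 + r
    return factor - 1.0


if __name__ == "__main__":
    # Assumption: 1982 through 2001 (the last year on the chart axis) is 19 years.
    years = 2001 - 1982
    print(f"55%/year over {years} years: {compound(OVERALL_RATE, years):,.0f}x")
    print(f"combined contributor rate: {combined_rate(CONTRIBUTORS.values()):.1%}/year")
```

Under the multiplicative assumption the three contributor rates compound to roughly 65% per year, which is above the 55% headline figure. The slides do not reconcile the two numbers, so the breakdown is best read as approximate.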

