RIT EECC 756 - Parallel Computer Architecture

Outline
•Parallel Computer Architecture
•The Need And Feasibility of Parallel Computing
•Scientific Computing Demand
•Scientific Supercomputing Trends
•Raw Uniprocessor Performance: LINPACK
•Raw Parallel Performance: LINPACK
•General Technology Trends
•Slide 8
•Transistor Count Growth Rate
•System Attributes to Performance
•CPU Performance Trends
•Parallelism in Microprocessor VLSI Generations
•The Goal of Parallel Computing
•Elements of Modern Computers
•Slide 15
•Slide 16
•Slide 17
•Approaches to Parallel Programming
•Evolution of Computer Architecture
•Parallel Architectures History
•Programming Models
•Flynn’s 1972 Classification of Computer Architecture
•Flynn’s Classification of Computer Architecture
•Current Trends In Parallel Architectures
•Modern Parallel Architecture Layered Framework
•Shared Address Space Parallel Architectures
•Shared Address Space (SAS) Model
•Models of Shared-Memory Multiprocessors
•Slide 29
•Uniform Memory Access Example: Intel Pentium Pro Quad
•Uniform Memory Access Example: SUN Enterprise
•Distributed Shared-Memory Multiprocessor System Example: Cray T3E
•Message-Passing Multicomputers
•Message-Passing Abstraction
•Message-Passing Example: IBM SP-2
•Message-Passing Example: Intel Paragon
•Message-Passing Programming Tools
•Data Parallel Systems SIMD in Flynn taxonomy
•Dataflow Architectures
•Systolic Architectures

EECC756 - Shaaban   lec # 1   Spring 2002   3-12-2002

Parallel Computer Architecture
•A parallel computer is a collection of processing elements that cooperate to solve large problems fast
•Broad issues involved:
  –Resource Allocation:
    •Number of processing elements (PEs).
    •Computing power of each element.
    •Amount of physical memory used.
  –Data access, Communication and Synchronization
    •How the elements cooperate and communicate.
    •How data is transmitted between processors.
    •Abstractions and primitives for cooperation.
  –Performance and Scalability
    •Performance enhancement of parallelism: Speedup.
    •Scalability of performance to larger systems/problems.

The Need And Feasibility of Parallel Computing
•Application demands: More computing cycles:
  –Scientific computing: CFD, Biology, Chemistry, Physics, ...
  –General-purpose computing: Video, Graphics, CAD, Databases, Transaction Processing, Gaming…
  –Mainstream multithreaded programs are similar to parallel programs
•Technology Trends
  –Number of transistors on chip growing rapidly
  –Clock rates expected to go up but only slowly
•Architecture Trends
  –Instruction-level parallelism is valuable but limited
  –Coarser-level parallelism, as in MPs, the most viable approach
•Economics:
  –Today’s microprocessors have multiprocessor support eliminating the need for designing expensive custom PEs
  –Lower parallel system cost.
  –Multiprocessor systems to offer a cost-effective replacement of uniprocessor systems in mainstream computing.

Scientific Computing Demand
[Figure]

Scientific Supercomputing Trends
•Proving ground and driver for innovative architecture and advanced techniques:
  –Market is much smaller relative to commercial segment
  –Dominated by vector machines starting in 70s
  –Meanwhile, microprocessors have made huge gains in floating-point performance
    •High clock rates.
    •Pipelined floating point units.
    •Instruction-level parallelism.
    •Effective use of caches.
•Large-scale multiprocessors replace vector supercomputers
  –Well under way already

Raw Uniprocessor Performance: LINPACK
[Figure: LINPACK (MFLOPS), log scale from 1 to 10,000, by year from 1975 to 2000. Series: CRAY n = 100, CRAY n = 1,000, Micro n = 100, Micro n = 1,000. Labeled systems: CRAY 1s, Xmp/14se, Xmp/416, Ymp, C90, T94, DEC 8200, IBM Power2/990, MIPS R4400, HP9000/735, DEC Alpha, DEC Alpha AXP, HP 9000/750, IBM RS6000/540, MIPS M/2000, MIPS M/120, Sun 4/260.]

Raw Parallel Performance: LINPACK
[Figure: LINPACK (GFLOPS), log scale from 0.1 to 10,000, by year from 1985 to 1996. Series: CRAY peak, MPP peak. Labeled systems: Xmp/416(4), Ymp/832(8), nCUBE/2(1024), iPSC/860, CM-2, CM-200, Delta, Paragon XP/S, C90(16), CM-5, ASCI Red, T932(32), T3D, Paragon XP/S MP(1024), Paragon XP/S MP(6768).]

General Technology Trends
•Microprocessor performance increases 50% - 100% per year
•Transistor count doubles every 3 years
•DRAM size quadruples every 3 years
[Figure: Integer and FP performance, scale from 0 to 180, by year from 1987 to 1992. Labeled systems: Sun 4/260, MIPS M/120, IBM RS6000/540, MIPS M/2000, HP 9000/750, DEC Alpha.]

Clock Frequency Growth Rate
[Figure: Clock rate (MHz), log scale from 0.1 to 1,000, by year from 1970 to 2005. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, Pentium100, R10000.]
•Currently increasing 30% per year

Transistor Count Growth Rate
[Figure: Transistors, log scale from 1,000 to 100,000,000, by year from 1970 to 2005. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, R2000, Pentium, R10000, R3000.]
•100 million transistors on chip by early 2000’s A.D.
•Transistor count grows much faster than clock rate
  –Currently 40% per year

System Attributes to Performance
•Performance benchmarking is program-mix dependent.
•Ideal performance requires a perfect machine/program match.
•Performance measures:
  –Cycles per instruction (CPI)
  –Total CPU time = T = C x τ = C …
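
The performance measures listed under System Attributes to Performance relate total CPU time T to the total cycle count C, but the line is cut off. As a minimal sketch only, the Python below assumes the usual relation that cycle time is the reciprocal of the clock frequency, and that the total cycle count is the instruction count times the average CPI. The function names, the instruction-count parameter, and the sample numbers are illustrative assumptions and do not come from the lecture.

```python
def total_cycles(instruction_count: float, cpi: float) -> float:
    """Total cycles C, assumed to be instruction count times average CPI.

    Hypothetical helper: the instruction-count parameter is an assumption,
    not something named in the slide text.
    """
    if instruction_count < 0 or cpi < 0:
        raise ValueError("instruction_count and cpi must be non-negative")
    return instruction_count * cpi


def cpu_time_seconds(cycles: float, clock_hz: float) -> float:
    """Total CPU time T = C x cycle time, assuming cycle time = 1 / clock_hz."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    cycle_time = 1.0 / clock_hz  # seconds per clock cycle
    return cycles * cycle_time


if __name__ == "__main__":
    # Illustrative numbers only: 2,000,000 instructions, CPI 1.5, 100 MHz clock.
    cycles = total_cycles(instruction_count=2_000_000, cpi=1.5)
    print(f"cycles = {cycles:.0f}")
    print(f"CPU time = {cpu_time_seconds(cycles, clock_hz=100e6):.4f} s")
```

With these sample values, 2,000,000 instructions at an average CPI of 1.5 come to 3,000,000 cycles, which at 100 MHz is about 0.03 seconds. Halving the CPI or doubling the clock rate halves the time, which is why the slide lists both CPI and total CPU time as performance measures.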

