RIT EECC 756 - Parallel Computer Architecture


EECC756 - Shaaban, lec # 1, Spring 2000, 3-7-2000

#1 Parallel Computer Architecture
• A parallel computer is a collection of processing elements that cooperate to solve large problems fast.
• Broad issues involved:
  – Resource Allocation:
    • Number of processing elements (PEs).
    • Computing power of each element.
    • Amount of physical memory used.
  – Data access, Communication and Synchronization:
    • How the elements cooperate and communicate.
    • How data is transmitted between processors.
    • Abstractions and primitives for cooperation.
  – Performance and Scalability:
    • Performance enhancement of parallelism: Speedup.
    • Scalability of performance to larger systems/problems.

#2 The Need and Feasibility of Parallel Computing
• Application demands: More computing cycles:
  – Scientific computing: CFD, Biology, Chemistry, Physics, ...
  – General-purpose computing: Video, Graphics, CAD, Databases, Transaction Processing, Gaming…
  – Mainstream multithreaded programs are similar to parallel programs.
• Technology Trends:
  – Number of transistors on chip growing rapidly.
  – Clock rates expected to go up, but only slowly.
• Architecture Trends:
  – Instruction-level parallelism is valuable but limited.
  – Coarser-level parallelism, as in MPs, is the most viable approach.
• Economics:
  – Today’s microprocessors have multiprocessor support, eliminating the need for designing expensive custom PEs.
  – Lower parallel system cost.
  – Multiprocessor systems to offer a cost-effective replacement of uniprocessor systems in mainstream computing.

#3 Scientific Computing Demand
[Figure: Scientific Computing Demand]

#4 Scientific Supercomputing Trends
• Proving ground and driver for innovative architecture and advanced techniques:
  – Market is much smaller relative to commercial segment.
  – Dominated by vector machines starting in 70s.
  – Meanwhile, microprocessors have made huge gains in floating-point performance:
    • High clock rates.
    • Pipelined floating point units.
    • Instruction-level parallelism.
    • Effective use of caches.
• Large-scale multiprocessors replace vector supercomputers:
  – Well under way already.

#5 Raw Uniprocessor Performance: LINPACK
[Figure: LINPACK (MFLOPS), log scale from 1 to 10,000, versus year, 1975 to 2000. Series: CRAY n = 100, CRAY n = 1,000, Micro n = 100, Micro n = 1,000. Labeled machines: CRAY 1s, Xmp/14se, Xmp/416, Ymp, C90, T94, DEC 8200, IBM Power2/990, MIPS R4400, HP9000/735, DEC Alpha, DEC Alpha AXP, HP 9000/750, IBM RS6000/540, MIPS M/2000, MIPS M/120, Sun 4/260.]

#6 Raw Parallel Performance: LINPACK
[Figure: LINPACK (GFLOPS), log scale from 0.1 to 10,000, versus year, 1985 to 1996. Series: CRAY peak, MPP peak. Labeled machines: Xmp/416(4), Ymp/832(8), nCUBE/2(1024), iPSC/860, CM-2, CM-200, Delta, Paragon XP/S, C90(16), CM-5, ASCI Red, T932(32), T3D, Paragon XP/S MP(1024), Paragon XP/S MP(6768).]

#7 General Technology Trends
• Microprocessor performance increases 50% - 100% per year.
• Transistor count doubles every 3 years.
• DRAM size quadruples every 3 years.
[Figure: Integer and FP performance, scale from 0 to 180, versus year, 1987 to 1992. Labeled machines: Sun 4/260, MIPS M/120, IBM RS6000/540, MIPS M2000, HP 9000/750, DEC alpha.]

#8 Clock Frequency Growth Rate
[Figure: Clock rate (MHz), log scale from 0.1 to 1,000, versus year, 1970 to 2005. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, Pentium100, R10000.]
• Currently increasing 30% per year.

#9 Transistor Count Growth Rate
[Figure: Transistors, log scale from 1,000 to 100,000,000, versus year, 1970 to 2005. Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, R10000.]
• 100 million transistors on chip by early 2000’s A.D.
• Transistor count grows much faster than clock rate:
  – Currently 40% per year.

#10 System Attributes to Performance
• Performance benchmarking is program-mix dependent.
• Ideal performance requires a perfect machine/program match.
• Performance measures:
  – Cycles per instruction (CPI)
  – Total CPU time: $T = C \times \tau = C / f = I_c \times CPI \times \tau = I_c \times (p + m \times k) \times \tau$
    $I_c$ = Instruction count
    $\tau$ = CPU cycle time
    $p$ = Instruction decode cycles
    $m$ = Memory cycles
    $k$ = Ratio between memory/processor cycles
    $C$ = Total program clock cycles
    $f$ = clock rate
  – MIPS Rate $= I_c / (T \times 10^6) = f / (CPI \times 10^6) = f \times I_c / (C \times 10^6)$
  – Throughput Rate: $W_p = f / (I_c \times CPI) = (MIPS) \times 10^6 / I_c$
• Performance factors ($I_c$, $p$, $m$, $k$, $\tau$) are influenced by: instruction-set architecture, compiler design, CPU implementation and control, cache and memory hierarchy.

#11 CPU Performance Trends
[Figure: Performance, log scale from 0.1 to 100, versus year, 1965 to 1995. Series: Supercomputers, Minicomputers, Mainframes, Microprocessors.]
The microprocessor is currently the most natural building block for multiprocessor systems in terms of cost and performance.

#12 Parallelism in Microprocessor VLSI Generations
[Figure: Transistors, log scale from 1,000 to 100,000,000, versus year, 1970 to 2005, divided into eras: Bit-level parallelism, Instruction-level, Thread-level (?). Labeled processors: i4004, i8008, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, R10000.]

#13 The Goal of Parallel Computing
• Goal of applications in using parallel machines: Speedup
  Speedup (p processors) =
• For a fixed problem size (input
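
The CPU performance measures listed under System Attributes to Performance can be checked numerically. The sketch below is only an illustration: the function names and the sample values (a 100 MHz clock, one million instructions, p = 2, m = 0.5, k = 4) are assumptions chosen for the example and do not come from the lecture.

```python
# Minimal sketch of the CPI, CPU time, MIPS and throughput relations.
# All function names and sample numbers are hypothetical illustrations.

def cpi(p, m, k):
    """Cycles per instruction: CPI = p + m * k."""
    return p + m * k

def cpu_time(ic, cpi_value, f):
    """Total CPU time in seconds: T = Ic * CPI * tau, with tau = 1 / f."""
    return ic * cpi_value / f

def mips_rate(ic, t):
    """MIPS rate: Ic / (T * 10^6)."""
    return ic / (t * 1e6)

def throughput(f, ic, cpi_value):
    """Throughput rate in programs per second: Wp = f / (Ic * CPI)."""
    return f / (ic * cpi_value)

if __name__ == "__main__":
    f = 100e6          # assumed clock rate in Hz
    ic = 1_000_000     # assumed instruction count
    c = cpi(p=2, m=0.5, k=4)   # assumed decode cycles, memory cycles, ratio
    t = cpu_time(ic, c, f)
    print("CPI        :", c)
    print("CPU time   :", t, "s")
    print("MIPS rate  :", mips_rate(ic, t))
    print("Throughput :", throughput(f, ic, c), "programs/s")
```

With these assumed inputs, the MIPS rate computed from the CPU time agrees with f / (CPI x 10^6), and the throughput equals 1 / T, as the chain of equalities on the slide implies.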

