Parallel Computers
Prof. Sin-Min Lee
Department of Computer Science

Parallel Computers 2 — Uniprocessor Systems
Improve performance by allowing multiple, simultaneous memory accesses:
- Requires multiple address, data, and control buses (one set for each simultaneous memory access)
- The memory chip has to be able to handle multiple transfers simultaneously

Parallel Computers 3 — Uniprocessor Systems
Multiport memory:
- Has two sets of address, data, and control pins to allow simultaneous data transfers to occur
- The CPU and a DMA controller can transfer data concurrently
- A system with more than one CPU could handle simultaneous requests from two different processors

Parallel Computers 4 — Uniprocessor Systems
Multiport memory (cont.):
Can:
- Handle two requests to read data from the same location at the same time
Cannot:
- Process two simultaneous requests to write data to the same memory location
- Process simultaneous requests to read from and write to the same memory location

Parallel Computers 5 — Multiprocessors
[Diagram: multiple CPUs, memory, and a device controller with an I/O port and devices, all sharing a common system bus]

Parallel Computers 6 — Multiprocessors
Systems
designed to have 2 to 8 CPUs. The CPUs all share the other parts of the computer:
- Memory
- Disk
- System bus
- etc.
The CPUs communicate via memory and the system bus.

Parallel Computers 7 — Multiprocessors
- Each CPU shares memory, disks, etc.
- Cheaper than clusters
- Not as good performance as clusters
- Often used for small servers and high-end workstations

Parallel Computers 8 — Multiprocessors
The OS automatically shares work among the available CPUs. On a workstation:
- One CPU can be running an engineering design program
- Another CPU can be doing complex graphics formatting

Parallel Computers 9 — Applications of Parallel Computers
- Traditionally: government labs, numerically intensive applications
- Research institutions
- Recent growth in industrial applications (236 of the top 500): financial analysis, drug design and analysis, oil exploration, aerospace and automotive

Parallel Computers 10 — Multiprocessor Systems: Flynn's Classification
Single instruction, multiple data (SIMD):
[Diagram: a control unit connected to main memory and to several processors, each with its own memory, linked by a communications network]
- Executes a single instruction on multiple data values simultaneously using many processors
- Since only one instruction is processed at any given time, it is not necessary for each processor to fetch and decode the instruction
- This task is handled by a single control unit that sends the control signals to each processor
- Example: array processor

Parallel Computers 11 — Why Multiprocessors?
1. Microprocessors are the fastest CPUs
- Collecting several is much easier than redesigning one
2. Complexity of current microprocessors
- Do we have enough ideas to sustain 1.5x/yr?
- Can we deliver such complexity on schedule?
3.
Slow (but steady) improvement in parallel software (scientific apps, databases, OS)
4. Emergence of embedded and server markets driving microprocessors in addition to desktops
- Embedded: functional parallelism, producer/consumer model
- Servers: the figure of merit is tasks per hour vs. latency

Parallel Computers 12 — Parallel Processing Intro
Long-term goal of the field: scale the number of processors to the size of the budget and the desired performance.
Machines today: Sun Enterprise 10000 (8/00)
- 64 400 MHz UltraSPARC II CPUs, 64 GB SDRAM memory, 868 18 GB disks, tape
- $4,720,800 total: 64 CPUs 15%, 64 GB DRAM 11%, disks 55%, cabinet 16% ($10,800 per processor, or ~0.2% per processor)
- Minimal E10K: 1 CPU, 1 GB DRAM, 0 disks, tape, ~$286,700 — $10,800 (4%) per CPU, plus $39,600 per 4-CPU board (~8%/CPU)
Machines today: Dell Workstation 220 (2/01)
- 866 MHz Intel Pentium III (in minitower), 0.125 GB RDRAM memory, one 10 GB disk, 12X CD, 17" monitor, nVIDIA GeForce 2 GTS 32 MB DDR graphics card, 1-year service
- $1,600; for an extra processor, add $350 (~20%)

Parallel Computers 13 — Major MIMD Styles
1. Centralized shared memory ("Uniform Memory Access" time, or "Shared Memory Processor")
2.
Decentralized memory (a memory module with each CPU)
- Gets more memory bandwidth and lower memory latency
- Drawback: longer communication latency
- Drawback: more complex software model

Parallel Computers 14 — Organization of Multiprocessor Systems
Three different ways to organize/classify systems:
- Flynn's Classification
- System topologies
- MIMD system architectures

Parallel Computers 15 — Multiprocessor Systems: Flynn's Classification
Flynn's classification is based on the flow of instructions and data processing. A computer is classified by:
- whether it processes a single instruction at a time or multiple instructions simultaneously
- whether it operates on one or multiple data sets

Parallel Computers 16 — Multiprocessor Systems: Flynn's Classification
Four categories of Flynn's classification:
- SISD: single instruction, single data
- SIMD: single instruction, multiple data
- MISD: multiple instruction, single data **
- MIMD: multiple instruction, multiple data
** The MISD classification is not practical to implement. In fact, no significant MISD computers have ever been built. It is included only for completeness.

Parallel Computers 17
From the beginning of time, computer scientists have been challenging computers with larger and larger problems. Eventually, computer
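The SIMD idea described in the slides — a single control unit decoding one instruction and broadcasting it to many processing elements, each holding its own data — can be sketched in software. This is an illustrative toy model, not from the slides: the `simd_step` function and its instruction names are invented for the example.

```python
# Illustrative sketch (not from the slides): a toy model of SIMD execution.
# A single "control unit" decodes one instruction and broadcasts it to every
# processing element (PE); each PE applies it only to its own local operand,
# so the PEs never fetch or decode instructions themselves.

def simd_step(instruction, pe_data):
    """Decode one instruction once, then broadcast it to all PEs."""
    op = {"add1": lambda x: x + 1, "double": lambda x: x * 2}[instruction]
    return [op(x) for x in pe_data]  # each PE transforms its own value

# Four PEs, each holding one data value (as in an array processor).
data = [10, 20, 30, 40]
data = simd_step("double", data)   # one instruction, four simultaneous results
data = simd_step("add1", data)
print(data)  # [21, 41, 61, 81]
```

The key property the model captures is that the instruction stream is singular: decoding happens once per step, while the data stream is plural.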
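The slides note that on a shared-memory multiprocessor the OS automatically shares work among the available CPUs. A minimal sketch of that idea, assuming Python's standard `multiprocessing` module (the task function `simulate_task` is invented for the example):

```python
# Illustrative sketch (not from the slides): the OS scheduling independent
# tasks across the available CPUs of a shared-memory multiprocessor.
from multiprocessing import Pool

def simulate_task(n):
    # Stand-in for one CPU's job (e.g., a design computation step).
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Pool() starts one worker process per available CPU by default;
    # the OS decides which core runs which worker.
    with Pool() as pool:
        results = pool.map(simulate_task, [10, 100, 1000])
    print(results)  # [285, 328350, 332833500]
```

As on the workstation example in the slides, the programmer only supplies independent pieces of work; the placement of those pieces onto CPUs is handled by the system.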