CS 147 – Parallel Processing: Multiprocessing
Sophia Soohoo

Outline
- Types of multiprocessing – Symmetric
- Multiprocessing classification – Michael J. Flynn
- Flynn's taxonomy – classification of computer architectures
- SISD – Single Instruction, Single Data
- SIMD – Single Instruction, Multiple Data
- MISD – Multiple Instruction, Single Data
- MIMD – Multiple Instruction, Multiple Data
- Mapping data and instruction set to memory
- Parallel computer memory architecture – Shared memory
- MIMD styles – Uniform Memory Access (UMA)
- MIMD styles – Non-Uniform Memory Access (NUMA)
- Distributed memory
- Hybrid distributed-shared memory
- References

Multiprocessing
- The use of two or more central processing units in a single computer system
- The CPUs share the other components of the computer: memory, disk, and the system bus

Types of multiprocessing – Symmetric
- More than one processor shares the memory capacity and data path protocol
- Only one copy of the operating system is used to initiate all the orders executed by the processors involved in the connection
- Each CPU can act independently
- All CPUs can be equal, or some processors can be reserved for particular uses
- Drawback: a bottleneck caused by the bandwidth of the memory bus connecting the processors, the memory, and the disk arrays

Multiprocessing classification – Michael J. Flynn
- Professor at Stanford University
- Received his PhD from Purdue University
- Worked for 10 years in computer organization and design
- Proposed Flynn's taxonomy in 1966

Flynn's taxonomy – classification of computer architectures
- Flynn's taxonomy distinguishes multiprocessor computer architectures according to how they can be classified along the two independent dimensions of instruction and data:

                    Single instruction   Multiple instruction
    Single data     SISD                 MISD
    Multiple data   SIMD                 MIMD

- SISD – single instruction, single data
- SIMD – single instruction, multiple data
- MISD – multiple instruction, single data
- MIMD – multiple instruction, multiple data

SISD – Single Instruction, Single Data
- A serial (non-parallel) computer
- Single instruction: only one instruction stream is being acted on by the CPU during any one clock cycle
- The oldest classification
- Modern-day uses: older mainframes, minicomputers, workstations, PCs

SIMD – Single Instruction, Multiple Data
- A type of parallel computer
- Single instruction: all CPUs execute the same instruction at any given clock cycle
- Multiple data: each CPU can operate on a different data element
- Synchronous (lockstep): since only one instruction is processed at a time, it is not necessary for each CPU to fetch and decode its own instruction
- Types: processor arrays and vector pipelines
- Uses: computers with GPUs

MISD – Multiple Instruction, Single Data
- A single data stream is fed into the CPUs
- Each CPU operates on the data independently through its own instruction stream
- Advantage: redundancy/failsafe; multiple CPUs perform the same task on the same data, which reduces the chance of incorrect results if a single CPU fails
- Disadvantage: expensive
- Uses: array processors

MIMD – Multiple Instruction, Multiple Data
- The most common type of parallel computing
- Multiple instruction: every processor may be executing a different instruction stream
- Multiple data: every CPU can work with a different data stream
- Execution can be synchronous or asynchronous
- Examples: supercomputers, multiprocessor SMPs

Mapping data and instruction set to memory
- The model is divided into 3 main types of memory architectures: shared memory, distributed memory, and distributed shared memory
- Flynn's categories, by instruction and data streams:
  - SISD (single instruction stream, single data stream): uniprocessors
  - SIMD (single instruction stream, multiple data streams): array or vector processors
  - MISD (multiple instruction streams, single data stream): rarely used
  - MIMD (multiple instruction streams, multiple data streams): multiprocessors or multicomputers
- Johnson's expansion subdivides MIMD by memory (global vs. distributed) and communication (shared variables vs. message passing):
  - GMSV (global memory, shared variables): shared-memory multiprocessors
  - GMMP (global memory, message passing): rarely used
  - DMSV (distributed memory, shared variables): distributed shared memory
  - DMMP (distributed memory, message passing): distributed-memory multicomputers

Parallel computer memory architecture – Shared memory
- All processors can access all memory as a global address space
- Multiple CPUs can operate independently but share the same memory resources
- Changes made to a memory location by one CPU are visible to all other CPUs
- Divided into 2 main classes: UMA and NUMA

MIMD styles – Uniform Memory Access (UMA)
- All CPUs share the physical memory uniformly
- Access time is independent of which CPU makes the request or which memory chip contains the transferred data
- Each CPU has a private cache
- Identical processors
- Cache coherent: if one processor updates a location in shared memory, all other processors know about the update
- In the UMA memory architecture, all processors access shared memory through a bus (or another type of interconnect)
- Used in multiprocessors

MIMD styles – Non-Uniform Memory Access (NUMA)
- Provides separate memory for each CPU, avoiding the performance hit when several CPUs attempt to address the same memory
- Provides a performance benefit over a single shared memory by a factor of roughly the number of processors
- Memory access time depends on the memory location relative to the processor
- A processor can access its own local memory faster than non-local memory

Shared memory – advantages and disadvantages
- Advantages:
  - The global address space provides a user-friendly programming perspective on memory
  - Data sharing between tasks is fast and uniform due to the proximity of memory to the CPUs
- Disadvantages:
  - Lack of scalability between memory and CPUs: adding more CPUs increases the traffic on the shared memory-CPU path
  - The programmer is responsible for the synchronization constructs that ensure "correct" access to global memory
  - Shared memory machines are expensive to design and produce

NUMA – Non-Uniform Memory Access
- Memory access time varies with the location of the data to be accessed: if the data resides in local memory, access is fast; if it resides in remote memory, access is slower.
NUMA – Non-Uniform Memory Access
- The advantage of the NUMA architecture as a hierarchical shared memory scheme is its potential to improve average-case access time through the introduction of fast, local memory.

Distributed Memory
- Requires a communication network to connect inter-processor memory
- Each CPU has its own distributed memory
- Memory in one CPU does not map to another; each processor sees only its own memory
- There is no concept of a global address space
- When a processor needs to access data in another CPU's memory, the programmer must define how and when the data is communicated

Hybrid Distributed-Shared Memory
- A combination of both shared and distributed memory
- The shared memory component is usually a cache-coherent SMP machine
- The distributed memory component is the networking of multiple SMPs
- A network is required to move data from one SMP to another