WMU CS 6260 - Parallel System Interconnections and Communications

Parallel System Interconnections and Communications
Abdullah Algarni
2/26/2009

Outline
- Parallel Architectures
  - SISD
  - SIMD
  - MIMD
    - Shared memory systems
    - Distributed memory machines
- Physical Organization of Parallel Platforms
  - Ideal Parallel Computer
- Interconnection Networks for Parallel Computers
  - Static and Dynamic Interconnection Networks
  - Switches
  - Network interfaces

Outline (cont.)
- Network Topologies
  - Buses
  - Crossbars
  - Multistage Networks
  - Multistage Omega Network
  - Completely Connected Network
  - Linear Arrays
  - Meshes
  - Hypercubes
  - Tree-Based Networks
  - Fat Trees
- Evaluating Interconnection Networks
- Grid Computing

Classification of Parallel Architectures
- SISD: single instruction, single data
  - The classical von Neumann architecture
- SIMD: single instruction, multiple data
- MIMD: multiple instructions, multiple data
  - The most common and most general class of parallel machine

Single Instruction Multiple Data (SIMD)
- Also known as array processors
- A single instruction stream is broadcast to multiple processors, each having its own data stream
- Still used in graphics cards today

Multiple Instructions Multiple Data (MIMD)
- Each processor has its own instruction stream and input data
- MIMD is usually subdivided further by memory organization:
  - Shared memory systems
  - Distributed memory systems

Shared memory systems
- All processes have access to the same address space, e.g. a PC with more than one processor
- Processes exchange data by writing and reading shared variables
- Advantage: shared memory systems are easy to program
  - The current standard in scientific programming is OpenMP
- Two versions of shared memory systems are available today:
  - Symmetric multiprocessors (SMP)
  - Non-uniform memory access (NUMA)

Symmetric multiprocessors (SMPs)
- All processors share the same physical main memory
- Disadvantage: memory bandwidth per processor is limited
- Typical size: 2-32 processors

NUMA architectures (non-uniform memory access)
- More than one memory; some memory is closer to a given processor than other memory
- The whole memory is still addressable from all processors
- Advantage: reduces the memory-bandwidth limitation of SMPs
- Disadvantage: more difficult to program efficiently
- Caches are often used to reduce the effects of non-uniform memory access
- Largest example of this type: the SGI Origin-based Columbia supercomputer with 10,240 processors

Distributed memory machines
- Each processor has its own address space
- Processes communicate by explicit data exchange, using protocols such as:
  - Sockets
  - Message passing
  - Remote procedure call / remote method invocation
- Performance strongly depends on the quality and the topology of the network interconnect
- Two classes of distributed memory machines:
  1) Massively parallel processing systems (MPPs)
  2) Clusters

Physical Organization of Parallel Platforms

Ideal Parallel Computer
- A natural extension of the serial Random Access Machine (RAM) architecture is the Parallel Random Access Machine, or PRAM.
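The shared-memory and distributed-memory styles above can be contrasted in a small sketch (a hypothetical illustration using Python threads as stand-ins for processors; the names `shared_worker`, `distributed_worker`, and `mailbox` are ours, not from the slides). In the shared-memory style the workers cooperate by updating one shared variable, as OpenMP threads do in a single address space; in the message-passing style they keep private state and exchange explicit messages, as processes on a distributed-memory machine would:

```python
import threading
import queue

# Shared-memory style: workers communicate by writing to a shared
# variable, protected by a lock to avoid a race condition.
total = 0
lock = threading.Lock()

def shared_worker(data):
    global total
    partial = sum(data)
    with lock:                  # serialize updates to the shared variable
        total += partial

# Message-passing style: workers share no state and instead send
# explicit messages through a queue (the "network").
mailbox = queue.Queue()

def distributed_worker(data):
    mailbox.put(sum(data))      # explicit data exchange

chunks = [[1, 2, 3], [4, 5, 6]]

threads = [threading.Thread(target=shared_worker, args=(c,)) for c in chunks]
for t in threads: t.start()
for t in threads: t.join()
print(total)                    # 21

threads = [threading.Thread(target=distributed_worker, args=(c,)) for c in chunks]
for t in threads: t.start()
for t in threads: t.join()
received = mailbox.get() + mailbox.get()
print(received)                 # 21
```

Both styles compute the same sum; the difference is whether communication is implicit (shared variables) or explicit (messages), which is exactly the programming-model split the slides describe.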
Ideal Parallel Computer (cont.)
- A PRAM consists of p processors and a global memory of unbounded size that is uniformly accessible to all processors.
- Processors share a common clock but may execute different instructions in each cycle.
- Depending on how simultaneous memory accesses are handled, PRAMs are divided into four subclasses:
  - Exclusive-read, exclusive-write (EREW) PRAM
  - Concurrent-read, exclusive-write (CREW) PRAM
  - Exclusive-read, concurrent-write (ERCW) PRAM
  - Concurrent-read, concurrent-write (CRCW) PRAM
- What does "concurrent write" mean? Common resolution policies:
  - Common: write only if all values are identical
  - Arbitrary: write the data from a randomly selected processor
  - Priority: follow a pre-determined priority order
  - Sum: write the sum of all data items

Physical Complexity of an Ideal Parallel Computer
- Processors and memories are connected via switches.
- Since these switches must operate in O(1) time at the level of words, for a system of p processors and m words the switch complexity is O(mp).

Brain simulation
- Imagine how long it would take to simulate the human brain:
  - The brain contains about 100,000,000,000 (10^11) neurons, and each neuron receives input from about 1,000 others
  - Computing one change of brain "state" therefore requires about 10^14 calculations
  - Even if each calculation took only 1 μs, one state update would take roughly 3 years
- Clearly, with O(mp) complexity for large values of p and m, a true PRAM is not realizable.

Interconnection Networks for Parallel Computers
- Important metrics:
  - Latency: the minimal time to send a message from one processor to another (units: ms, μs)
  - Bandwidth: the amount of data that can be transferred from one processor to another in a given time frame (units: Bytes/s, KB/s, MB/s, GB/s, or bits/s, Kb/s, Mb/s, Gb/s)

Important terms

Static and Dynamic Interconnection Networks
[Figure: classification of interconnection networks into (a) static and (b) dynamic networks]
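A common first-order cost model combines the two metrics above: the time to deliver a message is the startup latency plus the message size divided by the bandwidth. A minimal sketch (the function name and example numbers are ours, chosen only for illustration):

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order model: startup latency plus serialization time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

# A 1 MB message over a link with 10 us latency and 1 GB/s bandwidth:
# the bandwidth term (1 ms) dominates the latency term (10 us).
t_large = transfer_time(1_000_000, 10e-6, 1e9)

# For a tiny 8-byte message the latency term dominates instead.
t_small = transfer_time(8, 10e-6, 1e9)
```

The crossover between the two regimes is why both metrics matter: latency governs fine-grained communication of many small messages, bandwidth governs bulk transfers.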

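The four concurrent-write resolution policies listed earlier (Common, Arbitrary, Priority, Sum) can be sketched as a small dispatcher; this is a hypothetical helper of our own, not something from the slides:

```python
import random

def crcw_write(values, policy, current=None):
    """Resolve simultaneous CRCW writes of `values` to one memory cell.

    `values` are the data items written by the contending processors, in
    processor-priority order (index 0 = highest priority). `current` is
    the cell's existing content. Returns the value stored in the cell.
    """
    if policy == "common":
        # Write succeeds only if all processors write the same value.
        if len(set(values)) == 1:
            return values[0]
        return current                 # conflicting writes leave the cell as-is
    if policy == "arbitrary":
        return random.choice(values)   # a randomly selected processor wins
    if policy == "priority":
        return values[0]               # the highest-priority processor wins
    if policy == "sum":
        return sum(values)             # the cell receives the sum of all items
    raise ValueError(f"unknown policy: {policy}")

print(crcw_write([7, 7, 7], "common"))    # 7
print(crcw_write([1, 2, 3], "priority"))  # 1
print(crcw_write([1, 2, 3], "sum"))       # 6
```

The choice of policy matters for algorithm design: for example, a Sum-CRCW PRAM can add p values in a single step, which weaker models cannot.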

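The brain-simulation arithmetic earlier can be checked directly: 10^11 neurons with roughly 10^3 inputs each gives 10^14 calculations per state update, and at an assumed 1 μs per calculation that is 10^8 seconds, or about three years:

```python
neurons = 100_000_000_000                # ~1e11 neurons in the human brain
inputs_per_neuron = 1_000                # each receives input from ~1000 others
calcs_per_state = neurons * inputs_per_neuron   # 1e14 calculations per update

seconds_per_calc = 1e-6                  # assumption: 1 microsecond per calculation
total_seconds = calcs_per_state * seconds_per_calc   # 1e8 seconds
years = total_seconds / (365 * 24 * 3600)
print(f"{years:.2f} years")              # ~3.17 years for a single state update
```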