RIT EECC 756 - Parallel Computer Architecture

This preview shows page 1-2-3-4-5-6-45-46-47-48-49-50-51-91-92-93-94-95-96 out of 96 pages.

Parallel Computer Architecture
The Goal of Parallel Computing
Elements of Modern Computers
Approaches to Parallel Programming
Evolution of Computer Architecture
Programming Models
Flynn’s 1972 Classification of Computer Architecture
Current Trends In Parallel Architectures
Models of Shared-Memory Multiprocessors
Slide 10
Message-Passing Multicomputers
Convergence: Generic Parallel Architecture
Fundamental Design Issues
Synchronization
Communication Cost Model
Conditions of Parallelism: Data Dependence
Conditions of Parallelism: Data Dependence
Data and I/O Dependence: Examples
Conditions of Parallelism
Bernstein’s Conditions: An Example
Theoretical Models of Parallel Computers
Example: sum algorithm on P processor PRAM
Example: Sum Algorithm on P Processor PRAM
Example: Asynchronous Matrix Vector Product on a Ring
Levels of Parallelism in Program Execution
Limited Concurrency: Amdahl’s Law
Parallel Performance Metrics Degree of Parallelism (DOP)
Example: Concurrency Profile of A Divide-and-Conquer Algorithm
Steps in Creating a Parallel Program
Summary of Parallel Algorithms Analysis
Summary of Tradeoffs
Generic Message-Passing Routines
Blocking send( ) and recv( ) System Calls
Non-blocking send( ) and recv( ) System Calls
Message-Passing Computing Examples
Synchronous Iteration
Barriers
Message-Passing Local Synchronization
Network Characteristics
Sample Static Network Topologies
Static Connection Networks Examples: 2D Mesh
Static Connection Networks Examples: Hypercubes
Message Routing Functions Example
Embeddings In Two Dimensions
Dynamic Connection Networks
Dynamic Networks Definitions
Permutations
Perfect Shuffle
Multi-Stage Networks: The Omega Network
Shared Memory Multiprocessors
Shared Memory Multiprocessors Variations
Caches And Cache Coherence In Shared Memory Multiprocessors
Shared Memory Access Consistency Models
Sequential Consistency (SC) Model
Slide 55
Further Interpretation of SC
Weak (Release) Consistency (WC)
TSO Weak Consistency Model
Cache Coherence Using A Bus
Write-invalidate Snoopy Bus Protocol For Write-Through Caches
Write-invalidate Snoopy Bus Protocol For Write-Back Caches
MESI State Transition Diagram
Parallel System Performance: Evaluation & Scalability
Parallel Performance Metrics Revisited
Parallel Performance Metrics Revisited
Harmonic Mean Performance
Efficiency, Utilization, Redundancy, Quality of Parallelism
Parallel Performance Metrics Revisited: Amdahl’s Law
The Isoefficiency Concept
Speedup Performance Laws: Fixed-Workload Speedup
Amdahl’s Law for Fixed-Load Speedup
Fixed-Time Speedup
Gustafson’s Fixed-Time Speedup
Fixed-Memory Speedup
Scalability Metrics
Parallel Scalability Metrics
Parallel System Scalability
MPPs Scalability Issues
Cost Scaling
Scalable Distributed Memory Machines
Generic Distributed Memory Organization
Network Latency Scaling Example
Physical Scaling
Spectrum of Designs
Scalable Cache Coherent Systems
Scalable Cache Coherence
Approach #1: Hierarchical Snooping
Hierarchical Snoopy Cache Coherence
Scalable Approach #2: Directories
Organizing Directories
Flat, Memory-based Directory Schemes
Flat, Cache-based Schemes
Approach #3: A Popular Middle Ground
Example Two-level Hierarchies
Advantages of Multiprocessor Nodes
Disadvantages of Coherent MP Nodes

EECC756 - Shaaban  #1 Exam Review Spring 2000 5-4-2000

Parallel Computer Architecture
• A parallel computer is a collection of processing elements that cooperate to solve large problems.
• Broad issues involved:
  – Resource Allocation:
    • Number of processing elements (PEs).
    • Computing power of each element.
    • Amount of physical memory used.
  – Data access, Communication and Synchronization:
    • How the elements cooperate and communicate.
    • How data is transmitted between processors.
    • Abstractions and primitives for cooperation.
  – Performance and Scalability:
    • Performance enhancement of parallelism: Speedup.
    • Scalability of performance to larger systems/problems.

The Goal of Parallel Computing
• Goal of applications in using parallel machines: Speedup

  \[ \text{Speedup}(p\ \text{processors}) = \frac{\text{Performance}(p\ \text{processors})}{\text{Performance}(1\ \text{processor})} \]

• For a fixed problem size (input data set), performance = 1/time

  \[ \text{Speedup}_{\text{fixed problem}}(p\ \text{processors}) = \frac{\text{Time}(1\ \text{processor})}{\text{Time}(p\ \text{processors})} \]

Elements of Modern Computers
[Figure: elements of modern computers. Labels: Computing Problems; Algorithms and Data Structures; Hardware Architecture; Operating System; Applications Software; High-level Languages; Performance Evaluation; Mapping; Programming; Binding (Compile, Load).]

Approaches to Parallel Programming
(a) Implicit Parallelism:
  Programmer → Source code written in sequential languages C, C++, FORTRAN, LISP .. → Parallelizing compiler → Parallel object code → Execution by runtime system
(b) Explicit Parallelism:
  Programmer → Source code written in concurrent dialects of C, C++, FORTRAN, LISP .. → Concurrency preserving compiler → Concurrent object code → Execution by runtime system

Evolution of Computer Architecture
[Figure: evolution of computer architecture. Labels: Scalar; Sequential; Lookahead; I/E Overlap; Functional Parallelism; Multiple Func. Units; Pipeline; Implicit Vector; Explicit Vector; Memory-to-Memory; Register-to-Register; SIMD; MIMD; Associative Processor; Processor Array; Multicomputer; Multiprocessor; Massively Parallel Processors (MPPs).]
I/E: Instruction Fetch and Execute
SIMD: Single Instruction stream over Multiple Data streams
MIMD: Multiple Instruction streams over Multiple Data streams

Programming Models
• Programming methodology used in coding applications.
• Specifies communication and synchronization.
• Examples:
  – Multiprogramming: No
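
The speedup definitions on the slide "The Goal of Parallel Computing" can be illustrated with a small sketch. This is only an illustration, not part of the original slides: the function names `performance` and `speedup` and the timing values are hypothetical, chosen to show that the performance ratio and the time ratio agree for a fixed problem size.

```python
def performance(time: float) -> float:
    """Performance for a fixed problem size, defined on the slide as 1/time."""
    if time <= 0:
        raise ValueError("execution time must be positive")
    return 1.0 / time


def speedup(time_1: float, time_p: float) -> float:
    """Fixed-problem speedup: Time(1 processor) / Time(p processors)."""
    if time_1 <= 0 or time_p <= 0:
        raise ValueError("execution times must be positive")
    return time_1 / time_p


# Hypothetical measurements (seconds) for the same input data set.
t_1 = 120.0  # assumed run time on 1 processor
t_8 = 20.0   # assumed run time on 8 processors

speedup_from_time = speedup(t_1, t_8)
speedup_from_perf = performance(t_8) / performance(t_1)

print(f"speedup from times:       {speedup_from_time:.2f}")
print(f"speedup from performance: {speedup_from_perf:.2f}")
```

Both ratios give the same value because performance is defined as the reciprocal of time, so Performance(p) / Performance(1) reduces to Time(1) / Time(p).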

