RIT EECC 756 - Introduction to Parallel Processing

Slide titles:
Introduction to Parallel Processing
Parallel Computer Architecture
A Generic Parallel Computer Architecture
The Need And Feasibility of Parallel Computing
Why is Parallel Processing Needed? Challenging Applications in Applied Science/Engineering
Why is Parallel Processing Needed? Scientific Computing Demands
Scientific Supercomputing Trends
Uniprocessor Performance Evaluation
Single CPU Performance Trends
Microprocessor Frequency Trend
Transistor Count Growth Rate
Parallelism in Microprocessor VLSI Generations
Current Dual-Core Chip-Multiprocessor Architectures
Microprocessors Vs. Vector Processors Uniprocessor Performance: LINPACK
Parallel Performance: LINPACK
Why is Parallel Processing Needed? LINPACK Performance Trends
Computer System Peak FLOP Rating History/Near Future
The Goal of Parallel Processing
Elements of Parallel Computing
Approaches to Parallel Programming
Factors Affecting Parallel System Performance
Evolution of Computer Architecture
Parallel Architectures History
Parallel Programming Models
Flynn’s 1972 Classification of Computer Architecture
Flynn’s Classification of Computer Architecture
Current Trends In Parallel Architectures
Modern Parallel Architecture Layered Framework
Shared Address Space (SAS) Parallel Architectures
Shared Address Space (SAS) Parallel Programming Model
Models of Shared-Memory Multiprocessors
Uniform Memory Access (UMA) Example: Intel Pentium Pro Quad
Non-Uniform Memory Access (NUMA) Example: AMD 8-way Opteron Server Node
Uniform Memory Access Example: SUN Enterprise
Distributed Shared-Memory Multiprocessor System Example: Cray T3E
Message-Passing Multicomputers
Message-Passing Abstraction
Message-Passing Example: Intel Paragon
Message-Passing Example: IBM SP-2
Message-Passing MPP Example: IBM Blue Gene/L
Message-Passing Programming Tools
Data Parallel Systems SIMD in Flynn taxonomy
Dataflow Architectures
Systolic Architectures
Systolic Array Example: 3x3 Systolic Array Matrix Multiplication

EECC756 - Shaaban, lec # 1, Spring 2008, 3-11-2008

Introduction to Parallel Processing

• Parallel Computer Architecture: Definition & Broad issues involved
  – A Generic Parallel Computer Architecture
• The Need And Feasibility of Parallel Computing
  – Scientific Supercomputing Trends
  – CPU Performance and Technology Trends, Parallelism in Microprocessor Generations
  – Computer System Peak FLOP Rating History/Near Future
• The Goal of Parallel Processing
• Elements of Parallel Computing
• Factors Affecting Parallel System Performance
• Parallel Architectures History
  – Parallel Programming Models
  – Flynn’s 1972 Classification of Computer Architecture
• Current Trends In Parallel Architectures
  – Modern Parallel Architecture Layered Framework
• Shared Address Space Parallel Architectures
• Message-Passing Multicomputers: Message-Passing Programming Tools
• Data Parallel Systems
• Dataflow Architectures
• Systolic Architectures: Matrix Multiplication Systolic Array Example

PCA Chapter 1.1, 1.2

Parallel Computer Architecture

A parallel computer (or multiple processor system) is a collection of communicating processing elements (processors) that cooperate to solve large computational problems fast by dividing such problems into parallel tasks, exploiting Thread-Level Parallelism (TLP).

• Broad issues involved:
  – The concurrency and communication characteristics of parallel algorithms for a given computational problem (represented by dependency graphs)
  – Computing Resources and Computation Allocation:
    • The number of processing elements (PEs), computing power of each element and amount/organization of physical memory used.
    • What portions of the computation and data are allocated or mapped to each PE.
  – Data access, Communication and Synchronization:
    • How the processing elements cooperate and communicate.
    • How data is shared/transmitted between processors.
    • Abstractions and primitives for cooperation/communication.
    • The characteristics and performance of parallel system network (System interconnects).
  – Parallel Processing Performance and Scalability Goals:
    • Maximize performance enhancement of parallelism: Maximize Speedup.
      – By minimizing parallelization overheads and balancing workload on processors
    • Scalability of performance to larger systems/problems.

Processor = Programmable computing element that runs stored programs written using a pre-defined instruction set
Processing Elements = PEs = Processors
i.e. Parallel Processing

A Generic Parallel Computer Architecture

Processing Nodes: Each processing node contains one or more processing elements (PEs) or processor(s), memory system, plus communication assist (network interface and communication controller).

Parallel machine network (System Interconnects): The function of a parallel machine network is to efficiently (reduce communication cost) transfer information (data, results, ...) from source node to destination node as needed to allow cooperation among parallel processing nodes to solve large computational problems divided into a number of parallel computational tasks.

[Figure: a processing node (labels: P, $, Mem, Communication assist (CA)) attached to the Network; Processing Nodes connected by the Parallel Machine Network (custom or industry standard)]

One or more processing elements or processors per node: custom or commercial microprocessors; single or multiple processors per chip; homogeneous or heterogeneous.
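
The performance goal listed above (maximize speedup by minimizing parallelization overheads and balancing workload on processors) can be illustrated with a small sketch. The model here is an assumption made for illustration and is not taken from the slides: all PEs start together, and the parallel run finishes when the slowest PE has completed its computation plus its overhead. The function names and the sample numbers are hypothetical.

```python
def parallel_time(work_per_pe, overhead_per_pe):
    # Assumed model: all PEs start together and the run ends when the
    # slowest PE finishes its computation plus its overhead.
    return max(w + o for w, o in zip(work_per_pe, overhead_per_pe))


def speedup(sequential_time, work_per_pe, overhead_per_pe):
    # Speedup = time on one processor / time on the parallel system.
    return sequential_time / parallel_time(work_per_pe, overhead_per_pe)


def efficiency(sequential_time, work_per_pe, overhead_per_pe):
    # Fraction of the ideal speedup (one per PE) that is achieved.
    pes = len(work_per_pe)
    return speedup(sequential_time, work_per_pe, overhead_per_pe) / pes


if __name__ == "__main__":
    t_seq = 100.0  # hypothetical sequential time, arbitrary units
    cases = {
        "balanced, no overhead": ([25, 25, 25, 25], [0, 0, 0, 0]),
        "imbalanced, no overhead": ([40, 20, 20, 20], [0, 0, 0, 0]),
        "balanced, overhead 5 per PE": ([25, 25, 25, 25], [5, 5, 5, 5]),
    }
    for name, (work, overhead) in cases.items():
        s = speedup(t_seq, work, overhead)
        e = efficiency(t_seq, work, overhead)
        print(f"{name}: speedup={s:.2f}, efficiency={e:.2f}")
```

Under this model, four perfectly balanced PEs with no overhead reach a speedup of 4, the imbalanced allocation drops to 2.5 because one PE holds 40 of the 100 work units, and adding 5 units of overhead per PE to the balanced case gives about 3.33. Both effects reduce speedup in the same way: they lengthen the time of the slowest PE.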

