RIT EECC 756 - Introduction to Parallel Processing

EECC756 - Shaaban, lec # 1, Spring 2008, 3-11-2008

#1 Introduction to Parallel Processing

• Parallel Computer Architecture: Definition & Broad issues involved
  – A Generic Parallel Computer Architecture
• The Need And Feasibility of Parallel Computing
  – Scientific Supercomputing Trends
  – CPU Performance and Technology Trends, Parallelism in Microprocessor Generations
  – Computer System Peak FLOP Rating History/Near Future
• The Goal of Parallel Processing
• Elements of Parallel Computing
• Factors Affecting Parallel System Performance
• Parallel Architectures History
  – Parallel Programming Models
  – Flynn’s 1972 Classification of Computer Architecture
• Current Trends In Parallel Architectures
  – Modern Parallel Architecture Layered Framework
• Shared Address Space Parallel Architectures
• Message-Passing Multicomputers: Message-Passing Programming Tools
• Data Parallel Systems
• Dataflow Architectures
• Systolic Architectures: Matrix Multiplication Systolic Array Example

PCA Chapter 1.1, 1.2

#2 Parallel Computer Architecture

A parallel computer (or multiple processor system) is a collection of communicating processing elements (processors) that cooperate to solve large computational problems fast by dividing such problems into parallel tasks, exploiting Thread-Level Parallelism (TLP), i.e. Parallel Processing.

• Broad issues involved:
  – The concurrency and communication characteristics of parallel algorithms for a given computational problem (represented by dependency graphs)
  – Computing Resources and Computation Allocation:
    • The number of processing elements (PEs), computing power of each element and amount/organization of physical memory used.
    • What portions of the computation and data are allocated or mapped to each PE.
  – Data access, Communication and Synchronization
    • How the processing elements cooperate and communicate.
    • How data is shared/transmitted between processors.
    • Abstractions and primitives for cooperation/communication.
    • The characteristics and performance of parallel system network (System interconnects).
  – Parallel Processing Performance and Scalability Goals:
    • Maximize performance enhancement of parallelism: Maximize Speedup.
      – By minimizing parallelization overheads and balancing workload on processors
    • Scalability of performance to larger systems/problems.

Processor = Programmable computing element that runs stored programs written using a pre-defined instruction set
Processing Elements = PEs = Processors

#3 A Generic Parallel Computer Architecture

Processing Nodes: Each processing node contains one or more processing elements (PEs) or processor(s), a memory system, plus a communication assist (network interface and communication controller).

Parallel machine network (System Interconnects): The function of a parallel machine network is to efficiently (reduce communication cost) transfer information (data, results, ...) from source node to destination node as needed to allow cooperation among parallel processing nodes to solve large computational problems divided into a number of parallel computational tasks.

[Figure: processing nodes, each containing a processor (P), cache ($), memory (Mem) and a communication assist (CA), connected by the parallel machine network.]

Figure labels:
• Parallel Machine Network: custom or industry standard
• One or more processing elements or processors per node: custom or commercial microprocessors; single or multiple processors per chip; homogeneous or heterogeneous
• Network Interface (AKA Communication Assist): custom or industry standard
• Operating System
• Parallel Programming Environments

Parallel Computer = Multiple Processor System

#4 The Need And Feasibility of Parallel Computing

• Application demands: More computing cycles/memory needed
  – Scientific/Engineering computing: CFD, Biology, Chemistry, Physics, ...
  – General-purpose computing: Video, Graphics, CAD, Databases, Transaction Processing, Gaming, ...
  – Mainstream multithreaded programs are similar to parallel programs
• Technology Trends:
  – Number of transistors on chip growing rapidly. Clock rates expected to continue to go up but only slowly. Actual performance returns diminishing due to deeper pipelines.
  – Increased transistor density allows integrating multiple processor cores per chip, creating Chip-Multiprocessors (CMPs) even for mainstream computing applications (desktop/laptop, ...).
• Architecture Trends:
  – Instruction-level parallelism (ILP) is valuable (superscalar, VLIW) but limited.
  – Increased clock rates require deeper pipelines with longer latencies and higher CPIs.
  – Coarser-level parallelism (at the task or thread level, TLP), utilized in multiprocessor systems, is the most viable approach to further improve performance.
    • Main motivation for development of chip-multiprocessors (CMPs)
• Economics:
  – The increased utilization of commodity off-the-shelf (COTS) components in high-performance parallel computing systems, instead of the costly custom components used in traditional supercomputers, leads to much lower parallel system cost.
    • Today’s microprocessors offer high performance and have multiprocessor support, eliminating the need for designing expensive custom PEs.
    • Commercial System Area Networks (SANs) offer an alternative to more costly custom networks.

(Margin label on slide: Driving Force)

#5 Why is Parallel Processing
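
The performance goal listed on slide #2 (maximize speedup by minimizing parallelization overheads and balancing workload on processors) can be illustrated with a small sketch. This is not taken from the slides: it assumes a deliberately simple model in which speedup is serial time divided by parallel time, and parallel time is set by the slowest PE (its share of the work plus its overhead). The function names and the numbers are hypothetical.

```python
# Illustrative model only (an assumption, not from the lecture):
# parallel time = time of the slowest processing element (PE),
# where each PE spends its share of work plus some overhead.

def parallel_time(work_per_pe, overhead_per_pe):
    """Return the completion time, which is set by the slowest PE."""
    return max(w + o for w, o in zip(work_per_pe, overhead_per_pe))


def speedup(serial_time, work_per_pe, overhead_per_pe):
    """Speedup = serial execution time / parallel execution time."""
    return serial_time / parallel_time(work_per_pe, overhead_per_pe)


serial_time = 100.0  # hypothetical time units on one processor

# Four PEs, perfectly balanced work, no overhead: the ideal case.
ideal = speedup(serial_time, [25.0] * 4, [0.0] * 4)

# Same balanced work, but each PE pays 5 units of overhead.
balanced = speedup(serial_time, [25.0] * 4, [5.0] * 4)

# Same overhead, but one PE is given twice the work of the others.
imbalanced = speedup(serial_time, [40.0, 20.0, 20.0, 20.0], [5.0] * 4)

print(f"ideal:      {ideal:.2f}")
print(f"balanced:   {balanced:.2f}")
print(f"imbalanced: {imbalanced:.2f}")
```

Under this model both overhead and load imbalance lengthen the slowest PE's time, so either one pulls the speedup below the number of PEs.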

