Parallel Computer Architecture
  A parallel computer is a collection of processing elements that cooperate to solve large problems.
  Broad issues involved:
    Resource Allocation:
      Number of processing elements (PEs)
      Computing power of each element
      Amount of physical memory used
    Data access, Communication and Synchronization:
      How the elements cooperate and communicate
      How data is transmitted between processors
      Abstractions and primitives for cooperation
    Performance and Scalability:
      Performance enhancement of parallelism: Speedup
      Scalability of performance to larger systems/problems

EECC756 - Shaaban, Exam Review, Spring 2000

The Goal of Parallel Computing
  Goal of applications in using parallel machines: Speedup
    Speedup (p processors) = Performance (p processors) / Performance (1 processor)
  For a fixed problem size (input data set), performance = 1/time:
    Speedup fixed problem (p processors) = Time (1 processor) / Time (p processors)

Elements of Modern Computers
  [Figure: Computing Problems; Algorithms and Data Structures; Mapping; Hardware Architecture; Programming; High-level Languages; Operating System; Binding (Compile, Load); Applications Software; Performance Evaluation]

Approaches to Parallel Programming
  (a) Implicit Parallelism:
    Programmer -> Source code written in sequential languages (C, C++, FORTRAN, LISP) -> Parallelizing compiler -> Parallel object code -> Execution by runtime system
  (b) Explicit Parallelism:
    Programmer -> Source code written in concurrent dialects of C, C++, FORTRAN, LISP -> Concurrency preserving compiler -> Concurrent object code -> Execution by runtime system

Evolution of Computer Architecture
  [Figure labels: Scalar; Sequential; Lookahead; I/E Overlap; Functional Parallelism; Multiple Func. Units; Pipeline; Implicit Vector; Explicit Vector; Memory-to-Memory; Register-to-Register; SIMD; MIMD; Associative Processor; Processor Array; Multicomputer; Multiprocessor; Massively Parallel Processors (MPPs)]
  I/E: Instruction Fetch and Execute
  SIMD: Single Instruction stream over Multiple Data streams
  MIMD: Multiple Instruction streams over Multiple Data streams

Programming Models
  Programming methodology used in coding applications
  Specifies communication and synchronization
  Examples:
    Multiprogramming: No communication or synchronization at program level
    Shared memory address space
    Message passing: Explicit point-to-point communication
    Data parallel: More regimented, global actions on data
      Implemented with shared address space or message passing

Flynn's 1972 Classification of Computer Architecture
  Single Instruction stream over a Single Data stream (SISD): Conventional sequential machines
  Single Instruction stream over Multiple Data streams (SIMD): Vector computers, array of synchronized processing elements
  Multiple Instruction streams and a Single Data stream (MISD): Systolic arrays for pipelined execution
  Multiple Instruction streams over Multiple Data streams (MIMD): Parallel computers:
    Shared memory multiprocessors
    Multicomputers: Unshared distributed memory, message passing used instead

Current Trends In Parallel Architectures
  The extension of "computer architecture" to support communication and cooperation:
    OLD: Instruction Set Architecture
    NEW: Communication Architecture
  Defines:
    Critical abstractions, boundaries, and primitives (interfaces)
    Organizational structures that implement interfaces (hardware or software)
  Compilers, libraries and OS are important bridges today

Models of Shared Memory Multiprocessors
  The Uniform Memory Access (UMA) Model:
    The physical memory is shared by all processors
    All processors have equal access to all memory addresses
  Distributed memory or Nonuniform Memory Access (NUMA) Model:
    Shared memory is physically distributed locally among processors
  The Cache-Only Memory Architecture (COMA) Model:
    A special case of a NUMA machine where all distributed main memory is converted to caches
    No memory hierarchy at each processor

Models of Shared Memory Multiprocessors
  [Figures: Uniform Memory Access (UMA) Model; Distributed memory or Nonuniform Memory Access (NUMA) Model; Cache-Only Memory Architecture (COMA)]
  Interconnect: Bus, Crossbar, Multistage network
  P: Processor; M: Memory; C: Cache; D: Cache directory

Message Passing Multicomputers
  Comprised of multiple autonomous computers (nodes)
  Each node consists of a processor, local memory, attached storage and I/O peripherals
  Programming model is more removed from basic hardware operations
  Local memory is only accessible by local processors
  A message-passing network provides point-to-point static connections among the nodes
  Inter-node communication is carried out by message passing through the static connection network
  Process communication achieved using a message-passing programming environment

Convergence: Generic Parallel Architecture
  A generic modern multiprocessor
  [Figure: Network; Communication assist (CA); Mem; P]
  Node: processor(s), memory system, plus communication assist:
    Network interface and communication controller
  Scalable network
  Convergence allows lots of innovation, now within framework
    Integration of assist with node, what operations, how efficiently...

Fundamental Design Issues
  At any layer, interface (contract) aspect and performance aspects:
    Naming: How are logically shared data and/or processes referenced?
    Operations: What operations are provided on these data?
    Ordering: How are accesses to data ordered and coordinated?
    Replication: How are data replicated to reduce communication?
    Communication Cost: Latency, bandwidth, overhead, occupancy
  Understand at programming model first, since that sets requirements
  Other issues:
    Node Granularity: How to split between processors and memory?

Synchronization
  Mutual exclusion (locks):
    Ensure certain operations on certain data can be performed by only one process at a time
    Room that only one person can enter at a time
    No ordering guarantees
  Event synchronization:
    Ordering of events to preserve dependencies, e.g. Producer -> Consumer of data
    Three main types: Point-to-point, Global, Group

Communication Cost Model
  Comm Time per message = Overhead + Assist Occupancy + Network ...
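
The fixed-problem speedup definition under "The Goal of Parallel Computing" can be illustrated with a short Python sketch. The function names and the timings (120 s on one processor, 20 s on eight) are hypothetical values chosen for illustration; they do not come from the course material.

```python
import math


def speedup(time_1: float, time_p: float) -> float:
    """Fixed-problem speedup: Time(1 processor) / Time(p processors)."""
    if time_1 <= 0 or time_p <= 0:
        raise ValueError("execution times must be positive")
    return time_1 / time_p


def speedup_from_performance(perf_1: float, perf_p: float) -> float:
    """General form: Performance(p processors) / Performance(1 processor)."""
    if perf_1 <= 0 or perf_p <= 0:
        raise ValueError("performance values must be positive")
    return perf_p / perf_1


# Hypothetical measurements for one fixed input data set.
t1 = 120.0  # seconds on 1 processor (assumed)
tp = 20.0   # seconds on 8 processors (assumed)

print(speedup(t1, tp))  # 6.0

# With performance defined as 1/time, both forms agree up to rounding.
agree = math.isclose(speedup(t1, tp),
                     speedup_from_performance(1.0 / t1, 1.0 / tp))
print(agree)  # True
```

In this made-up case the speedup of 6 on eight processors is below the processor count, which is the kind of gap the "Performance and Scalability" issues refer to.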

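
The two kinds of synchronization listed under "Synchronization" can be sketched with Python's standard threading module. This is only an illustration under stated assumptions: threads stand in for the "processes" of the slide, and the helper names locked_count and produce_then_consume are hypothetical.

```python
import threading


def locked_count(num_threads: int = 4, increments: int = 10_000) -> int:
    """Mutual exclusion: only one thread at a time updates the counter."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            with lock:  # the "room" only one thread can enter at a time
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


def produce_then_consume(value: int) -> int:
    """Point-to-point event synchronization: consumer waits for producer."""
    slot = {}
    ready = threading.Event()
    result = []

    def producer() -> None:
        slot["data"] = value
        ready.set()  # signal that the data is available

    def consumer() -> None:
        ready.wait()  # block until the producer has written the data
        result.append(slot["data"] * 2)

    # Start the consumer first to show that ordering is enforced by the event.
    c = threading.Thread(target=consumer)
    p = threading.Thread(target=producer)
    c.start()
    p.start()
    c.join()
    p.join()
    return result[0]


print(locked_count())            # 40000
print(produce_then_consume(21))  # 42
```

The lock guarantees exclusive access but says nothing about which thread goes first, matching "No ordering guarantees"; the event enforces the producer-before-consumer dependency.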

RIT EECC 756 - Parallel Computer Architecture
