
Generic Multiprocessor Architecture

[Figure: node containing processor (P), memory (Mem), and communication assist (CA), attached to the network]

- Node: processor(s), memory system, plus communication assist (network interface and communication controller).
- Scalable network.
- The function of a parallel machine network is to efficiently transfer information from source node to destination node in support of the network transactions that realize the programming model.
- Network performance should scale up as its size is increased.

EECC756 - Shaaban, lec 10, Spring 2002, 4-23-2002

Cost of Communication

Given the amount of communication (inherent or artifactual), the goal is to reduce its cost.

Cost of communication as seen by a process:

$C = f \times \left( o + l + \frac{n}{B} + t_c - \text{overlap} \right)$

- f: frequency of messages
- o: overhead per message (at both ends)
- l: network delay per message
- n: data sent per message
- B: bandwidth along path (determined by network, NI, assist)
- t_c: cost induced by contention per message
- overlap: amount of latency hidden by overlap with computation or communication

The portion in parentheses is the cost of a message as seen by the processor. That portion, ignoring overlap, is the latency of a message. Goal: reduce the terms in latency and increase overlap.

Network Representation and Characteristics

- A parallel machine interconnection network is a graph V = {switches and nodes} connected by communication channels or links $C \subseteq V \times V$.
- Each channel has width w bits and signaling rate $f = 1/\tau$ ($\tau$ is the clock cycle time).
  - Channel bandwidth: b = wf bits/sec.
  - Phit (physical unit): data transferred per cycle (usually the channel width w).
  - Flit: basic unit of flow control (minimum data unit transferred across a link).
- The number of input (output) channels is the switch or node degree.
- The sequence of switches and links followed by a message in the network is a route.
- Routing distance: number of links or hops on a route.
- A network is generally characterized by:
  - Topology
  - Flow control mechanism
  - Routing algorithm
  - Switching strategy

Network Characteristics

- Topology: physical interconnection structure of the network graph.
  - Node degree: number of channels per node.
  - Network diameter: longest minimum routing distance between any two nodes, in hops.
  - Average distance between all pairs of nodes.
  - Bisection width: minimum number of links whose removal disconnects the graph and cuts it in half.
  - Symmetry: the property that the network looks the same from every node.
  - Homogeneity: whether all the nodes and links are identical or not.
- Type of interconnection:
  - Static or direct interconnects: nodes connected directly using static point-to-point links.
  - Dynamic or indirect interconnects: switches are usually used to realize dynamic links between nodes.
    - Each node is connected to a specific subset of switches (e.g. Multistage Interconnection Networks, MINs).
    - Blocking or non-blocking, permutations realized.
  - Shared, broadcast, or bus-based connections (e.g. Ethernet-based).

Network Characteristics

- Routing algorithm and functions:
  - The set of paths that messages may follow.
  - Request message combining capabilities.
- Switching strategy: circuit switching vs. packet switching.
- Flow control mechanism:
  - When a message, or portions of it, moves along its route:
    - Store-and-forward routing
    - Cut-through or wormhole routing
  - What happens when traffic is encountered at a node:
    - Link/node contention handling
    - Deadlock prevention
- Broadcast and multicast capabilities.
- Communication latency.
- Link bandwidth.

Network Characteristics

- Hardware/software implementation complexity and cost.
- Network throughput: total number of messages handled by the network per unit time.
- Aggregate network bandwidth: similar to network throughput, but given in total bits/sec.
- Network hot spots: form in a network when a small number of network nodes or links handle a very large percentage of total network traffic and become saturated.
- Network scalability: the feasibility of increasing network size, determined by:
  - Performance scalability: relationship between network size (in number of nodes) and the resulting network performance.
  - Cost scalability: relationship between network size (in number of nodes and links) and network cost/complexity.

Network Requirements For Parallel Computing

- Minimum network latency, even when approaching network capacity.
- High sustained bandwidth that matches or exceeds the communication requirements for a given computational rate.
- High network throughput: the network should support as many concurrent transfers as possible.
- Low protocol overhead.
- Minimum network cost.
- Maximum network scalability: network performance should scale up with network size.

[Figure: nodes, each with processor (P), memory (M), and communication assist (CA, the network interface), attached to a scalable interconnection network]

Communication Network Performance: Network Latency

Unloaded network latency = routing delay + channel occupancy

Time to transfer n bytes from source to destination:

$\text{Time}(n)_{s-d} = \text{overhead} + \text{routing delay} + \text{channel occupancy} + \text{contention delay}$

$\text{channel occupancy} = \frac{n + n_e}{b}$

- b: channel bandwidth, bytes/sec
- n: payload size
- n_e: packet envelope (header, trailer)

Flow Control Mechanisms: Store-and-Forward vs. Cut-Through Routing

[Figure: flits 3 2 1 0 of a packet traveling from Source to Dest over time, store-and-forward routing vs. cut-through routing]

Unloaded network latency for an n-byte packet:

$h \left( \frac{n}{b} + \Delta \right)$ vs. $\frac{n}{b} + h\Delta$

- h: distance in hops
- $\Delta$: switch delay

Communication Network Performance: Network Latency

For an unloaded network (no contention delay), the network latency to transfer an n-byte packet (including packet envelope) across the network is:

Unloaded network latency = routing delay + channel occupancy

- For store-and-forward routing: $T_{sf}(n, h) = h \left( \frac{n}{b} + \Delta \right)$
- For cut-through routing: $T_{ct}(n, h) = \frac{n}{b} + h\Delta$

where h is the distance in hops and $\Delta$ is the switch delay.

Reducing Network Latency

- Use cut-through routing rather than store-and-forward routing, whose unloaded network latency is $T_{sf}(n, h) = h \left( \frac{n}{b} + \Delta \right)$.
- Reduce the number of hops h in the route:
  - Map communication patterns to network topology, e.g. nearest-neighbor on mesh and ring; all-to-all.
  - Applicable to networks with static or direct point-to-point interconnects: ideally, the network topology matches the problem's communication patterns.
- Increase link bandwidth b.
- Reduce switch routing delay $\Delta$.

Available Bandwidth

Factors affecting local bandwidth available to a single node:

- Accounting for packet density: $b \times \frac{n}{n + n_e}$
- Also accounting
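
The unloaded-latency formulas for store-and-forward and cut-through routing, and the packet-density bandwidth factor, can be tried out numerically. The following is a minimal Python sketch, not part of the lecture material: the function names and all numeric values are hypothetical, chosen only for illustration. It assumes n is in bytes (packet size including envelope for the latency functions, payload only for the bandwidth function), b is in bytes/sec, and the switch delay is in seconds.

```python
# Illustrative sketch of the slide formulas; all names and numbers are assumptions.

def channel_occupancy(n, n_e, b):
    """Channel occupancy (n + n_e) / b for payload n and envelope n_e, in seconds."""
    return (n + n_e) / b


def t_store_forward(n, h, b, delta):
    """Unloaded store-and-forward latency: h * (n/b + delta).

    Every hop receives the whole n-byte packet before forwarding it.
    """
    return h * (n / b + delta)


def t_cut_through(n, h, b, delta):
    """Unloaded cut-through latency: n/b + h * delta.

    The packet is pipelined across the h hops, so n/b is paid only once.
    """
    return n / b + h * delta


def effective_bandwidth(b, n, n_e):
    """Bandwidth left for payload after packet density: b * n / (n + n_e)."""
    return b * n / (n + n_e)


# Hypothetical example: 1000-byte packet, 1 GB/s links, 100 ns switch delay.
packet_bytes = 1000
link_bw = 1e9
switch_delay = 1e-7

for hops in (1, 4, 16):
    sf = t_store_forward(packet_bytes, hops, link_bw, switch_delay)
    ct = t_cut_through(packet_bytes, hops, link_bw, switch_delay)
    print(f"h={hops:2d}  store-and-forward={sf:.2e} s  cut-through={ct:.2e} s")

# 960 payload bytes carried in a 40-byte envelope (hypothetical split).
print(f"effective bandwidth: {effective_bandwidth(link_bw, 960, 40):.3e} bytes/sec")
```

With a single hop the two routing models give the same latency. The gap grows with h because store-and-forward pays the n/b serialization time at every hop, which is why reducing hop count matters most under store-and-forward routing.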



RIT EECC 756 - Generic Multiprocessor Architecture
