U of U CS 6810 - Lecture 25 - Interconnection Networks

Lecture 25: Interconnection Networks
• Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)
• Review session, Wednesday Dec 1st, 10-12, LCR (MEB 3147)
• Final exam reminders
  • Come early, 10:35-12:15
  • Same rules as first midterm, open books/notes/…
  • Can use calculators and laptops (no search or internet)
  • 20% from first midterm material; remaining 80% from caches, multiprocs, TM
  • 20% new problems
  • Attempt every question

Topologies
• Internet topologies are not very regular: they grew incrementally
• Supercomputers have regular interconnect topologies and trade off cost for high bandwidth
• Nodes can be connected with
  • centralized switch: all nodes have input and output wires going to a centralized chip that internally handles all routing
  • decentralized switch: each node is connected to a switch that routes data to one of a few neighbors

Centralized Crossbar Switch
[Figure: nodes P0 to P7 connected to a central crossbar switch]

Crossbar Properties
• Assuming each node has one input and one output, a crossbar can provide maximum bandwidth: N messages can be sent as long as there are N unique sources and N unique destinations
• Maximum overhead: $WN^2$ internal switches, where W is data width and N is number of nodes
• To reduce overhead, use smaller switches as building blocks; this trades off overhead for lower effective bandwidth

Switch with Omega Network
[Figure: nodes P0 to P7 connected through an Omega network to outputs labeled 000 to 111]

Omega Network Properties
• The switch complexity is now $O(N \log N)$
• Contention increases: P0 → P5 and P1 → P7 cannot happen concurrently (this was possible in a crossbar)
• To deal with contention, can increase the number of levels (redundant paths): by mirroring the network, we can route from P0 to P5 via N intermediate nodes, while increasing complexity by a factor of 2

Tree Network
• Complexity is $O(N)$
• Can yield low latencies when communicating with neighbors
• Can build a fat tree by having multiple incoming and outgoing links
[Figure: tree network with leaf nodes P0 to P7]

Bisection Bandwidth
• Split N nodes into two groups of N/2 nodes such that the bandwidth between these two groups is minimum: that is the bisection bandwidth
• Why is it relevant: if traffic is completely random, the probability of a message going across the two halves is ½; if all nodes send a message, the bisection bandwidth will have to be N/2
• The concept of bisection bandwidth confirms that the tree network is not suited for random traffic patterns, but for localized traffic patterns

Distributed Switches: Ring
• Each node is connected to a 3x3 switch that routes messages between the node and its two neighbors
• Effectively a repeated bus: multiple messages in transit
• Disadvantage: bisection bandwidth of 2 and N/2 hops on average

Distributed Switch Options
• Performance can be increased by throwing more hardware at the problem: fully-connected switches: every switch is connected to every other switch: $N^2$ wiring complexity, $N^2/4$ bisection bandwidth
• Most commercial designs adopt a point between the two extremes (ring and fully-connected):
  • Grid: each node connects with its N, E, W, S neighbors
  • Torus: connections wrap around
  • Hypercube: links between nodes whose binary names differ in a single bit

Topology Examples
[Figure: Grid, Hypercube, Torus]

Criteria                 Bus    Ring   2D torus   6-cube   Fully connected
Performance
  Bisection bandwidth    1      2      16         32       1024
Cost
  Ports/switch           -      3      5          7        64
  Total links            1      128    192        256      2080

k-ary d-cube
• Consider a k-ary d-cube: a d-dimension array with k elements in each dimension; there are links between elements that differ in one dimension by 1 (mod k)
• Number of nodes $N = k^d$
  Number of switches: $N$
  Switch degree: $2d + 1$
  Number of links: $Nd$
  Pins per node: $2wd$
  Avg. routing distance: $d(k-1)/2$
  Diameter: $d(k-1)$
  Bisection bandwidth: $2wk^{d-1}$
  Switch complexity: $(2d + 1)^2$
  (with no wraparound)
Should we minimize or maximize dimension?

Routing
• Deterministic routing: given the source and destination, there exists a unique route
• Adaptive routing: a switch may alter the route in order to deal with unexpected events (faults, congestion); more complexity in the router vs. potentially better performance
• Example of deterministic routing: dimension order routing: send packet along first dimension until destination co-ord (in that dimension) is reached, then next dimension, etc.

Deadlock
• Deadlock happens when there is a cycle of resource dependencies: a process holds on to a resource (A) and attempts to acquire another resource (B); A is not relinquished until B is acquired

Deadlock Example
[Figure: four 4-way switches with input and output ports, carrying packets of messages 1, 2, 3, and 4]
Each message is attempting to make a left turn: it must acquire an output port, while still holding on to a series of input and output ports

Deadlock-Free Proofs
• Number edges and show that all routes will traverse edges in increasing (or decreasing) order; therefore, it will be impossible to have cyclic dependencies
• Example: k-ary 2-d array with dimension routing: first route along x-dimension, then along y
[Figure: 2-d array with numbered edges]

Breaking Deadlock I
• The earlier proof does not apply to tori because of wraparound edges
• Partition resources across multiple virtual channels
• If a wraparound edge must be used in a torus, travel on virtual channel 1, else travel on virtual channel 0

Breaking Deadlock II
• Consider the eight possible turns in a 2-d array (note that turns lead to cycles)
• By preventing just two turns, cycles can be eliminated
• Dimension-order routing disallows four turns
• Helps avoid deadlock even in adaptive routing
[Figure: turn diagrams labeled West-First, North-Last, Negative-First, and one labeled "Can allow deadlocks"]
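
The dimension-order routing rule from the Routing slide can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the function name `dimension_order_route` is hypothetical, nodes are assumed to be identified by integer coordinate tuples, and the network is assumed to be an array without wraparound links (so a torus would need extra handling).

```python
def dimension_order_route(src, dst):
    """Return the nodes visited when routing from src to dst.

    Assumptions (not from the lecture): nodes are integer coordinate
    tuples, and the array has no wraparound links.
    The packet fully corrects dimension 0, then dimension 1, and so on,
    so the route for a given (src, dst) pair is unique (deterministic).
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")

    current = list(src)
    path = [tuple(current)]
    for dim in range(len(src)):
        # Move one hop at a time toward the destination coordinate.
        step = 1 if dst[dim] > current[dim] else -1
        while current[dim] != dst[dim]:
            current[dim] += step
            path.append(tuple(current))
    return path


if __name__ == "__main__":
    # Route in a 2-d array: all x hops happen before any y hop.
    for node in dimension_order_route((0, 0), (2, 1)):
        print(node)
```

Because every route finishes its x hops before taking any y hop, a packet never turns from the y-dimension back into the x-dimension, which is the property the deadlock-free argument for the 2-d array relies on.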

