CS258 S991NOW Handout Page 1Interconnection Network TopologyDesign Trade-offsCS 258, Spring 99David E. CullerComputer Science DivisionU.C. Berkeley3/19/99 CS258 S99 2Real Machines• Wide links, smaller routing delay• Tremendous variation3/19/99 CS258 S99 3Interconnection Topologies• Class networks scaling with N• Logical Properties:– distance, degree• Physcial properties– length, width• Fully connected network– diameter = 1– degree = N– cost?» bus => O(N), but BW is O(1) - actually worse» crossbar => O(N2) for BW O(N)• VLSI technology determines switch degree3/19/99 CS258 S99 4Linear Arrays and Rings• Linear Array– Diameter?– Average Distance?– Bisection bandwidth?– Route A -> B given by relative address R = B-A• Torus?• Examples: FDDI, SCI, FiberChannel ArbitratedLoop, KSR1Linear ArrayTorusTorus arranged to use short wires3/19/99 CS258 S99 5Multidimensional Meshes and Tori• d-dimensional array– n = kd-1 X ...X kO nodes– described by d-vector of coordinates (id-1, ..., iO)• d-dimensional k-ary mesh: N = kd– k = d√N– described by d-vector of radix k coordinate• d-dimensional k-ary torus (or k-ary d-cube)?2D Grid3D Cube3/19/99 CS258 S99 6Properties• Routing– relative distance: R = (b d-1 - a d-1, ... , b0 - a0 )– traverse ri = b i - a i hops in each dimension– dimension-order routing• Average Distance Wire Length?– d x 2k/3 for mesh– dk/2 for cube• Degree?• Bisection bandwidth? Partitioning?– k d-1 bidirectional links• Physical layout?– 2D in O(N) space Short wires– higher dimension?CS258 S992NOW Handout Page 23/19/99 CS258 S99 7Real World 2D mesh• 1824 node Paragon: 16 x 114 array3/19/99 CS258 S99 8Embeddings in two dimensions• Embed multiple logical dimension in onephysical dimension using long wires6 x 3 x 23/19/99 CS258 S99 9Trees• Diameter and ave distance logarithmic– k-ary tree, height d = logk N– address specified d-vector of radix k coordinates describingpath down from root• Fixed degree• Route up to common ancestor and down– R = B xor A– let i be position of most significant 1 in R, route up i+1 levels– down in direction given by low i+1 bits of B• H-tree space is O(N) with O(√N) long wires• Bisection BW?3/19/99 CS258 S99 10Fat-Trees• Fatter links (really more of them) as you go up,so bisection BW scales with NFat Tree3/19/99 CS258 S99 11Butterflies• Tree with lots of roots!• N log N (actually N/2 x logN)• Exactly one route from any source to any dest• R = A xor B, at level i use ‘straight’ edge if ri=0,otherwise cross edge• Bisection N/2 vs n (d-1)/d0123416 node butterfly0 1 0 10 1 0 10 1building block3/19/99 CS258 S99 12k-ary d-cubes vs d-ary k-flies• degree d• N switches vs N log N switches• diminishing BW per node vs constant• requires locality vs little benefit to locality• Can you route all permutations?CS258 S993NOW Handout Page 33/19/99 CS258 S99 13Benes network and Fat Tree• Back-to-back butterfly can route all permutations– off line• What if you just pick a random mid point?16-node Benes Network (Unidirectional)16-node 2-ary Fat-Tree (Bidirectional)3/19/99 CS258 S99 14Hypercubes• Also called binary n-cubes. # of nodes = N = 2n.• O(logN) Hops• Good bisection BW• Complexity– Out degree is n = logNcorrect dimensions in order– with random comm. 2 ports per processor0-D 1-D 2-D 3-D4-D5-D !3/19/99 CS258 S99 15Relationship BttrFlies to Hypercubes• Wiring is isomorphic• Except that Butterfly always takes log n steps3/19/99 CS258 S99 16Toplology Summary• All have some “bad permutations”– many popular permutations are very bad for meshs(transpose)– ramdomness in wiring or routing makes it hard to find a badone!Topology Degree Diameter Ave Dist Bisection D (D ave) @ P=10241D Array 2 N-1 N / 3 1 huge1D Ring 2 N/2 N/4 22D Mesh 4 2 (N1/2 - 1) 2/3 N1/2N1/263 (21)2D Torus 4 N1/21/2 N1/22N1/232 (16)k-ary n-cube 2n nk/2 nk/4 nk/4 15 (7.5) @n=3Hypercube n =log N n n/2 N/2 10 (5)3/19/99 CS258 S99 17How Many Dimensions?• n = 2 or n = 3– Short wires, easy to build– Many hops, low bisection bandwidth– Requires traffic locality• n >= 4– Harder to build, more wires, longer average length– Fewer hops, better bisection bandwidth– Can handle non-local traffic• k-ary d-cubes provide a consistent framework forcomparison– N = kd– scale dimension (d) or nodes per dimension (k)– assume cut-through3/19/99 CS258 S99 18Traditional Scaling: Latency(P)• Assumes equal channel width– independent of node count or dimension– dominated by average distance0204060801001201400 5000 10000Machine Size (N)Ave Latency T(n=40)d=2d=3d=4k=2n/w0501001502002500 2000 4000 6000 8000 10000Machine Size (N)Ave Latency T(n=140)CS258 S994NOW Handout Page 43/19/99 CS258 S99 19Average Distance• but, equal channel width is not equal cost!• Higher dimension => more channels01020304050607080901000 5 10 15 20 25DimensionAve Distance2561024163841048576ave dist = d (k-1)/23/19/99 CS258 S99 20In the 3D world• For n nodes, bisection area is O(n2/3 )• For large n, bisection bandwidth is limited toO(n2/3 )– Bill Dally, IEEE TPDS, [Dal90a]– For fixed bisection bandwidth, low-dimensional k-ary n-cubesare better (otherwise higher is better)– i.e., a few short fat wires are better than many long thin wires– What about many long fat wires?3/19/99 CS258 S99 21Equal cost in k-ary n-cubes• Equal number of nodes?• Equal number of pins/wires?• Equal bisection bandwidth?• Equal area? Equal wire length?What do we know?• switch degree: d diameter = d(k-1)• total links = Nd• pins per node = 2wd• bisection = kd-1 = N/k links in each directions• 2Nw/k wires cross the middle3/19/99 CS258 S99 22Latency(d) for P with Equal Width• total links(N) = Nd0501001502002500 5 10 15 20 25DimensionAverage Latency (n = 40, ∆ = 2)25610241638410485763/19/99 CS258 S99 23Latency with Equal Pin Count• Baseline d=2, has w = 32 (128 wires per node)• fix 2dw pins => w(d) = 64/d• distance up with d, but channel time down0501001502002503000 5 10 15 20 25Dimension (d)Ave Latency T(n=40B)256 nodes1024 nodes16 k nodes1M nodes0501001502002503000 5 10 15 20 25Dimension (d)Ave Latency T(n= 140 B)256 nodes1024 nodes16 k nodes1M nodes3/19/99 CS258 S99 24Latency with Equal Bisection Width• N-node hypercubehas N bisectionlinks• 2d torus has 2N 1/2• Fixed bisection =>w(d) = N 1/d / 2 = k/2• 1 M nodes, d=2
View Full Document