UNLV ECG 702 - Fault-Tolerant Routing Schemes - D1102325

Home> Schools> University of Nevada, Las Vegas> Electrical And Computer Engineering (ECG) > ECG 702> Fault-Tolerant Routing Schemes

DOC PREVIEW

UNLV ECG 702 - Fault-Tolerant Routing Schemes

School name University of Nevada, Las Vegas

Course Ecg 702- Interconnection Networks for Parallel Processing Applications

Pages 6

This preview shows page 1-2 out of 6 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 6 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Fault-Tolerant Routing Schemes in RDT(2,2,1)/α-Based Interconnection Network for Networks-on-Chip Designs Mei Yang†, Tao Li‡, Yingtao Jiang†, and Yulu Yang‡ † Dept. of Electrical & Computer Engineering University of Nevada, Las Vegas Las Vegas, NV 89154, USA {meiyang, yingtao}@egr.unlv.edu ‡ Dept. of Computer Science and Technology Nankai University Tianjing, 300071, China [email protected], [email protected] Abstract It has been well recognized that the fault-tolerance capability is vital for a NoC system, since one faulty link/processor may isolate a large fraction of processors. Continuing from a previous paper [13] where a RDT(2,2,1)/α-based interconnection network for NoC designs was proposed, in this paper, we investigate fault-tolerant routing schemes on NoC systems featuring a RDT-based interconnection network. In specific, we propose two fault-tolerant routing schemes in the presence of either single link/node failure or multiple link/node failures. The proposed routing schemes are based on deterministic routing. Alternative routes are discovered by properly selecting the intermediate nodes between the source and the destination nodes on the rank tori. As of the single link/node failure case, we show that the number of routers on the detoured route generated by the proposed routing scheme is at most 2 more than the number of routers on the original route. 1. Introduction Advances in VLSI technology will soon allow a single chip to contain more than one billion transistors [8], indicating that a large number of processing units (such as CPU, DSP, multimedia processor) shall be integrated into one packaged chip. In these new systems, communication resources are competed by the vast volume computational resources. With a communication-centric design style, Networks-on-Chip (NoC) [1] was proposed to mitigate the complex communication problems. A NoC system is composed of a large number of processing units communicating to other units through routers across the interconnection network. Packets travel through the network by passing one or more hops between the source and the destination tiles. As semiconductor technology scales down to nanometer domain, processing units (PUs), routers (nodes), and interconnect links (links) are all subject to new types of malfunctions and failures that are harder to predict and avoid [5]. Particularly, failures of nodes/links may isolate a large number of fault-free PUs. A major challenge in NoC designs thus is to provide fault tolerance under link/node failure(s). In [13], we proposed an interconnection network architecture for NoC based on a special Recursive Diagonal Torus (RDT) structure [12], named as RDT(2,2,1)/α. In RDT(2,2,1)/α, each node has links to form base torus (rank-0) and 2 upper tori with the cardinal number 2. RDT(2,2,1)/α provides a promising solution for interconnection networks of NoC designs with its distinct advantages: 1) high scalability due to its recursive structure, 2) low power consumption due to its smaller diameter and average distance, 3) architectural customizability with its embedded mesh/torus topologies, 4) fault-tolerance capability with a constant node degree of 8, and 5) feasibility of layout compatible with current and future VLSI technologies [7]. In this paper, we attempt to investigate various fault-tolerant routing schemes applicable to the RDT(2,2,1)/α-based interconnection network. As pointed out in [4][7], adaptive routing which employs virtual channels [2][3] is infeasible for NoCs due to the need of large buffers and lookup tables as well as complex shortest-path algorithms. Instead, we consider the deterministic routing based fault-tolerant routing schemes which reroute the packets by properly selecting the intermediate routers between the source and the destination nodes. The proposed fault-tolerance routing schemes have no adverse impacts on routing in the absence of faults. The rest of the paper is organized as follows. In Section 2, we give an overview of the RDT structure. In Section 3, we describe the NoC architecture, the delay model, and the routing algorithms when no faults are present in the interconnection network. In Section 4, we present the fault-tolerant routing algorithm under single link/node failure. In Section 5, the fault-tolerant routing algorithm under multiple link/node failures is presented. Section 6 concludes the paper. 2. RDT(2,2,1)/α Structure The RDT structure [12] is constructed by recursively overlaying 2-D diagonal meshes (tori). The base torus is a two-dimensional square array of N by N nodes, each of which is numbered with a two-dimensional number as follows: (0, 0) (1, 0) (2, 0) … (N-1, 0) (0, 1) (1, 1) (2, 1) … (N-1, 1) (0, 2) (1, 2) (2, 2) … (N-1, 2) . . . . . . . . . . . . . . . (0, N-1) (1, N-1) (2, N-1) … (N-1, N-1) where N = nk and both n and k are natural numbers. The torus network is formed with four links between node(x, y) and its neighboring four nodes: (mod(x±1, N), y) and (x, mod(y±1, N)). This base torus is also called rank-0 torus. On the rank-0 torus, a new torus-like network (rank-1 torus) is formed by adding four links between node (x, y) and nodes (x±n, y±n). The direction of the new torus-like network is at an angle of 45 degrees to the original torus. On the rank-1 torus, anothertorus-like network (rank-2 torus) can be formed by adding four links in the same manner. Similarly, a rank-(r+1) torus can be formed upon rank-r torus. An independent torus on rank-i is one that does not have links to other tori on rank-i. RDT(n, R, m) is a class of networks in which each node has links to form base torus (rank-0) and m upper tori (the maximum rank is R) with the cardinal number n. Thus, the degree of the RDT(n, R, m) is 4(m+1) [12]. One of the upper rank torus is assigned to each node in the RDT(n, R, m) after n, R, and m are set. Thus, the structure of the RDT(n, R, m) also varies with different assignments for upper rank tori to each node. This assignment is called torus assignment. A network in which every node has links to form all possible upper rank tori (i.e. RDT(n, R, R)) is called a perfect RDT (PRDT(n, R)), where n is the cardinal number, and R is the maximum rank. We refer a perfect torus as one that contains all links

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2 out of 6 pages.

UNLV ECG 702 - Fault-Tolerant Routing Schemes

Sign up for free to view:

Please select your school