A Low-Swing Crossbar and Link Generator for Low-Power Networks-on-Chip

Chia-Hsin Owen Chen1, Sunghyun Park2, Tushar Krishna1, Li-Shiuan Peh1
Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
1{owenhsin, tushar, peh}@csail.mit.edu, [email protected]

Abstract—Networks-on-Chip (NoCs) are emerging as the answer to non-scalable buses for connecting multiple cores in Chip Multi Processors (CMPs), and multiple IP blocks in Multi Processor Systems-on-Chip (MPSoCs). These networks require an extremely low-power datapath to ensure sustained scalability and higher performance/watt. Crossbars and links form the core of a network datapath, and integrating low-swing links within these will reduce power significantly. Low-swing links, however, require significant custom circuit design effort to deliver good power efficiency and high bit rate in the face of noise. As a result, low-swing links have not been able to make it to mainstream chips, which rely on crossbar and link generators from RTL. In this paper, we present a datapath generator that creates automated layouts for crossbars with noise-robust low-swing links within them. To the best of our knowledge, this is the first crossbar generator that (1) creates layouts, instead of generating just synthesizable RTL; and (2) integrates noise-robust low-swing links in an automated manner. We demonstrate our generated datapath in a fully-synthesized NoC router, and observe 50% power reduction on datapath.

I. INTRODUCTION

Continued transistor scaling has enabled more compute and storage units to be added on the same chip. However, power limitations have forced designers to go parallel and to realize sustained throughputs with simpler computing blocks connected together.
In the processor domain, the power limitations have resulted in the emergence of CMPs, while in the embedded domain, MPSoCs have started becoming popular. These trends put the interconnection fabric into the limelight to enable fast and low-power communication between these processing units. On-chip buses are not scalable beyond a few cores, since they are limited by wire delay and bandwidth [1]. There has been a trend towards using NoCs to manage wires more efficiently. For some systems, this network might comprise only a crossbar [2], while for others, an interconnection of packet-switched routers is used [3], [4], with each router comprised of buffers, arbiters, and a crossbar to enable sharing of links. In both kinds of systems, a crossbar is the fundamental building block that connects input ports to output ports.

A 1-bit N × M crossbar consists of N × M interconnected wires that are controlled by switches and enable any port to connect to any other port. The outputs of a crossbar connect to links that then connect to an IP block or a router. The crossbar and links thus together form the datapath of a NoC. This datapath has been found to dominate the NoC power consumption. Fabricated chips from academia, such as MIT RAW [5] and UT TRIPS [6], use RTL synthesis to generate the datapath, and the ratios of datapath power consumption to total on-chip network power consumption are reported to be 69% and 64%, respectively. Intel TERAFLOPs [4] uses a custom-designed double-pumped crossbar with a location based channel driver to reduce the channel area and peak channel driver current [7] and is thus able to reduce datapath power to 32% of the total on-chip network power. Other circuit techniques that have been proposed to reduce this power consumption involve dividing the crossbar wires into multiple segments and partially activating selected segments [8], [9] based on the input and output ports.
These circuit techniques present only the capacitance between the input and output port, and disable/reduce other capacitances. They are thus successful in reducing wasteful power consumption. However, they still require complete charging/discharging of the long wires from the input port to the output port and the core-core links, which are significant power consumers.

Low-swing signaling techniques can help mitigate the wire power consumption. The energy benefits of low-swing signaling have been demonstrated on-chip from 10 mm equalized global wires [10], through 1-2 mm core-to-core links [11], to less than 1 mm within crossbars [12]–[14]. However, such low-swing signaling circuits, which can be viewed as analog circuits, require full custom design, resulting in substantial design time overhead. Circuit designers have to manually design schematic/netlists, optimize logic gates for each timing path, and size individual transistors. Moreover, layout engineers have to manually place all the transistors and route their nets with careful consideration of circuit symmetry and noise coupling. This custom design process leads to high development cost, long and uncertain verification timescales, and poor interface to other parts of a many-core chip, which are mostly RTL-based.

In the past, designers faced similar challenges while integrating low-power memory circuits with the VLSI CAD flow, with their sense amplifiers, self-timed circuits and dynamic circuits. Memory compilers, which are now commonplace, have solved the problem and enabled these sophisticated analog circuits to be automatically generated, subject to variable constraints specified by the users.
This paper proposes to similarly automate and generate low-swing signaling circuits as part of the datapath (crossbar and links) of a NoC, thereby integrating such circuits within the CAD flow of many-core chips, enabling their broad adoption.

Since crossbars and links are such an essential component of on-chip networks, there have been efforts in the past to automate their generation. Sredojevic and Stojanovic [15] presented a framework for design-space exploration of equalized links, and a tool that generates an optimized transistor schematic. However, they rely on custom-design for the actual layout. ARM AMBA [16], STMicroelectronics STBus [17], Sonics MicroNetworks [18], and IBM CoreConnect [19] are examples of on-chip bus generators; DX-Gt [20] is a crossbar generator; and ×pipes [21] is a network interface, switch and link generator. These tools are aimed at application specific network-on-chip (NoC) component generation, but they all stop at the synthesizable HDL level, i.e. they generate RTL, and then rely on synthesis and place-and-route tools to generate the final design. This is not the most efficient way to design crossbars, as we show later in Section IV,
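
As an illustration of the 1-bit N × M crossbar described in the introduction, the following is a minimal behavioral sketch in Python. It is not taken from the paper and says nothing about the generated circuits or layouts; the class name `Crossbar`, the methods `connect` and `route`, and the rule that each output has at most one driver at a time are assumptions made for this example.

```python
from typing import List, Optional


class Crossbar:
    """Behavioral model of a 1-bit N x M crossbar (hypothetical, illustrative only)."""

    def __init__(self, num_inputs: int, num_outputs: int) -> None:
        self.num_inputs = num_inputs
        self.num_outputs = num_outputs
        # switch[i][j] is True when the crosspoint joining input i to output j is closed.
        # There are num_inputs * num_outputs crosspoints in total.
        self.switch = [[False] * num_outputs for _ in range(num_inputs)]

    def connect(self, inp: int, out: int) -> None:
        """Close crosspoint (inp, out), first opening any other driver of `out`."""
        for i in range(self.num_inputs):
            self.switch[i][out] = False
        self.switch[inp][out] = True

    def route(self, bits: List[int]) -> List[Optional[int]]:
        """Return the bit seen on each output; None marks an undriven output."""
        if len(bits) != self.num_inputs:
            raise ValueError("expected one bit per input port")
        outputs: List[Optional[int]] = [None] * self.num_outputs
        for j in range(self.num_outputs):
            for i in range(self.num_inputs):
                if self.switch[i][j]:
                    outputs[j] = bits[i]
        return outputs


# Usage: a 3 x 2 crossbar with input 2 driving output 0 and input 0 driving output 1.
xbar = Crossbar(num_inputs=3, num_outputs=2)
xbar.connect(inp=2, out=0)
xbar.connect(inp=0, out=1)
print(xbar.route([1, 0, 0]))  # [0, 1]
```

The model only captures connectivity. The wire capacitance and voltage swing that dominate datapath power, which are the focus of the paper, are deliberately left out of this sketch.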

