Low-Swing Crossbar and Link Generator for Low-Power Networks-on-Chip




Chia-Hsin Owen Chen¹, Sunghyun Park², Tushar Krishna¹, Li-Shiuan Peh¹
Dept. of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
¹{owenhsin, tushar, peh}@csail.mit.edu, ²pshking@mit.edu

Abstract: Networks-on-Chip (NoCs) are emerging as the answer to non-scalable buses for connecting multiple cores in Chip Multi-Processors (CMPs) and multiple IP blocks in Multi-Processor Systems-on-Chip (MPSoCs). These networks require an extremely low-power datapath to ensure sustained scalability and higher performance/watt. Crossbars and links form the core of a network datapath, and integrating low-swing links within these will reduce power significantly. Low-swing links, however, require significant custom circuit design effort to deliver good power efficiency and high bit rate in the face of noise. As a result, low-swing links have not been able to make it to mainstream chips, which rely on crossbar and link generators from RTL. In this paper, we present a datapath generator that creates automated layouts for crossbars with noise-robust low-swing links within them. To the best of our knowledge, this is the first crossbar generator that (1) creates layouts instead of generating just synthesizable RTL, and (2) integrates noise-robust low-swing links in an automated manner. We demonstrate our generated datapath in a fully synthesized NoC router and observe a 50% power reduction on the datapath.

I. INTRODUCTION

Continued transistor scaling has enabled more compute and storage units to be added on the same chip. However, power limitations have forced designers to go parallel and to realize sustained throughput with simpler computing blocks connected together. In the processor domain, these power limitations have resulted in the emergence of CMPs, while in the embedded domain, MPSoCs have started becoming popular. These trends put the interconnection fabric into the limelight to enable fast and low-power
communication between these processing units. On-chip buses are not scalable beyond a few cores, since they are limited by wire delay and bandwidth [1]. There has been a trend towards using NoCs to manage wires more efficiently. For some systems, this network might comprise only a crossbar [2], while for others an interconnection of packet-switched routers is used [3], [4], with each router comprising buffers, arbiters, and a crossbar to enable sharing of links.

In both kinds of systems, a crossbar is the fundamental building block that connects input ports to output ports. A 1-bit N × M crossbar consists of N × M interconnected wires that are controlled by switches and enable any port to connect to any other port. The outputs of a crossbar connect to links that then connect to an IP block or a router. The crossbar and links thus together form the datapath of a NoC.

This datapath has been found to dominate NoC power consumption. Fabricated chips from academia such as MIT RAW [5] and UT TRIPS [6] use RTL synthesis to generate the datapath, and the ratios of datapath power consumption to total on-chip network power consumption are reported to be 69% and 64%, respectively. Intel TeraFLOPS [4] uses a custom-designed double-pumped crossbar with a location-based channel driver to reduce the channel area and peak channel driver current [7], and is thus able to reduce datapath power to 32% of the total on-chip network power. Other circuit techniques that have been proposed to reduce this power consumption involve dividing the crossbar wires into multiple segments and partially activating selected segments [8], [9] based on the input and output ports. These circuit techniques present only the capacitance between the input and output port, and disable/reduce other capacitances. They are thus successful in reducing wasteful power consumption. However, they still require complete charging/discharging of the long wires from the input port to the output port, and of the core-to-core links, which are significant power consumers.

Low-swing signaling techniques can help mitigate the wire power consumption. The energy benefits of low-swing signaling have been demonstrated on-chip from 10 mm equalized global wires [10], through 1-2 mm core-to-core links [11], to less than 1 mm within crossbars [12]-[14]. However, such low-swing signaling circuits, which can be viewed as analog circuits, require full custom design, resulting in substantial design-time overhead. Circuit designers have to manually design schematic netlists, optimize logic gates for each timing path, and size individual transistors. Moreover, layout engineers have to manually place all the transistors and route their nets with careful consideration of circuit symmetry and noise coupling. This custom design process leads to high development cost, long and uncertain verification timescales, and a poor interface to other parts of a many-core chip, which are mostly RTL-based.

In the past, designers faced similar challenges while integrating low-power memory circuits, with their sense amplifiers, self-timed circuits, and dynamic circuits, into the VLSI CAD flow. Memory compilers, which are now commonplace, have solved the problem and enabled these sophisticated analog circuits to be automatically generated subject to variable constraints specified by the users. This paper proposes to similarly automate and generate low-swing signaling circuits as part of the datapath (crossbar and links) of a NoC, thereby integrating such circuits within the CAD flow of many-core chips and enabling their broad adoption.

II. BACKGROUND

In this section, we present background on crossbars, low-swing links, and the limitations of the current synthesis flow.

A. Crossbar

An N × M crossbar connects N inputs to M outputs with no intermediate stages, where any input can send data to any non-busy output. Figure 1 shows the schematic of a 2-bit

[Fig. 1(a): Port-sliced organization. Signal labels: Din0 to Din3 and Dout0 to Dout3, bits 0 and 1.]

Since crossbars and links are such an essential component of on-chip networks, there have been efforts in the past to automate their generation. Sredojevic and Stojanovic [15] presented a framework for design space exploration of equalized links and a tool that generates an optimized transistor schematic. However, they rely on custom design for the actual layout. ARM AMBA [16], STMicroelectronics STBus [17], Sonics
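
As a rough behavioral illustration of the N × M crossbar described in Section II-A, the following Python sketch models the switch settings as a map from output port to input port. This is only an assumed abstraction for illustration: the function name `crossbar`, the `select` map, and the use of `None` for an idle output are hypothetical and are not taken from the paper, which concerns circuit and layout generation rather than software models.

```python
from typing import Dict, List, Optional


def crossbar(inputs: List[int], select: Dict[int, int],
             num_outputs: int) -> List[Optional[int]]:
    """Behavioral model of an N x M crossbar (hypothetical sketch).

    inputs      -- data word currently present on each of the N input ports
    select      -- maps an output port index to the input port that drives it
    num_outputs -- M, the number of output ports

    Because `select` is keyed by output port, each output has at most one
    driver, mirroring the rule that only non-busy outputs can accept data.
    Outputs with no entry in `select` stay idle (None).
    """
    outputs: List[Optional[int]] = [None] * num_outputs
    for out_port, in_port in select.items():
        if not 0 <= out_port < num_outputs:
            raise ValueError(f"output port {out_port} out of range")
        if not 0 <= in_port < len(inputs):
            raise ValueError(f"input port {in_port} out of range")
        # Closing the switch at (in_port, out_port) forwards the data word.
        outputs[out_port] = inputs[in_port]
    return outputs


# Example: a 3 x 4 crossbar where input 2 drives output 0
# and input 0 drives output 1; outputs 2 and 3 stay idle.
result = crossbar([10, 20, 30], {0: 2, 1: 0}, num_outputs=4)
```

Keying the configuration by output port is a design choice of this sketch: it makes conflicting drivers on one output unrepresentable, while still allowing one input to feed several outputs.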

