CS250 VLSI Systems Design
L2: Design Representations
John Wawrzynek, Krste Asanovic, with John Lazzaro and Yunsup Lee (TA)
Lecture 2, Design Representations, CS250, UC Berkeley, Fall 2009

Engineering Challenge
Application
    (Gap usually too large to bridge in one step, but there are exceptions)
Physics

Magnetic Compass
An exception: Application sits directly on Physics.

Design Abstraction Stack
- Application
- Unit-Transaction Level (UTL)
- Register-Transfer Level (RTL)
- Gates
- Circuits
- Devices (Transistors)
- Physics
[Figure: transistor cross-section (n, oxide, p, n) and band diagram (conduction band, Eg, valence band)]

Properties of a Useful Abstraction
- Hides less important details, e.g., for RTL, don't worry how combinational logic is decomposed into logic gates
- Allows control of more important details, e.g., the RTL designer still controls how much logic is performed between any two registers
- If done right, provides portable efficiency, i.e., the same RTL can be implemented as custom logic, standard cells, FPGA, or even vacuum-tube logic with reasonably good results

CS250 Design Abstractions
- Application
- Unit-Transaction Level (UTL) [primary design abstraction]
- Register-Transfer Level (RTL) [primary design abstraction]
- Gates [interface to technology]
- Circuits [UCB EE141/241]
- Devices (Transistors) [UCB EE130/230]
- Physics

CS250 Design Refinement
- Application (C/C++)
    Architecture design: manual
- UTL (C/C++)
    Microarchitecture design: manual
- RTL (Verilog)
    Synthesis + Place & Route: automated
- Gates (Stdcell Library)

Course Prerequisites
- B in CS150 for UCB undergrads, or equivalent for incoming grad students
- This means you should have seen RTL and Verilog/VHDL before
- We won't be covering Verilog coding details in lecture, but there is some coverage in section handouts

RTL Representation
[Figure: two Combinational Logic blocks and a Clock]
When writing Verilog, be sure to separate RTL code into pure state and pure logic.

Application to RTL in One Step
Modern hardware systems have complex functionality (graphics chips, video encoders, wireless communication channels), but sometimes designers try to map directly to an RTL cycle-level architecture in one step.
- Requires detailed cycle-level design of each sub-unit
- Significant design effort required before it is clear whether the design will meet goals
- Interactions between units become unclear if arbitrary circuit connections are allowed between units, with possible cycle-level timing dependencies
- Increases complexity of unit specifications
- Removes degrees of freedom for unit designers
- Reduces possible space for architecture exploration
- Difficult to document intended operation, therefore difficult to verify

Example Difficult Design Problem
The humble shift register.
For today's lecture, we'll assume clock distribution is not an issue.

First Complication: Output Stall
- The shift register should only move data to the right if the output is ready to accept the next item (Ready signal)
- What complication does this introduce? Need to fan out to the enable signal on each flop

Stall Fan-Out Example
[Figure: Ready driving Enable]
- 200 bits per shift register stage, 16 stages: 3200 flip-flops
- How many fanout-of-four gate delays to buffer up the ready signal? $\log_4(3200) \approx 5.82$
- This doesn't include any penalty for driving enable signal wiring

Loops Prevent Arbitrary Resizing
[Figure: Shift Register Module, Receiving Module, Ready, Ready Logic]
We could increase the size of the gates in the ready logic block to reduce the fan-out required to drive the ready signal to the flop enables, BUT this increases the load on the flops, so they have to get bigger: a vicious circle.

Second Complication: Bubbles
- Sender doesn't have valid data every clock cycle: empty "bubbles" are inserted into the pipeline
- Would like to "squeeze" bubbles out of the pipeline
[Figure: Ready and Valid signals; Stage 1 to Stage 4 plotted against Time]

Logic to Squeeze Bubbles
[Figure: Ready, Enable, Valid, Valid]
- Can move one stage to the right if Ready is asserted, or if there is any bubble in the stages to the right of the current stage
- Fan-in of the number of valid signals grows with the number of pipeline stages
- Fan-out of each stage's valid signal also grows with the number of pipeline stages
- Results in slow combinational paths as the number of pipeline stages grows

Decoupled Design Discipline
- The shift register is a simple example that illustrates the control complexity problems of any large synchronous pipeline
- Usually there are even more complex interactions between stages
[Figure: two Combinational Logic blocks and a Clock]
- To avoid these problems (and many others), designers will use a decoupled design discipline, where moderate-size synchronous units (10-100K gates) are connected by decoupling FIFOs or channels

Decoupled Architectures and Unit-Transaction Level Design

CS250 Design Refinement
- Application (C/C++)
    Architecture design: manual
- UTL (C/C++)
    Microarchitecture design: manual
- RTL (Verilog)
    Synthesis + Place & Route: automated
- Gates (Stdcell Library)

Unit-Transaction Level Design
[Figure: Unit 1, Unit 2, and Unit 3, each with Arch. State, plus a Shared Memory Unit]
- Model the design as messages flowing through FIFO buffers between units containing architectural state
- Each unit can independently perform an operation, or "transaction", that may consume messages, update local state, and send further messages
- Transaction and/or communication might take many cycles
- Have to design RTL of unit microarchitecture during design refinement

Unit Architectural State
[Figure: Arch. State]
- Architectural state is any state that is visible to an external agent, i.e., architectural state can be observed by sending strings of packets into input queues and looking at values returned at outputs
- High level
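
The stall fan-out estimate from the shift register example can be checked with a few lines of Python. This is only a sketch of the arithmetic on the slide; the function name `fo4_buffer_stages` is made up for illustration, and, like the slide, it ignores wiring load.

```python
import math

def fo4_buffer_stages(num_loads: int, fanout: int = 4) -> float:
    """Ideal (fractional) depth of a buffer tree in which every
    buffer drives `fanout` loads: log base `fanout` of num_loads."""
    return math.log(num_loads) / math.log(fanout)

bits_per_stage = 200   # from the slide
num_stages = 16        # from the slide
flops = bits_per_stage * num_stages

depth = fo4_buffer_stages(flops)
whole_stages = math.ceil(depth)  # a real tree needs a whole number of levels

print(flops)            # 3200
print(round(depth, 2))  # 5.82
print(whole_stages)     # 6
```

Rounding up to six levels is an added observation, not something stated on the slide: each level is one fanout-of-four gate delay on the ready path before any wire delay is counted.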
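
The bubble-squeezing rule from the shift register example can be modelled behaviourally as below. This is a minimal sketch under stated assumptions, not the lecture's RTL: stages are a Python list with index 0 on the sender side, `None` stands for a bubble, and the names `enables` and `step` are hypothetical.

```python
from typing import List, Optional, Tuple

def enables(valid: List[bool], ready: bool) -> List[bool]:
    """Register k may load from its left neighbour if Ready is asserted
    or if stage k or any stage to its right holds a bubble."""
    return [ready or not all(valid[k:]) for k in range(len(valid))]

def step(stages: List[Optional[str]], new_item: Optional[str],
         ready: bool) -> Tuple[List[Optional[str]], Optional[str], bool]:
    """Advance the pipeline by one clock cycle.
    Returns (next stages, item taken by the receiver, sender item accepted)."""
    en = enables([s is not None for s in stages], ready)
    out = stages[-1] if ready else None
    nxt = list(stages)
    for k, e in enumerate(en):
        if e:
            nxt[k] = stages[k - 1] if k > 0 else new_item
    return nxt, out, en[0]

# Receiver stalled, one bubble in the second stage: the bubble is squeezed out.
pipe1, out1, took1 = step(["C", None, "B", "A"], "D", ready=False)
# Still stalled and now full: nothing moves and the sender is back-pressured.
pipe2, out2, took2 = step(pipe1, "E", ready=False)
# Receiver ready: every stage shifts right and "A" leaves the pipeline.
pipe3, out3, took3 = step(pipe2, "E", ready=True)

print(pipe1, out1, took1)
print(pipe2, out2, took2)
print(pipe3, out3, took3)
```

Note that `enables` for register 0 reads every valid bit plus Ready, which is the growing fan-in the slide warns about; in hardware that reduction sits on a combinational path within a single cycle.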
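
The unit-transaction view described above can be illustrated with a toy model: units hold architectural state, communicate only through FIFO queues, and fire a transaction whenever an input message is available. Everything here is an assumption made for illustration (the `Accumulator` and `OddCounter` units, the `run` scheduler, and the use of `collections.deque` as the FIFO); it is not the course's UTL framework.

```python
from collections import deque

class Accumulator:
    """Hypothetical unit whose architectural state is a running total."""
    def __init__(self, inq: deque, outq: deque):
        self.inq, self.outq = inq, outq
        self.total = 0  # architectural state

    def can_fire(self) -> bool:
        return len(self.inq) > 0

    def transaction(self) -> None:
        msg = self.inq.popleft()      # consume a message
        self.total += msg             # update local state
        self.outq.append(self.total)  # send a further message

class OddCounter:
    """Hypothetical unit that counts how many odd messages it has seen."""
    def __init__(self, inq: deque, outq: deque):
        self.inq, self.outq = inq, outq
        self.odd_count = 0  # architectural state

    def can_fire(self) -> bool:
        return len(self.inq) > 0

    def transaction(self) -> None:
        msg = self.inq.popleft()
        if msg % 2 == 1:
            self.odd_count += 1
        self.outq.append(self.odd_count)

def run(units) -> None:
    """Fire any unit that can fire until the whole system is idle."""
    fired = True
    while fired:
        fired = False
        for unit in units:
            if unit.can_fire():
                unit.transaction()
                fired = True

def simulate(reverse_order: bool) -> list:
    q0, q1, q2 = deque([1, 2, 3, 4]), deque(), deque()
    units = [Accumulator(q0, q1), OddCounter(q1, q2)]
    if reverse_order:
        units.reverse()
    run(units)
    return list(q2)

result_forward = simulate(reverse_order=False)
result_reverse = simulate(reverse_order=True)
print(result_forward, result_reverse)
```

In this sketch the output stream is the same whichever unit the scheduler tries first, because the units interact only through the queues. The only way to observe `total` or `odd_count` is to push messages into `q0` and read `q2`, which matches the definition of architectural state given above.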