TIMING DRIVEN PHYSICAL DESIGN FOR DIGITAL SYNCHRONOUS VLSI CIRCUITS USING RESONANT CLOCKING
BARIS TASKIN, JOHN WOOD, IVAN S. KOURTEV
February 28, 2005

Research Objective
  - Objective: Electronic design automation and synchronization of digital IC systems with rotary resonant clocking technology

Clocking at GHz
  - Problems
      - Low skew, low jitter uncharacteristic
      - Timing violations
      - Power dissipation
  - Some solutions
      - Multi-domain clocking
      - Skew-tolerant multi-phase clocking
  - Alternative technologies
      - Optical clocking
      - Transmission line based clocking

Resonant Clocking [1]

  Oscillator Type    Phase       Voltage
  Coupled LC         Constant    Constant
  Standing Wave      Constant    Variable
  Traveling Wave     Variable    Constant

  [1] Abstract from IBM Research, http://www.reseach.ibm.com/compsci/project_spotlight/vlsi

Transmission Line
  - Long interconnect
  - L variation with process: 1%
  - C variation with process: 30%
  - Vp variation with process: 15%

Mobius Termination
  - Shunt-connected inverters between lines
  - fosc: (formula not recoverable from the extracted text; it involves the line inductance L)
  - 2 laps to complete 360° phase

Rotary Clock Waveforms
  - Waveforms for line voltage and line current at 2.4 GHz

Rotary Clocking
  - Low jitter
      - 6 ps for 2.4 GHz (0.25 um)
      - 1% of clock period
  - Non-sinusoidal clock signal
      - 20 ps rise and fall times (0.25 um)
      - 5% of the clock period
  - 16 GHz theoretical upper limit in 0.25 um

Rotary Cycles
  - 360° phase ring
  - Multi-phase
  - No distribution: generated across the die
  - Energy preserving
  - Self-replenishing

ASIC Implementation
  - Capacitive loading
      - Reduces propagation velocity
      - Independent of parasitic capacitance
      - Increases current in wires, but no CV^2f power

Rotary Wires for ASIC
  - Rotary distribution
  - Synchronous components

Modes of Operation
  - ASIC drive
      - Global rotary clock to synchronize any number of
          - Derived clocks
          - Other global signals: Reset, Enable, Step, Scan
      - Retain standard FFs
      - Minimal flow impact
  - Direct drive
      - Maximum power benefit
      - One high-frequency clock grid over whole chip, directly driving all FFs
      - Custom FFs for lowest power
      - Modified flow

DFF Load
  - High internal capacitance
  - High dynamic power consumption
  - Direct drive: rotary clock drives Nfet and Pfet pass devices directly

Latch Load
  - Less clocked C: saves CV^2f power
  - No need to gate clock, only data

Rotary Modes

CAD: Extraction and Simulation
  - RLC extraction for rotary, RC for data
  - Fast SPICE for confirmation
  - Internal STA engine

Physical Design Flow
  [Flowchart; box labels in extraction order, grouped by stage]
  - Design entry
  - Partitioning stage: Partitioning, ROA size, Partitioning, Register insertion, ROA feasible? (NO / YES)
  - CSS stage: Clock skew scheduling, CSS on partition I, ..., CSS on partition N, CSS on top block, CSS feasible? (NO / YES)
  - Placement stage: Placement, Register mapping, Logic placement

CAD Placement & Route (1)
  - Select rotary rings
  - Physical implementation

CAD Placement & Route (2)
  - Clock pin
  - Identify communicating register-to-register paths
  - Partitioning
  - Static timing analysis
  - Clock skew scheduling: 28% average improvement
  - Parallelization

CSS Parallelization
  - 10k registers, 25k local paths: 2.5 hours
  - 10 x 10 rotary clocking, 150 registers, 500 paths: 2 secs
  - Speed-up: 44X without parallelization, 1286X with parallelization
  - Sub-optimality

CAD Placement & Route (3)
  - [Figure: ring tapping points labeled 45°, 225°, 0°, 180°, 270°, 90°, 315°, 135°; T/4 delay]
  - Pre-place register banks
  - Map registers to phase
  - Proceed with logic synthesis

Conclusions
  - Look ahead to next generation
      - Rotary clocking
      - Non-zero clock skew
      - Parallelization
  - Implementation results to follow

QUESTIONS
DESIGN AND TIMING ANALYSIS OF LEVEL SENSITIVE DIGITAL INTEGRATED CIRCUITS

BACKUP SLIDES

Clock Period Minimization Problem (1)
  - Objective function: min T
  - Problem variables, for each register Ri
      - Earliest/latest arrival times: ai, Ai
      - Earliest/latest departure times: di, Di
      - Clock signal delay: ti

Clock Period Minimization Problem (2)
  - Problem parameters
      - For each register Ri
          - Clock-to-output delay: DCQ
          - Data-to-output delay: DDQ
          - Setup time: Si
          - Hold time: Hi
      - For each local data path Ri -> Rj
          - Data propagation time: DPif

Practical Causes of Clock Skew
  - Size mismatches
      - Buffer size
      - Interconnect length
  - Process variations: Leff, Tox, etc.
  - Temperature gradients
  - Power supply voltage drop

Rotary Implementation
  - Odd number of crossovers
  - Multi-phase
      - Relative phase information on ring
      - Non-zero clock skew
  - Cross-coupled inverters
      - Low power

Capacitive Loading of the Rotary Ring
  - 4.5 pF each side
  - 0.13u x 10834e-18 micron sq: 12.7 fF on each gate
  - Assume 10 fF on each line for wiring cap of spur: 22.7 fF
  - 200 loads: 4.5 pF each side

Benefits of Rotary Clock Architecture
  - No practical upper frequency limitation
  - No practical size limitation
  - Negates the dynamic clock power
  - Guaranteed near-zero skew
  - Precise skew scheduling possible
  - Negligible jitter

Benefits of Rotary Clock Architecture (cont'd)
  - Largely independent of
      - Process variations
      - Temperature variations
      - Supply voltage
  - Inherently low noise
      - No SSN generated by clock
      - Differential: greater immunity to noise, less generation of noise

Benefits of Rotary Clock Architecture (cont'd)
  - Works for all existing IC processes
  - Short and predictable design cycle
  - Automated CAD tooling
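
Worked check of the quoted figures. The short Python sketch below redoes the back-of-envelope arithmetic behind several numbers on the slides: the ring loading (12.7 fF gate plus 10 fF wiring, times 200 loads), the jitter and edge times as a fraction of the 2.4 GHz clock period, and the mapping from a tap phase on the ring to a delay expressed as a fraction of the period T. The function names are illustrative and do not come from the presentation; only the input values are taken from the slides.

```python
# Back-of-envelope checks for figures quoted on the slides.
# Function names are hypothetical; only the input numbers come from the deck.


def clock_period_ps(freq_ghz: float) -> float:
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def fraction_of_period(time_ps: float, freq_ghz: float) -> float:
    """A time interval expressed as a fraction of the clock period."""
    return time_ps / clock_period_ps(freq_ghz)


def ring_load_pf(gate_ff: float, wire_ff: float, n_loads: int) -> float:
    """Total capacitive load on one side of the ring, in pF."""
    return (gate_ff + wire_ff) * n_loads / 1000.0


def phase_to_delay_fraction(phase_deg: float) -> float:
    """Delay of a ring tap at the given phase, as a fraction of the period T."""
    return (phase_deg % 360.0) / 360.0


if __name__ == "__main__":
    print("period at 2.4 GHz (ps):", round(clock_period_ps(2.4), 1))
    print("6 ps jitter / period:", round(fraction_of_period(6.0, 2.4), 4))
    print("20 ps edge / period:", round(fraction_of_period(20.0, 2.4), 4))
    print("ring load per side (pF):", round(ring_load_pf(12.7, 10.0, 200), 2))
    for phase in (0, 45, 90, 135, 180, 225, 270, 315):
        print(f"tap at {phase} deg -> delay {phase_to_delay_fraction(phase)} T")
```

Under these inputs the load comes to 22.7 fF x 200 = 4.54 pF per side, which the slide rounds to 4.5 pF, and the 90° tap corresponds to the T/4 delay shown in the placement figure.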
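
Illustrative sketch of clock skew scheduling. The flow above relies on clock skew scheduling (CSS) to pick a clock delay ti for each register so that the period T can be reduced. The backup slides pose this as a minimization over arrival and departure times for level-sensitive circuits; the sketch below is NOT that formulation. As a simplifying assumption it uses the textbook edge-triggered model, in which each path Ri -> Rj gives a setup constraint ti - tj <= T - Dmax - S and a hold constraint tj - ti <= Dmin - H, checks feasibility of these difference constraints with Bellman-Ford, and bisects on T. All names, the three-register example, and its delays are hypothetical.

```python
# Minimal clock skew scheduling sketch (edge-triggered model, an assumption;
# the slides' level-sensitive formulation is not reproduced here).
from typing import List, Optional, Tuple

# (source register, destination register, min path delay, max path delay)
Path = Tuple[int, int, float, float]


def feasible_schedule(n: int, paths: List[Path], period: float,
                      setup: float = 0.0, hold: float = 0.0
                      ) -> Optional[List[float]]:
    """Return clock delays t[0..n-1] meeting all constraints, or None."""
    # A constraint x_v - x_u <= w becomes an edge u -> v with weight w.
    edges = []
    for i, j, dmin, dmax in paths:
        edges.append((j, i, period - dmax - setup))  # setup: t_i - t_j <= ...
        edges.append((i, j, dmin - hold))            # hold:  t_j - t_i <= ...
    dist = [0.0] * n  # implicit virtual source connected to every register
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return dist  # converged: dist is a valid schedule
    return None  # still relaxing after n passes: negative cycle, infeasible


def min_period(n: int, paths: List[Path], setup: float = 0.0,
               hold: float = 0.0, tol: float = 1e-6
               ) -> Optional[Tuple[float, List[float]]]:
    """Bisect on the period; feasibility is monotone in the period."""
    lo = 0.0
    hi = max(max(p[3] for p in paths) + setup, 1e-9)
    for _ in range(64):  # grow the upper bound until it is feasible
        if feasible_schedule(n, paths, hi, setup, hold) is not None:
            break
        hi *= 2.0
    else:
        return None  # hold constraints alone are inconsistent
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible_schedule(n, paths, mid, setup, hold) is not None:
            hi = mid
        else:
            lo = mid
    return hi, feasible_schedule(n, paths, hi, setup, hold)


if __name__ == "__main__":
    # Hypothetical three-register loop: R0 -> R1 -> R2 -> R0.
    example = [(0, 1, 2.0, 6.0), (1, 2, 1.0, 2.0), (2, 0, 1.0, 4.0)]
    print(min_period(3, example))
```

In this hypothetical loop the longest path delay is 6, so a zero-skew design would need a period of at least 6 with zero setup time, while the scheduled period is limited by the tighter of the loop average and the per-path spread Dmax - Dmin. In a rotary flow the resulting delays would then be matched to tapping phases on the ring, which this sketch does not attempt.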