VLSI Systems Design, CS250, Fall 2020
John Wawrzynek with Arya Reais-Parsi
Lecture 03: Reconfigurable Architectures 1
CS250, UC Berkeley, Fall 20

Implementation Alternatives
- Full custom: all circuits/transistors/layouts optimized for the application.
- Standard cell: arrays of small function blocks (gates, FFs), automatically placed and routed.
- Gate array (structured ASIC): partially prefabricated wafers, customized with metal layers or vias.
- FPGA: prefabricated chips, customized with loadable latches or fuses.
- Microprocessor: instruction set interpreter, customized through software.
- Domain-specific processor: special instruction set interpreters (ex: DSP, NP, GPU, TPU).
By "ASIC", most people mean a standard-cell-based implementation.
What are the important metrics of comparison?

The Important Distinction: Instruction Binding Time
When do we decide what operation needs to be performed?
General principles:
- The earlier the decision is bound, the less area, delay, and energy are required for the implementation.
- The later the decision is bound, the more flexible the device.
(A. DeHon)

Full Custom
- Circuit styles and transistors are custom sized and drawn to optimize die size, power, and performance.
- High NRE (non-recurring engineering) costs: time-consuming and error-prone layout.
- Optimizing for a small die can result in low per-unit costs, extreme low power, or extreme high performance.
- Common for analog design.
- Requires a full set of custom masks.
- High NRE usually restricts use to high-volume applications/markets, or to highly constrained and cost-insensitive markets.

Standard Cell
- Based around a set of pre-designed (and verified) cells. Ex: NANDs, NORs, flip-flops, counters, buffers.
- Each cell comes complete with:
  - layout (perhaps for different technology nodes and processes);
  - simulation, delay, and power models.
- Chip layout is automatic, reducing NREs (usually no hand layout).
- Requires a full set of masks: nothing is prefabricated.
- Non-optimal use of area and power, leading to higher per-die costs than full custom.
- Commonly used with other predesigned blocks (large memories, I/O blocks, etc.).

Modern ASIC Methodology and Flow
RTL synthesis based:
- The HDL specifies the design as RTL (Verilog/VHDL): cell instantiations, combinational logic, and state elements.
- Cell instantiations are needed for blocks not inferred by synthesis (typically RAM).
- Event simulation verifies the RTL.
- Formal verification compares the logical structure of the gate netlist to the RTL.
- Place and route generates the layout.
- Timing and power are checked statically or dynamically.
- Layout is verified with LVS and formal verification.
[Figure: flow diagram. Specification, RTL, event simulator, logic synthesis, gate netlist, cell place and route, GDS, timing and power analysis, DRC/LVS/other checks.]

Semi-Custom Chip Implementations
- Ex: standard practice in microprocessors was that data paths were full custom and control (instruction decode, pipeline control) was in standard cells. Now everything is generated with standard cells.
- Control:
  - random logic, difficult to regularize;
  - a relatively small percentage of die area and power;
  - standard cells permit late binding of design changes.

Gate Array
- Store prefabricated wafers of "active" gate layers and local interconnect, comprising primarily rows of transistors.
- Customize as needed with "back-end" metal processing (contact cuts, metal wires). This could use a different factory.
- CAD software understands how to make gates, but it is also possible to customize at the transistor/circuit level.

Gate Array (continued)
- Shifts a large portion of design and mask NRE to the vendor.
- Shorter design and processing times, so reduced time to market.
- Highly structured layout with fixed-size transistors leads to large sub-circuits (ex: flip-flops) and higher per-die costs.
- Memory arrays are particularly inefficient, so they are often prefabricated as well.
- Also known as: sea-of-gates, structured ASIC, master-slice.

Field Programmable Gate Arrays
- Two-dimensional array of simple logic and interconnection blocks.
- Typical architecture: LUTs implement any function of n inputs (n = 3 in this case).
- Optional flip-flop with each LUT.
- Fuses, EPROM, or static RAM cells are used to store the "configuration". Here, it determines the function implemented by the LUT, the selection of the flip-flop, and the interconnection points.
- Many FPGAs include special circuits to accelerate adder carry chains, and many special cores: RAMs, MAC, Enet, PCI, SERDES, ...

FPGA versus ASIC
[Figure: total cost versus volume for FPGA and ASIC. FPGAs are cost-effective at low volume, ASICs at high volume.]
- ASIC:
  - higher NRE costs (10's of $M);
  - relatively low cost per die (10's of $ or less).
- FPGAs:
  - low NRE costs;
  - relatively low silicon efficiency, so high cost per part (10's of $ to 1000's of $).
- The crossover volume from a cost-effective FPGA design to an ASIC was often in the 100K range.
- But there's more to the story. What's the value of reconfigurability?

System on Chip (SOC)
- Brings together: standard cell blocks, custom analog blocks, processor cores, memory blocks, embedded FPGAs, ...
- Standardized on-chip buses (or hierarchical interconnect) permit "easy" integration of many blocks. Ex: AMBA, Sonics, ...
- "IP block" business model: hard or soft cores available from third-party designers.
- ARM, Inc. is the shining example: hard and "synthesizable" RISC processors.
- ARM and other companies provide Ethernet, USB controllers, analog functions, memory blocks, ...
- Pre-verified block designs and standard bus interfaces (or adapters) ease integration: lower NREs, shorter TTM.
[Figure: Qualcomm Snapdragon.]

FPGA Overview
Basic idea: a two-dimensional array of logic blocks and flip-flops, with a means for the user to configure (program):
1. the interconnection between the logic blocks,
2. the function of each block.
[Figure: simplified version of FPGA internal architecture.]

Original FPGA
- Invented in 1985 by Ross Freeman after founding Xilinx.
[Figure: drawing from US patent US4870302, drawings page 6.]

xc2064
- 64 configurable logic blocks.
- 58 user input/outputs.
[Figure: xc2064.]
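The statement that an n-input LUT can implement any function of its inputs can be made concrete with a small model. The sketch below is illustrative only and does not describe any particular device: the names `LUT` and `program_lut` are hypothetical, and it assumes the configuration is simply a list of 2^n bits addressed by the input values, with input 0 as the least significant address bit.

```python
class LUT:
    """Toy n-input lookup table: one configuration bit per input combination."""

    def __init__(self, n_inputs, config_bits):
        if len(config_bits) != 2 ** n_inputs:
            raise ValueError("an n-input LUT needs exactly 2**n configuration bits")
        self.n_inputs = n_inputs
        self.config_bits = list(config_bits)

    def evaluate(self, inputs):
        # The inputs form an address into the configuration memory.
        # Assumption: inputs[0] is the least significant address bit.
        address = sum(bit << i for i, bit in enumerate(inputs))
        return self.config_bits[address]


def program_lut(n_inputs, func):
    """Build the configuration bits by tabulating func over every input pattern."""
    bits = []
    for address in range(2 ** n_inputs):
        inputs = [(address >> i) & 1 for i in range(n_inputs)]
        bits.append(func(*inputs) & 1)
    return LUT(n_inputs, bits)


# The same 3-input hardware "becomes" different logic purely through its configuration.
majority = program_lut(3, lambda a, b, c: (a & b) | (a & c) | (b & c))
parity = program_lut(3, lambda a, b, c: a ^ b ^ c)

print(majority.config_bits)         # truth table of the carry-out of a full adder
print(parity.evaluate([1, 0, 1]))   # sum bit of a full adder for a=1, b=0, c=1
```

With n = 3 there are 8 configuration bits, so 2^8 = 256 distinct functions can be loaded into the same block. In a real FPGA, further configuration bits select the optional flip-flop and the interconnection points, which this sketch leaves out.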
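The crossover volume mentioned on the FPGA-versus-ASIC slide follows from a simple linear cost model: total cost = NRE + (cost per part) × volume. The sketch below assumes that model. The dollar figures are hypothetical values picked from within the orders of magnitude quoted on the slide, not data from the lecture, and the function names are illustrative.

```python
def total_cost(nre, unit_cost, volume):
    """Linear cost model: one-time NRE plus per-part cost times volume."""
    return nre + unit_cost * volume


def crossover_volume(nre_asic, unit_asic, nre_fpga, unit_fpga):
    """Volume at which the ASIC and FPGA total costs are equal."""
    if unit_fpga <= unit_asic:
        raise ValueError("no crossover unless the FPGA part costs more than the ASIC die")
    return (nre_asic - nre_fpga) / (unit_fpga - unit_asic)


# Hypothetical numbers for illustration only.
NRE_ASIC, UNIT_ASIC = 10_000_000, 10   # "10's of $M" NRE, "10's of $ or less" per die
NRE_FPGA, UNIT_FPGA = 0, 110           # low NRE (taken as zero), higher cost per part

volume = crossover_volume(NRE_ASIC, UNIT_ASIC, NRE_FPGA, UNIT_FPGA)
print(f"crossover at about {volume:,.0f} units")

for v in (50_000, 200_000):
    fpga = total_cost(NRE_FPGA, UNIT_FPGA, v)
    asic = total_cost(NRE_ASIC, UNIT_ASIC, v)
    print(v, "FPGA cheaper" if fpga < asic else "ASIC cheaper")
```

With these assumed inputs the crossover lands at 100,000 units, consistent with the "100K range" on the slide. Different NRE or per-part figures move it substantially, and the model ignores the value of reconfigurability that the slide goes on to raise.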