Outline
  Slide 1
  Real power saving implies specialized hardware
  Economic relevance
  SoC Trajectory: more application specific blocks
  Making hardware design easier
  IP Reuse sounds wonderful until you try it ...
  Bluespec promotes composition through guarded interfaces
  Bluespec: A new way of expressing behavior using Guarded Atomic Actions
  Bluespec Tool flow
  Recent Applications
  Importance of Publishing Bluespec Designs
  Multi-radio OFDM workbench
  IP Reuse via parameterized modules: Example OFDM based protocols
  802.11a Architectural Exploration (Only the IFFT block is changing) [MEMOCODE 2006]
  Video Codec: H.264
  H.264 Video Decoder
  Sequential code from ffmpeg
  Parallelizing the C code: First step towards hardware generation from C
  H.264 Learnings
  H.264 Design Exploration
  Bluespec for System Modeling and Synthesis
  A typical SoC model
  Modeling Concurrency
  Modular refinement
  Other ongoing collaborative projects
  Hardware synthesis: C-based tools vs Bluespec
  Current research
  Bluespec promotes good Design methodology

Bluespec: The need for a new design methodology
Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
February 13, 2008
L04, http://csg.csail.mit.edu/6.375

Real power saving implies specialized hardware
  - H.264 implementations in software vs hardware: the power/energy savings could be 100 to 1000 fold
  - But our mind set is that hardware design is
      - Difficult, risky
      - Increased time-to-market
      - Inflexible, brittle, error prone, ...
      - How to deal with changing standards, errors
  - New design flows and tools can change this mind set

Economic relevance
  - Cell phones, PDAs, sensors, ... demand a much greater variety of chips
  - Cost of development, business risks, ... force us towards specialization primarily through software
  - New tools can enable a much greater variety of chips

SoC Trajectory: more application specific blocks
  [Figure: SoC with on-chip memory banks, structured on-chip networks, general-purpose processors, and application-specific processing units]
  - Can we rapidly produce high-quality chips and surrounding systems and software?

Making hardware design easier
  - Extreme IP ("Intellectual Property") reuse
      - Multiple instantiations of a block for different performance and application requirements
      - Packaging of IP so that the blocks can be assembled easily to build a large system (black box model)
  - Whole system simulation to enable concurrent hardware-software development
  - Need new methods and tools to accomplish this goal

IP Reuse sounds wonderful until you try it ...
  - Example: Commercially available FIFO IP block
  [Figure: FIFO block with ports data_in, push_req_n, pop_req_n, clk, rstn, data_out, full, empty]
  - These constraints are spread over many pages of the documentation...
  - No machine verification of such informal constraints is feasible
  - Bluespec can change all this

Bluespec promotes composition through guarded interfaces
  [Figure: a FIFO module (theFifo) shared by theModuleA and theModuleB. The enq and deq methods each have rdy and enab signals; first has only rdy. enq is ready when the FIFO is not full; deq and first are ready when it is not empty. Data ports are n bits wide. Enqueue arbitration control and dequeue arbitration control sit between the modules and theFifo.]
  theModuleA:
      theFifo.enq(value1);
      theFifo.deq();
      value2 = theFifo.first();
  theModuleB:
      theFifo.enq(value3);
      theFifo.deq();
      value4 = theFifo.first();
  - Self-documenting interfaces; automatic generation of logic to eliminate conflicts in use.

Bluespec: A new way of expressing behavior using Guarded Atomic Actions
  - Formalizes composition
      - Modules with guarded interfaces
      - Compiler manages connectivity (muxing and associated control)
  - Powerful static elaboration facility
      - Permits parameterization of designs at all levels
  - Transaction level modeling
      - Allows C and Verilog code to be encapsulated in Bluespec modules
  - Smaller, simpler, clearer, more correct code
  - Not just simulation, synthesis as well

Bluespec Tool flow
  [Figure: tool flow. Boxes: Bluespec SystemVerilog source; Bluespec Compiler; Verilog 95 RTL; Verilog sim; VCD output; Debussy Visualization; RTL synthesis; gates; FPGA; Power estimation tool (twice); C; Bluesim (Cycle Accurate)]
  - Works in conjunction with existing tool flows

Recent Applications
  - Multiradio OFDM: From WiFi to WiMAX
      - 802.11a and 802.16 from the same source
  - H.264 Decoder
      - Baseline profile, 720p X ~75 frames
      - FPGA implementation working
  - Other examples: Processors, Cache Coherence Protocols, IP Lookup, ...
  - Research sponsors have agreed to publish all designs done at MIT under the MIT open source license

Importance of Publishing Bluespec Designs
  - Enables whole community to undertake much more ambitious projects
      - We already see the effects in 6.375 projects
  - Enables derivative designs, specializations and variety at a fraction of the development cost

Multi-radio OFDM workbench
  [MEMOCODE 2006, MEMOCODE 2007]

IP Reuse via parameterized modules
Example OFDM based protocols
  [Figure: OFDM transceiver block diagram, with blocks marked either "standard specific" or "potential reuse".
   Transmit blocks: MAC, TX Controller, Scrambler, FEC Encoder, Interleaver, Mapper, Pilot & Guard Insertion, IFFT, CP Insertion, D/A.
   Receive blocks: A/D, Synchronizer, S/P, FFT, Channel Estimator, De-Mapper, De-Interleaver, FEC Decoder, De-Scrambler, RX Controller, MAC.
   Annotations: Different algorithms; Different throughput requirements; Reusable algorithm with different parameter settings.]
  - Transform sizes: WiFi: 64pt @ 0.25MHz; WiMAX: 256pt @ 0.03MHz; WUSB: 128pt @ 8MHz
  - Scrambler polynomials: WiFi: x^7 + x^4 + 1; WiMAX: x^15 + x^14 + 1; WUSB: x^15 + x^14 + 1
  - FEC codes: Convolutional, Reed-Solomon, Turbo
  - 85% reusable code between WiFi and WiMAX
  - From WiFi to WiMAX in 4 weeks
  (Alfred) Man Cheuk Ng, …

802.11a Architectural Exploration (Only the IFFT block is changing) [MEMOCODE 2006]

  IFFT Design               Area (mm^2)  Symbol Latency (CLKs)  Throughput Latency (CLKs/sym)  Min. Freq Required  Average Power (mW)
  Pipelined                 5.25         12                     4                              1.0 MHz             4.92
  Combinational             4.91         10                     4                              1.0 MHz             3.99
  Folded (16 Bfly-4s)       3.97         12                     4                              1.0 MHz             7.27
  Super-Folded (8 Bfly-4s)  3.69         15                     6                              1.5 MHz             10.9
  SF (4 Bfly-4s)            2.45         21                     12                             3.0 MHz             14.4
  SF (2 Bfly-4s)            1.84         33                     24                             6.0 MHz             21.1
  SF (1 Bfly-4)             1.52         57                     48                             12 MHz              34.6

  TSMC 0.18 micron; numbers reported are before place and route (DesignCompiler). Power numbers are from Sequence PowerTheater.
  These designs were done in ~3 man-days.

Video Codec:
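
The "Min. Freq Required" column of the 802.11a exploration table appears to follow directly from the throughput latency: if one OFDM symbol must be processed per symbol period, the clock must run at least (clocks per symbol) / (symbol period). The slide does not state the symbol period; the sketch below assumes the standard 802.11a value of 4 microseconds, and the names `SYMBOL_PERIOD_US`, `TABLE`, and `min_freq_mhz` are illustrative only. Under that assumption it reproduces every entry in the column.

```python
# Assumption (not stated on the slide): an 802.11a OFDM symbol lasts 4 microseconds.
SYMBOL_PERIOD_US = 4.0

# (IFFT design, throughput latency in clocks per symbol, min. frequency in MHz as listed in the table)
TABLE = [
    ("Pipelined", 4, 1.0),
    ("Combinational", 4, 1.0),
    ("Folded (16 Bfly-4s)", 4, 1.0),
    ("Super-Folded (8 Bfly-4s)", 6, 1.5),
    ("SF (4 Bfly-4s)", 12, 3.0),
    ("SF (2 Bfly-4s)", 24, 6.0),
    ("SF (1 Bfly-4)", 48, 12.0),
]


def min_freq_mhz(clks_per_symbol, symbol_period_us=SYMBOL_PERIOD_US):
    """Lowest clock frequency (MHz) that completes one symbol per symbol period.

    Clocks divided by microseconds gives MHz directly.
    """
    return clks_per_symbol / symbol_period_us


if __name__ == "__main__":
    for name, clks, listed in TABLE:
        computed = min_freq_mhz(clks)
        print(f"{name:26s} {clks:3d} CLKs/sym -> {computed:5.2f} MHz (table: {listed} MHz)")
```

Read this way, the table shows the area/frequency trade: each halving of the number of butterfly units roughly doubles the clocks needed per symbol, and therefore the minimum clock rate.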