Lecture 10b: Implementing DSP Functionality: Alternatives
System Implementation Choices
Making a Successful Comparison - 1
Making a Successful Comparison - 2
Making a Successful Comparison - 3
Making a Successful Comparison - 4
Making a Successful Comparison - 5
Viterbi Algorithm
Viterbi Decoders in digital communication systems
Convolutional Coder and Trellis diagram
ACS recursion for M = 2
Viterbi Decoder block diagram
Characteristic of a 2-bit step-at-zero quantizer
Architecture
Node parallel ACS architecture
Alternative Implementations
Butterfly trellis structure and resource sharing for the K = 3, rate 1/2 code
Survivor Memory Unit
REA hardware architecture
Decoded Sequence: 0 0 ... 0 1 0
Viterbi Project Constraints
Viterbi Decoder Implementation on an ARM
ARM Overview
Algorithm Tweaking
Reducing Memory Footprint
Simulation Results
Summary
Conclusion/Thanks
Viterbi Decoder Implementation on a TI C54x
Introduction
Viterbi Decoder Specifications
C54x Capabilities
Helpful Instructions for the Viterbi Decoder
Butterfly Implementation
TI TMS320VC5402 DSP
Dataflow
Implementation Analysis
Implementation Results
Power Calculation
Area Estimate
Development Cost
Conclusion
ACS TIE Extension with State (ACS)
Tensilica Viterbi Implementation
Tensilica Flow
Xtensa Architecture
Viterbi Architecture
TIE Setup
BMreg (ACS)
ACS TIE Extension (ACS)
Slide 50
TIE Zmask (TraceBack)
Designs
Performance
Energy Dissipation
n(s*J)/Bit
Die Area
Conclusions
Soft Core Viterbi Decoder
High Level Architecture
Branch & Path Metric Generation
ACS Architecture
Traceback Architecture
Design Flow
Synthesis and SRAM Generation
Simulation Models
BER Simulation Results
SRAM
SRAM: Power Numbers
Slide 69
SRAM: Timing Numbers
Place and Route
Wiring Statistics
Final Placement and Routing
Static Timing Checks
Static Power Checks
Delay and Energy Scaling
Performance Results
Summary NORMALIZED (100kbs)
Summary MAX PERFORMANCE

Lecture 10b: Implementing DSP Functionality: Alternatives
Prepared by: Professor Kurt Keutzer
Computer Science 252, Spring 2000
With contributions
from:
Prof. Heinrich Meyr, University of Aachen
Philip Chong, David Chinnery, Rhett Davis, Paul Husted, Niraj Shah, Chris Taylor, Scott Weber, Ning Zhang

System Implementation Choices
[Figure: implementation options for the system functionality: off-the-shelf µP/DSP, embedded core µP/DSP, application-specific µP (ASIP), and ASIC. The DSP core and the ASIP core are each drawn with a program ROM, a coefficient ROM, and control.]

Making a Successful Comparison - 1
- Find an interesting application kernel
  - Viterbi decoding for speech processing (not a full modem!)
- Find realistic constraints native to the application
  - n = 2, K = 7, QPSK, 100 kb/s, BER = 10^-4
- Find architectures/implementations that are promising for the application
  - TI TMS320C54, Tensilica Xtensa
  - What are the relevant features of this architecture that support this application?
- Fix application constraints across all implementations (above)
- Fix key parameters for implementation comparison
  - performance (constraint)
  - area
  - power

Making a Successful Comparison - 2
- Identify how key parameters will be measured
  - performance: instruction set simulator, eval board
  - area: data sheets, gate estimates
  - power: eval board, TI application note
- Implement your application kernel
  - Examine different algorithms
  - Start with code downloaded from the web (multimedia benchmarks, etc.)
  - Build your software development/evaluation environment: http://www.ti.com/sc/docs/tools/dsp/6ccsfreetool.htm

Making a Successful Comparison - 3
- Implement your application kernel (cont.)
  - Phase 0: Research
    - Find application notes, research reports for your own or comparable architectures
  - Phase 1: Estimation
    - Develop a quick estimate based on initial code
    - Integrate research findings
    - Do a quick back-of-envelope reality check
  - Phase 2: Real implementation/Tuning
    - Tailor algorithm, implementation to architecture
    - Do your very best!
    - Have a contest with your partner
  - Phase 3: Evaluation
    - Apply evaluation tools to key parameters
    - Evaluate and compare results, then return to Phase 2
- If your life depended on choosing the right part, what would you do?

Making a Successful Comparison - 4
- Final evaluation and comparison: compare all implementations
- To evaluate for a product, everything is fair game
- To evaluate principally the architectures, you need to consider:
  - fab differences: TSMC vs. IBM (10-20% faster)
  - process differences: .35 micron vs. .25 micron (50% faster)
  - power supply differences: 3.0 V vs. 1.5 V
  - ASIC vs. custom implementations (2x faster)
- Now evaluate: if I were the architect of this processor/implementor of this system on a chip, what would I do differently?
  - cache sizes
  - register availability
  - additional instructions
  - on-chip memory

Making a Successful Comparison - 5
- Just for fun …
- In addition to the primary constraints (speed, cost, power), final real-world considerations:
  - business relationships (joint partnership with Lucent)
  - time-to-market issues
  - time to configure?
  - software development environment
  - library/application software support
  - application engineering support

Viterbi Algorithm
Prof.
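Heinrich Meyr, University of Aachen

Viterbi Decoders in digital communication systems
[Figure: transmission chain. Signal Source -> Source Coder -> Convolutional or Trellis Coder & Mapper -> Modulator -> Channel -> Demodulator -> Viterbi Decoder -> Source Decoder -> Signal Sink. Signal labels: information bits, channel symbols c_k, received symbols y_k, decoded bits.]

Convolutional Coder and Trellis diagram
[Figure: three panels labeled CONVOLUTIONAL CODER, CHANNEL, and VITERBI DECODER. Coder: information bits u_k pass through two delay elements z^-1 (giving u_{k-1}, u_{k-2}); modulo-2 additions form the code symbols b, which a BPSK mapper turns into channel symbols c_k. Channel: additive white noise n_k, received symbols y_k. Decoder: decisions are stored in the survivor memory, which outputs the decoded bits. Trellis: states 0 to 3 (s_{0,k} ... s_{3,k} to s_{0,k+1} ... s_{3,k+1}) over time 0 to T, with known start state x_0 = 0 and known end state x_T = 0.]

ACS recursion for M = 2
\gamma_{i,k} = \max\{ \gamma_{Z(0,i),k-1} + \lambda^{(0,i)}_k ,\ \gamma_{Z(1,i),k-1} + \lambda^{(1,i)}_k \}
Here \gamma_{i,k} is the state metric of state i at step k, \lambda^{(0,i)}_k and \lambda^{(1,i)}_k are the branch metrics of the two branches entering state i, and Z(0,i), Z(1,i) are their predecessor states.
[Figure: the two branches entering state i at step k. In the example the branch (1,i) is the survivor path and the branch (0,i) is the competing path, recorded as decision d_{i,k} = 1.]

Viterbi Decoder block diagram
[Figure: channel symbols y_k -> TMU -> branch metrics -> ACSU -> decision bits -> SMU -> decoded bits u. The state metrics are fed back to the ACSU through a latch.]

Characteristic of a 2-bit step-at-zero quantizer
[Figure: staircase characteristic. Horizontal axis: normalized input level, ticks at -2, -1, 1, 2. Output levels Q = -2, Q = -1, Q = 0, Q = 1, with saturation at both ends. Interpretation of the four levels: -2, -1, 1, 2.]

Architecture

Node parallel ACS architecture
[Figure: the TMU supplies the branch metrics \lambda^{(0,i)}_k and \lambda^{(1,i)}_k to N ACS units (0, 1, ..., N-1). Their state metrics \gamma_{0,k}, \gamma_{1,k}, ..., \gamma_{N-1,k} are held in a register and fed back through a shuffle-exchange network. The decisions dec(i,k) go to the SMU.]

[Figure: four ACS units and four blocks labeled M, arranged as a butterfly.]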
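The TMU / ACSU / SMU split and the ACS recursion from the slides above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of any of the implementations compared in this lecture. Assumptions that are not in the slides: the generator polynomials (7, 5) octal for the four-state K = 3, rate 1/2 code, the function names `encode` and `viterbi_decode`, a correlation branch metric (so the recursion maximizes, as in the Max { , } on the ACS slide), and a full-length traceback from the known end state 0 after two tail bits.

```python
# Sketch of a K = 3, rate 1/2 Viterbi decoder (4 states, as in the trellis slide).
K = 3
N_STATES = 1 << (K - 1)
G = (0b111, 0b101)  # ASSUMED generators (7, 5) octal; not given in the slides


def parity(x):
    return bin(x).count("1") & 1


def code_symbols(u, state):
    """BPSK code symbols (+1 for bit 0, -1 for bit 1) for input u in a state."""
    reg = (u << (K - 1)) | state  # newest bit sits in the MSB of the register
    return tuple(1 - 2 * parity(reg & g) for g in G)


def encode(bits):
    """Convolutional coder + BPSK mapper, starting in the known state 0."""
    state, out = 0, []
    for u in bits:
        out.append(code_symbols(u, state))
        state = ((u << (K - 1)) | state) >> 1
    return out


def viterbi_decode(received):
    """received: list of (y0, y1) symbol pairs; returns the decoded bits."""
    neg_inf = float("-inf")
    gamma = [0.0] + [neg_inf] * (N_STATES - 1)  # state metrics, start state 0
    decisions = []                              # survivor memory (SMU)
    for y in received:
        new_gamma = [neg_inf] * N_STATES
        dec = [0] * N_STATES
        for i in range(N_STATES):               # one ACS per state i
            u = i >> (K - 2)                    # input bit that leads into i
            for d in (0, 1):                    # the two predecessors Z(d, i)
                z = ((i << 1) & (N_STATES - 1)) | d
                c = code_symbols(u, z)
                lam = sum(yi * ci for yi, ci in zip(y, c))  # TMU: branch metric
                cand = gamma[z] + lam                       # add
                if cand > new_gamma[i]:                     # compare
                    new_gamma[i], dec[i] = cand, d          # select
        gamma = new_gamma
        decisions.append(dec)
    # Traceback from the known end state 0 (assumes the tail bits were sent).
    state, bits = 0, []
    for dec in reversed(decisions):
        bits.append(state >> (K - 2))
        state = ((state << 1) & (N_STATES - 1)) | dec[state]
    return bits[::-1]


msg = [1, 0, 1, 1, 0, 0, 1, 0]
tx = encode(msg + [0] * (K - 1))      # K - 1 tail bits force end state 0
rx = list(tx)
rx[3] = (-rx[3][0], rx[3][1])         # flip one received symbol
print(viterbi_decode(rx)[:len(msg)] == msg)
```

The inner loop over `d` is the M = 2 add-compare-select step; a node-parallel architecture would evaluate all values of `i` at once, and the project code (K = 7) has 64 states instead of 4. Passing quantized soft values instead of ±1 symbols needs no change to the decoder, since the branch metric is a plain correlation.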