Berkeley COMPSCI 152 - Introduction to Architectures for Digital Signal Processing


CS 152 Computer Architecture and Engineering
Introduction to Architectures for Digital Signal Processing
Nov. 12, 1997
Bob Brodersen (http://infopad.eecs.berkeley.edu)

Processor Applications
• General Purpose - high performance
  – Pentiums, Alphas, SPARC
  – Used for general purpose software
  – Heavyweight OS - UNIX, NT
  – Workstations, PCs
• Embedded processors and processor cores
  – ARM, 486SX, Hitachi SH7000, NEC V800
  – Single program
  – Lightweight, often realtime OS
  – DSP support
  – Cellular phones, consumer electronics (e.g. CD players)
• Microcontrollers
  – Extremely cost sensitive
  – Small word size - 8 bit common
  – Highest volume processors by far
  – Automobiles, toasters, thermostats, ...
[Figure: arrows labeled "Increasing cost" and "Increasing volume"]

The Processor Design Space
[Figure: cost versus performance. Labels: Microprocessors ("Performance is everything & Software rules"); Embedded processors; Microcontrollers ("Cost is everything"); Application specific architectures for performance]

World's Cellular Subscribers
[Chart: subscribers in millions (0 to 700) by year, 1993 to 2001, split into Digital and Analog. Source: Ericsson Radio Systems, Inc.]
• Will provide a ubiquitous infrastructure for wireless data as well as voice

Multimedia I/O Architecture
[Block diagram labels: Low Power Bus, Radio Modem, Embedded Processor, Fifo, Video Decomp, Video, Audio, FB, Fifo, Graphics, Pen, Sched, ECC, Pact Interface, Data Flow, SRAM]

Embedded applications
• Future chips will be a mix of processors, memory and dedicated hardware for specific algorithms and I/O
[Block diagram labels: µP, DSP, Coms, Video Unit, custom, Memory, Uplink Radio, Downlink Radio, Graphics Out, Video I/O, Voice I/O, Pen In]
E.g. Multimedia terminal electronics

Requirements of the Embedded Processors
• Optimized for a single program - code often in on-chip ROM or off-chip EPROM
• Minimum code size (one of the motivations initially for Java)
• Performance obtained by optimizing datapath
• Low cost
  – Lowest possible area
  – Technology behind the leading edge
  – High level of integration of peripherals (reduces system cost)
• Fast time to market
  – Compatible architectures (e.g. ARM) allow reusable code
  – Customizable core
• Low power if application requires portability

Area of processor cores = Cost
[Figure labels: Nintendo processor, Cellular phones]

Another figure of merit: Computation per unit area
[Figure labels: Nintendo processor, Cellular phones, ???]

National Semiconductor - Embedded Processor Family
• Simple architecture
• 3 stage pipeline - fetch - decode - execute
• Minimum power and size
  – Short pipeline avoids branch prediction and bypass
  – Versions range from 8-64 bit - choose minimum that meets requirements

Code size
• If a majority of the chip is the program stored in ROM, then code size is a critical issue
• The Piranha has 3 instruction sizes - basic 2 byte, and 2 byte plus 16 or 32 bit immediate

Example application (single chip system)
[Figure]

The DSP Module (DSPM)
• Vector instructions directly supported
• Pipelined datapath supports single cycle: Multiply, Add, Shift, Load/Store and Pointer adjustment
• Operates in parallel to processor core
• Saturation, overflow and rounding for ALU operations
• Automatic support for cyclic buffers (modulo arithmetic)

The National DSP Module Architecture
• Single cycle MAC support is typical for DSP acceleration
• Three simultaneous addresses
• Zero overhead repeat
[Figure labels: X, Y, Z]

The 486 "Embedded Processor"
[Figure]

The "Embedded" Features of the 486 GX
• Said to be designed "for embedded battery-operated and hand-held applications" (???)
• Fully static design (clock can stop and all state is kept)
• "Auto Clock Freeze" stops circuits which are not being used in a given instruction (gated clocks)
• Stop Clock (60 W), Stop Grant - clock runs but no program execution (40-85 mW)
• Split power supply - 2.0-3.3 V core, 3.3 V I/O

Power = C V^2 f_clock
[Figure: power values with their row and column labels lost in extraction: 130 mW, 350 mW, 430 mW, 290 mW, 190 mW, 540 mW, 490 mW, 730 mW, 17 mW, 23 mW, 30 mW, 20 mW]
Note the clock rates

Characterizing programs for their energy consumption
  Process Subframe         330W
  ComputeLag               107W
  IFilterCodebook           63W
  QuantizeGains             46W
  CodebookSearch            44W
  ComputeWeightedInput      22W
  UpdateFilterState          8W
  OrthogonalizeCodebook      6W
  ThetaToCodeword            8W

  ComputeLag(...) {
      R = dotprod(res, res);
      for (lag = 0..127) {
          lp = getLT(lt);
          G = dotprod(lp, lp);
      }
  }

• Top four functions account for 90% of the power
• 65% of power dissipation in dot-vector products
(data obtained from profiling of C++ code, weighted with estimated instruction energy costs)

An architecture optimized for multiply-accumulate
Energy/Flexibility Tradeoffs
• ARM 6 core (5V, 20 MHz): .02 MIPS/mW
• ZSP DSP Superscalar (3V, 200 MHz): .3 MOPS/mW
• Reconfigurable Dot-Vector Processor (1.5V, 30 MHz): 5.9 MIPS/mW
* MOPS = millions of operations/sec = millions of MACs/sec
[Block diagram labels: AddressGen, AddressGen, Memory, Memory, MAC, MAC, Control Processor, L]
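
The relation Power = C V^2 f_clock from the slides can be made concrete with a small worked example. The sketch below is illustrative only: the effective capacitance and clock rate are assumed values, not figures from the lecture. Only the two supply voltages (2.0 V and 3.3 V) come from the 486 GX core range quoted above.

```python
def dynamic_power_watts(c_farads, v_volts, f_hz):
    """Dynamic switching power: P = C * V^2 * f_clock."""
    return c_farads * v_volts ** 2 * f_hz


# Assumed values, chosen only for illustration (not from the slides).
C_EFF = 1e-9   # effective switched capacitance, farads
F_CLK = 30e6   # clock frequency, hertz

# Core supply range quoted for the 486 GX: 2.0 V to 3.3 V.
p_high = dynamic_power_watts(C_EFF, 3.3, F_CLK)
p_low = dynamic_power_watts(C_EFF, 2.0, F_CLK)

# With C and f held fixed, power scales with the square of the voltage,
# so the ratio depends only on the two supply voltages.
ratio = p_low / p_high

print(f"P at 3.3 V: {p_high * 1e3:.1f} mW")
print(f"P at 2.0 V: {p_low * 1e3:.1f} mW")
print(f"ratio: {ratio:.3f}")
```

Because the voltage term is squared, lowering the supply is a stronger lever than lowering the clock rate, which only scales power linearly.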
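
Two DSPM features listed in the slides, saturating arithmetic and cyclic buffers addressed with modulo arithmetic, can be sketched in software as a FIR filter step built from multiply-accumulate operations. This is a minimal model under stated assumptions, not the National DSPM's actual behavior: the 32-bit accumulator width and the names `saturate32`, `mac`, `CircularBuffer`, and `fir_step` are all hypothetical.

```python
INT32_MAX = 2**31 - 1   # assumed accumulator width: 32 bits
INT32_MIN = -2**31


def saturate32(x):
    """Clamp to the signed 32-bit range instead of wrapping around."""
    return max(INT32_MIN, min(INT32_MAX, x))


def mac(acc, a, b):
    """One multiply-accumulate step with a saturating result."""
    return saturate32(acc + a * b)


class CircularBuffer:
    """Fixed-size delay line; indices wrap using modulo arithmetic."""

    def __init__(self, size):
        self.data = [0] * size
        self.head = 0  # index of the most recent sample

    def push(self, sample):
        self.head = (self.head + 1) % len(self.data)
        self.data[self.head] = sample

    def tap(self, k):
        """Return the sample pushed k steps ago (k = 0 is the newest)."""
        return self.data[(self.head - k) % len(self.data)]


def fir_step(buf, coeffs, sample):
    """Push one input sample and return one FIR output (a dot product)."""
    buf.push(sample)
    acc = 0
    for k, c in enumerate(coeffs):
        acc = mac(acc, c, buf.tap(k))
    return acc


# Feeding a unit impulse reproduces the coefficients in order.
buf = CircularBuffer(3)
outputs = [fir_step(buf, [1, 2, 3], x) for x in [1, 0, 0, 0]]
print(outputs)
```

In hardware, each loop iteration would map to a single cycle: the address generators supply the two operand addresses with modulo wraparound while the MAC unit updates the accumulator, which is why no samples ever need to be shifted through memory.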

