DSP Enabled Processor Design

THADD Group: Tom Duan, Helen Yu, Andy Lee, Danny Huang, Dawey Huang

Agenda
- Datapath Design
- Memory Subsystem
- Power Optimization
- Performance

5 Stage Pipeline
[Diagram labels: IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers; instruction cache; register file; data cache; branch logic; jump logic; ALU; MAC stage 1; MAC stage 2]

Multiply and Accumulate
- 2 stage pipeline multiplier
- No stalling when LW followed by MAC
[Diagram labels: ID/EX, EX/MEM, and MEM/WB pipeline registers; register file; multiplier 1; multiplier 2]

Critical Path
- WB stage
[Diagram labels: MEM/WB pipeline register; MAC stage 2; from data memory; mux; to register file; bus widths 16 and 32]

Memory Subsystem
- 2x clock rate of processor
- 3 controllers: sdram, instruction block, data block
- asynchronous component interface
- arbitrator

Clock Divider
[Diagram labels: CLK, ICLK, Counter, CLK2X]

Memory Subsystem Diagram
[Diagram labels: data cache block (main cache, victim cache, control); instruction cache block (cache, control); arbitrator; buffer; SDRAM block (controller, given SDRAM); signals: data, miss, address, ready]

Cache Organization

                      Instruction Cache      Data Cache
  Organization        Direct mapped          2 way set associative
  Block size          4 words                4 words
  Cache size          5 blocks (20 words)    7 blocks (28 words)
  Replacement policy  None                   Random toggle
  Write policy        None                   Write through w/ buffer
  Victim cache        -                      5 blocks (20 words)

Instruction Cache Controller
[Diagram labels: controller FSM; cache blocks (5 blocks, each 4 words); IDLE, WORD, READ, WRITE, DISABLE, CHECK; ADDRESS, CLK, DATA, HIT, MISS, SDRAM READY, SDRAM DATA, SDRAM ADDRESS, DOUT]

Data Cache
[Diagram labels: data cache, victim cache]

Power Reduction Methods
- Limiting VHDL sensitivity list
- Balance input arrival
- Enable/Disable components
- Eliminate unnecessary control signals and data buses
- Minimize execution time to lower supply voltage

Power Consumption of Components

  component     energy (uJ)
  ifidreg           10.28
  exmemreg          12.69
  shftadd           13.78
  mul2              15.12
  comparator        17.27
  alu               17.82
  idexreg           25.69
  pc reg            30.83
  write buf         35.60
  mul1              45.68
  icache            50.46
  m32x4             53.54
  fwunit            62.01
  dcache            74.43
  controller        84.14
  mux32x2           95.26
  hazard           158.62
  regfile          174.37
  sdram             45.18
  Total power     1107.98

Supply voltage: 2.5 Volts

  Memory breakdown  energy (uJ)
  instr cache         32.88
  instr control       15.55
  victim cache        12.59
  data cache          11.37
  data control        50.46
  sdram               44.56
  sdram control        0.62

Component Optimization Results 1
Supply voltage: 2.5 Volts
[Bar chart: energy consumed (uJ), axis 0 to 800, for SDRAM, instruction mem, data mem, forwarding, hazard unit, register file. Bar labels as extracted, pairing not recoverable: 699.87, 505.96, 466.20, 262.48, 196.04, 158.56, 149.35, 74.42, 45.18, 48.39, 174.37, 62.01]

Component Optimization Results 2
Supply voltage: 2.5 Volts
[Bar chart: energy consumed (uJ), axis 0 to 180, for register, ALU, adder, mux, shifter. Bar labels as extracted, pairing not recoverable: 174.47, 157.99, 116.71, 76.28, 47.14, 37.72, 5.45, 0.11]

Supply Voltage Reduction Results
[Bar chart: energy consumed (uJ) per component at 2.5V and 1.5V, axis 0 to 180; the plotted values are those in the table below]

  component     2.5V (uJ)   1.5V (uJ)   diff (%)
  mux             160.5        58.0        64
  registers        86.56       20.43       76
  multiplier       60.80       21.89       64
  instr block      48.42       17.24       64
  data block      110.03       38.82       65
  sdram            45.18       44.81        1
  fw unit          62.0        22.3        64
  controller       84.1        29.7        65
  hazard          158.6        57.1        64
  regfile         174.4       110.4        37

Supply Voltage Comparison

                        2.5V      1.5V     Diff (%)
  Total power (uJ)      1,059.0   397.7    62
  Cycle time (ns)       32.0      52.0     63
  Execution time (us)   411.0     651.7    59

Design Challenges
- what we learned: power optimization concepts
- what surprised us: component interface timing
- what challenged us: reducing cache miss

Conclusion
- A Very Rewarding Project
- Excellent Performance
- Can Sleep Again
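
A note on the Supply Voltage Reduction Results: most of the per-component reductions cluster around 64%, which is what one would expect if dynamic switching energy scales with the square of the supply voltage. The slides do not say which power model the simulation tools used, so the quadratic model below is an assumption, and the function names are hypothetical. The sketch only re-checks the slide figures against that model.

```python
# Assumption (not stated in the slides): dynamic switching energy scales
# with the square of the supply voltage, E ~ C * V^2.

V_HIGH = 2.5  # volts, from the slides
V_LOW = 1.5   # volts, from the slides

# Energy per component in uJ, copied from the slide table: (at 2.5V, at 1.5V).
ENERGY_UJ = {
    "mux": (160.5, 58.0),
    "registers": (86.56, 20.43),
    "multiplier": (60.80, 21.89),
    "instr block": (48.42, 17.24),
    "data block": (110.03, 38.82),
    "sdram": (45.18, 44.81),
    "fw unit": (62.0, 22.3),
    "controller": (84.1, 29.7),
    "hazard": (158.6, 57.1),
    "regfile": (174.4, 110.4),
}


def predicted_reduction_pct(v_high, v_low):
    """Percent energy reduction predicted by the assumed V^2 model."""
    return (1.0 - (v_low / v_high) ** 2) * 100.0


def measured_reduction_pct(e_high, e_low):
    """Percent energy reduction computed from two reported energy values."""
    return (1.0 - e_low / e_high) * 100.0


predicted = predicted_reduction_pct(V_HIGH, V_LOW)
print(f"V^2 model prediction: {predicted:.1f}%")
for name, (e_high, e_low) in ENERGY_UJ.items():
    measured = measured_reduction_pct(e_high, e_low)
    print(f"{name:12s} reported: {measured:5.1f}%  gap: {measured - predicted:+6.1f}")
```

Under this assumption the predicted reduction is 1 - (1.5/2.5)^2 = 64%. The registers, regfile, and sdram rows deviate from it noticeably; the slides do not explain why.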