AUBURN ELEC 7770 - Reducing Power through Multicore Parallelism



Slides in the deck:
1. ELEC 7770 Advanced VLSI Design Spring 2007: Reducing Power through Multicore Parallelism
2. Power Dissipation in CMOS Logic (0.25µ)
3. Low-Power Datapath Architecture
4. A Reference Datapath
5. A Parallel Architecture
6. Level Converter: L to H
7. Level Converter: H to L
8. Control Signals, N = 4
9. Power
10. Voltage vs. Speed
11. Increasing Multiprocessing
12. Extreme Cases: Vt = 0
13. Example: Multiplier Core
14. A Multicore Design
15. How Many Cores?
16. Design Tradeoffs
17. Power Reduction in Processors
18. Parallel Architecture
19. Pipeline Architecture
20. Approximate Trend
21. Multicore Processors
22. Slide 22
23. Cell - Cell Broadband Engine Architecture
24. Cell's Nine-Processor Chip

Slide 1
ELEC 7770 Advanced VLSI Design, Spring 2007
Reducing Power through Multicore Parallelism
Vishwani D. Agrawal
James J. Danaher Professor
ECE Department, Auburn University
Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr07
Slide footer: Spring 07, Feb 20. ELEC 7770: Advanced VLSI Design (Agrawal)

Slide 2: Power Dissipation in CMOS Logic (0.25µ)
$$P_{total}(0 \rightarrow 1) = C_L V_{DD}^2 + t_{sc} V_{DD} I_{peak} + V_{DD} I_{leakage}$$
Share of total power, in the order of the three terms: 75%, 5%, 20%.
[Figure: CMOS gate with supply $V_{DD}$ driving load capacitance $C_L$]

Slide 3: Low-Power Datapath Architecture
Lower supply voltage.
This slows down circuit speed.
Use parallel computing to gain the speed back.
Works well when threshold voltage is also lowered.
About 60% reduction in power obtainable.
Reference: A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic Publishers (Now Springer), 1995.

Slide 4: A Reference Datapath
[Figure: Input, Register, Combinational logic, Register, Output; both registers clocked by CK; total switched capacitance labeled $C_{ref}$]
Supply voltage = $V_{ref}$
Total capacitance switched per cycle = $C_{ref}$
Clock frequency = $f$
Power consumption: $P_{ref} = C_{ref} V_{ref}^2 f$

Slide 5: A Parallel Architecture
[Figure: the input at rate $f$ feeds N input registers, each clocked at $f/N$ and followed by one copy of the combinational logic (Copy 1, Copy 2, ..., Copy N); an N-to-1 multiplexer selects the results into an output register clocked at $f$; a multiphase clock generator driven by CK provides the register clocks and the mux control]
Each copy processes every Nth input and operates at reduced voltage.
Supply voltage: $V_N \le V_1 = V_{ref}$
N = degree of parallelism

Slide 6: Level Converter: L to H
[Figure: level-converter circuit with input $V_{in\_L}$, output $V_{out\_H}$, and supplies $V_{DDH}$ and $V_{DDL}$]
Transistors with thicker oxide and longer channels.
N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section 12.4.3, Addison-Wesley, 2005.

Slide 7: Level Converter: H to L
[Figure: level-converter circuit with input $V_{in\_H}$, output $V_{out\_L}$, and supply $V_{DDL}$]
Transistors with thicker oxide and longer channels.
N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section 12.4.3, Addison-Wesley, 2005.

Slide 8: Control Signals, N = 4
[Figure: timing waveforms for CK, Phase 1, Phase 2, Phase 3, Phase 4]

Slide 9: Power
$$P_N = P_{proc} + P_{overhead}$$
$$P_{proc} = N (C_{inreg} + C_{comb}) V_N^2 \frac{f}{N} + C_{outreg} V_N^2 f = (C_{inreg} + C_{comb} + C_{outreg}) V_N^2 f = C_{ref} V_N^2 f$$
$$P_{overhead} = C_{overhead} V_N^2 f \approx \delta C_{ref} (N - 1) V_N^2 f$$
$$P_N = [1 + \delta (N - 1)] C_{ref} V_N^2 f$$
$$\frac{P_N}{P_1} = [1 + \delta (N - 1)] \frac{V_N^2}{V_{ref}^2}$$

Slide 10: Voltage vs. Speed
Delay of a gate:
$$T \approx \frac{C_L V_{ref}}{I} = \frac{C_L V_{ref}}{k (W/L) (V_{ref} - V_t)^2}$$
where
$I$ is the saturation current,
$k$ is a technology parameter,
$W/L$ is the width-to-length ratio of the transistor,
$V_t$ is the threshold voltage.
[Figure: normalized gate delay $T$ (axis 0.0 to 4.0) versus supply voltage for 1.2µ CMOS; marked points are N = 1 at $V_{ref}$ = 5 V, N = 2 at $V_2$ = 2.9 V, and N = 3 at $V_3$; $V_t$ is marked on the voltage axis]
Voltage reduction slows down as we get closer to $V_t$.

Slide 11: Increasing Multiprocessing
[Figure: $P_N/P_1$ (axis 0.0 to 1.0) versus N (1 to 12) for 1.2µ CMOS with $V_{ref}$ = 5 V; curves for $V_t$ = 0 V (extreme case), $V_t$ = 0.4 V, and $V_t$ = 0.8 V]

Slide 12: Extreme Cases: Vt = 0
Delay, $T \propto 1/V_{ref}$
For N processing elements, delay = $NT$, so $V_N = V_{ref}/N$.
$$\frac{P_N}{P_1} = [1 + \delta (N - 1)] \frac{1}{N^2} \rightarrow \frac{1}{N}$$
For negligible overhead, $\delta \rightarrow 0$:
$$\frac{P_N}{P_1} \approx \frac{1}{N^2}$$
For $V_t > 0$, power reduction is less and there will be an optimum value of N.

Slide 13: Example: Multiplier Core
Specification:
200 MHz clock
15 W dissipation @ 5 V
Low voltage operation, $V_{DD} \ge 1.5$ volts
$$\text{Relative clock rate} = \frac{(V_{DD} - 0.5)^2}{20.25}$$
Problem:
Integrate multiplier core on a SOC.
Power budget for multiplier ~ 5 W.

Slide 14: A Multicore Design
[Figure: block diagram; only the label "Multiplier Core" survives in this text]

