MIT 6 893 - Issue Logic and Power/Performance Tradeoffs

Issue Logic and Power/Performance Tradeoffs
Edwin Olson
Andrew Menard
December 5, 2000

The need for low-power architectures
- Low performance: PIMs
- High performance: video decoding/MP3 playback
- And increasingly, both.
  – How do you design an architecture that can do both?

A couple alternatives
- High performance processor that can be lobotomized
  – Modify issue logic
  – Change structure sizes
- Two separate cores
  – A high performance/high-power core
  – A low performance/low-power core

Other power throttling mechanisms
- Voltage scaling
  – Huge power savings
  – There's a limit, and high performance designs are already pushing towards low voltage, which doesn't leave much room for throttling.
- Burn & Coast
  – Compute at full speed, and then go into a sleep mode.
  – Simple linear power/performance throttling.

Methodology
- SimpleScalar/Wattch
  – Widely used but little/no verification. Several power models available, but very large margins of error.
  – Still, the size of structures is correlated to power consumption.
- Industry survey
  – Look at real-world processors with the range of characteristics of interest.
- SpecInt95
  – Substantially reduced input sets to make simulation feasible.

Issue Window Scaling
- Popular idea: it's a highly active chip structure.
- Window responsible for 20% of non-clock power (Alpha 21264 & Wattch agree)
- Does it work?
  – Let's look at RUU usage
- What's an upper bound on the useful size?
- How do smaller sizes impact performance and power?

RUU size upper bounds
- Modified SimpleScalar, let RUU be arbitrarily big.
[Figure: 4-issue. Fraction of cycles vs. RUU occupancy (0 to 64); series: li, perl, compress, m88ksim]
[Figure: 8-issue. Fraction of cycles vs. RUU occupancy (0 to 64); series: li, perl, compress, m88ksim]

Effect of bounded RUU size
- The RUU's occupancy "saturates" as one would expect.
[Figure: RUU Usage - li. Cycles vs. RUU occupancy (0 to 32); series: 16 Entry RUU, Unlimited RUU]

Effect of Bounded RUU Size
[Figure: m88ksim on 4-issue. Fraction of cycles vs. RUU size; series: 4, 8, 16, 32, 64]
[Figure: m88ksim on 8-issue. Fraction of cycles vs. RUU size; series: 8, 16, 32, 64]

Bounded RUU Impact on Performance
- Performance rapidly approaches maximum.
- 8-issue needs a slightly larger RUU, as expected.
[Figure: IPC vs RUU size for 4-issue. IPC vs. RUU capacity (0 to 64); series: li, perl, compress, m88ksim]
[Figure: IPC vs RUU size for 8-issue. IPC vs. RUU capacity (0 to 64); series: li, perl, compress, m88ksim]

Bounded RUU impact on Power
- Power consumption in the RUU increases as its size increases.
[Figure: Power Consumption Breakdowns for 4 issue on li. Power (W) per configuration (4x4, 4x8, 4x16, 4x32, 4x64); components: clock, resultbus, alu, dcache2, dcache, icache, regfile, lsq, window, bpred, rename]

Power/Performance
- There's a minimum! And it's pretty much where maximum performance is. Hmmm.

Structure                 8x8    8x16   8x32   8x64
Energy/inst (li)          13.8   12.5   13.4   14.9
Energy/inst (perl)        15.1   14.7   15.8   17.6
Energy/inst (compress)    12.4   11.4   11.9   13.3
Energy/inst (m88ksim)     13.0   12.1   12.9   14.4

Analysis
- Some groups have advocated a variable 16-32 capacity RUU.
- Even if scaling is perfect, there's little to be gained.
- A power-conscious architect is likely to be cornered into just one reasonable RUU size.

Adding a separate core
- If we can't lobotomize, perhaps we can add a completely separate CPU.
- Sounds like a good idea
  – Intuition: a simple in-order processor should have lower energy/instruction than a complex out-of-order one.
  – Small area overhead, around 1 mm^2.
- Opportunity for more energy savings
  – Smaller register file
  – No issue window
  – Separate low-power caches (though this increases area)

Methodology
- SimpleScalar/Wattch is all but useless
  – Only one parameterizable power model (Wattch) is available, and we don't know what trade-offs the designer made.
  – Wattch doesn't support sim-inorder
  – E.g., Cacti cache model uses 10x greater energy than Krste.

Industry Survey
- PowerPC Statistics
- PPC440 is 2-issue, out-of-order
- PPC405 is single-issue, in-order
- Both use the same technology
- The 440 is twice as fast, but uses only 1.66 times the power!

AM5x86 vs. K6
- 5x86 is in-order
- K6 is out-of-order, 6-issue, 24-entry window
- K6 has slightly better power/performance
  – But it's on a newer process (0.25um rather than 0.35um)

Crusoe's Voltage Scaling & Coast and Burn

Big Proviso
- CPUs available today, even the "low power" ones, are still after speed.
  – Low power IA32 is just a slower, high-power IA32.
- If you designed your simple core for super-low power (with very little regard for speed), how might this change?

Conclusion
- Smaller issue windows are not a win on power; they lower the amount of ILP found by too much.
- Multiple cores are not a win on power; the faster core tends to be more energy
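
The two throttling mechanisms on the "Other power throttling mechanisms" slide can be compared with a rough Python sketch. The function names and parameters are illustrative, not from the slides. The voltage-scaling model assumes the common first-order approximation that dynamic power scales as V^2 * f and that frequency tracks voltage; it ignores leakage and the voltage floor the slide mentions. The burn-and-coast model assumes sleep power is zero unless stated.

```python
def burn_and_coast(duty_cycle, active_power, sleep_power=0.0):
    """Average power and relative performance when the core runs at full
    speed for a fraction `duty_cycle` of the time and sleeps otherwise.
    Both quantities are linear in the duty cycle."""
    avg_power = duty_cycle * active_power + (1.0 - duty_cycle) * sleep_power
    return avg_power, duty_cycle


def voltage_scaling(scale, active_power):
    """Power and relative performance when voltage and frequency are both
    scaled by `scale` (assumption: P ~ V^2 * f, with f proportional to V)."""
    return active_power * scale ** 3, scale


# Hypothetical 10 W core throttled to half performance under each scheme.
print(burn_and_coast(0.5, 10.0))
print(voltage_scaling(0.5, 10.0))
```

Under these assumptions, halving performance halves power with burn and coast but cuts it far more with voltage scaling, which is why the slide calls the savings huge while noting that the available voltage range limits how far it can go.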
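
The claim on the Power/Performance and Analysis slides can be checked directly against the energy-per-instruction table. The Python sketch below copies the 8-issue figures from that table; the data layout and function names are illustrative choices, and the units are left unstated because the slide does not give them.

```python
# Energy per instruction for the 8-issue configurations, copied from the
# Power/Performance slide. Columns correspond to RUU sizes 8, 16, 32, 64.
RUU_SIZES = [8, 16, 32, 64]
ENERGY_PER_INST = {
    "li":       [13.8, 12.5, 13.4, 14.9],
    "perl":     [15.1, 14.7, 15.8, 17.6],
    "compress": [12.4, 11.4, 11.9, 13.3],
    "m88ksim":  [13.0, 12.1, 12.9, 14.4],
}


def best_ruu_size(energies, sizes=RUU_SIZES):
    """Return the RUU size with the lowest energy per instruction."""
    index = min(range(len(energies)), key=lambda i: energies[i])
    return sizes[index]


def gain_16_vs_32(energies, sizes=RUU_SIZES):
    """Fraction of energy saved by running with 16 entries instead of 32."""
    e16 = energies[sizes.index(16)]
    e32 = energies[sizes.index(32)]
    return (e32 - e16) / e32


for name, energies in ENERGY_PER_INST.items():
    print(name, best_ruu_size(energies), round(gain_16_vs_32(energies), 3))
```

Every benchmark reaches its minimum at the same configuration, and the gap between the 16-entry and 32-entry columns is only a few percent, which is consistent with the slide's point that a variable 16-32 RUU has little to offer.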
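
The PowerPC comparison on the Industry Survey slide implies an energy figure as well as a power figure. For a fixed workload, energy is power multiplied by run time, and run time shrinks in proportion to the speedup. The Python sketch below works that arithmetic through; the function name is illustrative, and it assumes "twice as fast" means half the run time on the same work.

```python
def relative_energy_per_work(speedup, power_ratio):
    """Energy for a fixed workload on core A relative to core B.

    Energy = power * time, and time for fixed work scales as 1 / speedup,
    so the energy ratio is power_ratio / speedup.
    """
    return power_ratio / speedup


# PPC440 vs. PPC405, using the figures from the slide:
# twice as fast, 1.66 times the power.
ratio = relative_energy_per_work(speedup=2.0, power_ratio=1.66)
print(ratio)
```

A ratio below 1 means the faster out-of-order core spends less energy per unit of work than the simple in-order one, which is the observation behind the deck's skepticism about adding a separate low-power core.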

