Execution Slack and Criticality
Nikos Hardavellas
The Computer Architecture Lab at Carnegie Mellon
http://www.ece.cmu.edu/CALCM

Execution Slack & Criticality: Context
- Out-of-order processors are highly parallel
    - overlap computation
- Technological constraints
    - wire delay, power, complexity
- Non-uniform designs
    - clustering, multi-frequency FUs, memory hierarchies
- Criticality & slack
    - what is it?

Execution Slack & Criticality: Why do I care?
- Exploit to guide control policies
- Examples here:
    1. Resource Arbitration
    2. Misspeculation Reduction
    3. Power Management
    4. Cache Management

Papers
1. Focusing processor policies via critical-path prediction
   Framework to identify instruction criticality and design to increase instruction scheduling locality in a cluster
2. Slack: maximizing performance under technological constraints
   Framework to identify variability in instruction execution criticality and application in slowing down execution to save power
3. Locality vs. Criticality
   Is sacrificing locality for criticality a good idea?
4. Non-vital loads
   Identify instructions that can tolerate longer cache hit latencies and remove them from higher-level caches

Evaluation/Outline
- Overview of each technique
- Present results
- Present pros & cons

Paper 1: Context
- Out-of-order processors are highly parallel
- Critical-path analysis identifies dominating instructions
    - Micro-architectural critical path
- Focus on optimizing critical instructions
    - Better arbitration of scarce resources
    - Reduce misspeculation

Paper 1: Contributions
- Dependence-graph model of micro-architectural critical path
- Hardware predictor to approximate instruction criticality
- Resource arbitration
- Misspeculation reduction

Paper 1: Critical Path Model

Paper 1: Hardware

Paper 1: Model Evaluation

Paper 1: Instruction Scheduling and Steering

Paper 1: Value Prediction

Paper 1: Critique
Pros
- Multipurpose framework
- Adaptive
- Practical algorithmic approach
- Yields interesting results
Cons
- Wrong-path instructions, criticality at memory system?
- Coarse
- Classification imbalance
- Practical hardware realization
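
To make the idea of a critical path through a dependence graph concrete, here is a minimal sketch. It does not reproduce Paper 1's micro-architectural model; it assumes a plain data-dependence DAG with a fixed latency per instruction. The function name `analyze`, the instruction names, and the latencies are all hypothetical. Each instruction's slack is taken to be the number of cycles its result can be delayed without lengthening total execution, and instructions with zero slack are treated as critical.

```python
from collections import defaultdict

def analyze(latency, deps):
    """Return (total_cycles, slack_per_instruction, critical_instructions).

    latency: {instr: cycles}, listed in program order (assumed topological).
    deps:    {instr: [producer instrs]} (data dependences only; an assumption).
    """
    order = list(latency)
    consumers = defaultdict(list)
    for instr in order:
        for producer in deps.get(instr, []):
            consumers[producer].append(instr)

    # Forward pass: earliest cycle at which each instruction can finish.
    finish = {}
    for instr in order:
        start = max((finish[p] for p in deps.get(instr, [])), default=0)
        finish[instr] = start + latency[instr]
    total = max(finish.values())

    # Backward pass: latest finish time that does not extend total execution.
    latest = {}
    for instr in reversed(order):
        latest[instr] = min(
            (latest[c] - latency[c] for c in consumers[instr]), default=total
        )

    slack = {instr: latest[instr] - finish[instr] for instr in order}
    critical = [instr for instr in order if slack[instr] == 0]
    return total, slack, critical

# Hypothetical five-instruction example.
latency = {"ld": 3, "add": 1, "mul": 4, "sub": 1, "st": 1}
deps = {"add": ["ld"], "mul": ["ld"], "sub": ["add"], "st": ["mul", "sub"]}
total, slack, critical = analyze(latency, deps)
print(total, slack, critical)
```

In this toy graph the `ld`, `mul`, `st` chain determines execution time, while `add` and `sub` could each be delayed without hurting performance, which is the kind of distinction a criticality predictor is meant to approximate in hardware.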

Paper 2: Context
- Technological constraints
    - wire delay, power, complexity
- Non-uniform designs
    - clustering, multi-frequency FUs
- Control policies
- Slack

Paper 2: Contributions
- Model and characterize slack
    - local, global, apportioned
- Slack prediction
    - explicit, implicit
- Application in power-saving hardware
    - fast/slow pipeline

Paper 2: Computing Slack

Paper 2: Slack Potential

Paper 2: Microarchitecture

Paper 2: Performance Impact

Paper 2: Critique
Pros
- Easy to reason about
- Adaptive
- Fine-grained
Cons
- Sampling constraints
- Criticality at memory system?
- Weakly defined apportioned slack

Paper 3: Context
- Caches exploit spatial and temporal locality
- Locality-based schemes ignore the nature of misses
- Is criticality a strong enough property to warrant a change in memory hierarchies?
- Classify loads as critical
    - feed mispredicted branch, feed into missed load, few independent instructions following

Paper 3: Contributions
- Provides an answer to the question posed
    - Potential is there
    - Implementations fail

Paper 3: L1 Criticality Potential

Paper 3: Exploiting L1 Criticality

Paper 3: Exploiting L2 Criticality

Paper 3: Criticality Guided Prefetching

Paper 3: Critique
Pros
- Raises important question
- Provides a limit study: the potential is there
Cons
- Is classification really complete?
- Coverage? Accuracy?
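
The load-classification rule from the Paper 3 context slide can be sketched as a simple predicate. This is only an illustration under stated assumptions: the slides list three criteria without saying how they combine, so the sketch assumes any one of them is enough. The `LoadInfo` record, its field names, and the `min_independent` threshold are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass
class LoadInfo:
    """Hypothetical per-load observations gathered by a profiler or simulator."""
    feeds_mispredicted_branch: bool  # load result feeds a branch that mispredicts
    feeds_missed_load: bool          # load result feeds a later load that misses
    independent_followers: int       # independent instructions following the load

def is_critical(load: LoadInfo, min_independent: int = 4) -> bool:
    """Classify a load as critical if any of the three slide criteria holds.

    min_independent is an assumed cutoff for "few independent instructions".
    """
    return (
        load.feeds_mispredicted_branch
        or load.feeds_missed_load
        or load.independent_followers < min_independent
    )

# Hypothetical usage: a load with plenty of independent work after it
# and no dependent branch or missing load is treated as non-critical.
print(is_critical(LoadInfo(False, False, 10)))
print(is_critical(LoadInfo(True, False, 10)))
```

The critique's questions about completeness, coverage, and accuracy apply directly to a predicate like this one: each field has to be observed or predicted at run time, and the threshold is a tuning choice.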