Locality Phase PredictionXipeng ShenYutao Zhong Chen DingDepartment of Computer ScienceUniversity of Rochester1A Lesson from NatureSpringSummerFallWinter. . . SpringSummerFallWinter People adapt, for example, by changing thewardrobe before winter Need to predict when winter comes2Motivation Memory performance largely determines cost and energy Adaptation is increasingly used dynamic data and computation reordering dynamic cache resizing dynamic memory remapping Need to predict locality phases manual phase marking in past3Locality Phase A period of a program execution that hasstable or slowly changing data localityinside but disruptive transition periodsbetween phases [Batson & Madison, SIGMetrics, 1976] We want phases with repetitive behavior not necessarily uniform behavior4Locality Phase Examples Unstructured mesh the computation sweeps through the mesh structure ateach time step e.g. the aging of an airplane input-dependent memory behavior repeat or slow-changing across time steps Other scientific & commercial applications structural, mechanical, and molecular simulations5Outline Locality-based phase detection to find repetitive memory behavior Program phase marking to mark phases for all inputs Run-time prediction to predict machine-dependent behavior Evaluation Summary6Locality Metric: Reuse Distance Defined on each element of a memory-access trace It is the number of distinct elementsaccessed between this and the previousaccess to the same elementTime: 1 2 3 4 5 6 7 8 9 10 . . .Accesses: b a b b c c b c a c . . .7Locality Metric: Reuse Distance Defined on each element of a memory-access trace It is the number of distinct elementsaccessed between this and the previousaccess to the same elementTime: 1 2 3 4 5 6 7 8 9 10 . . .Accesses: b a b b c c b c a c . . .7Locality Metric: Reuse Distance Defined on each element of a memory-access trace It is the number of distinct elementsaccessed between this and the previousaccess to the same elementTime: 1 2 3 4 5 6 7 8 9 10 . . .Accesses: b a b b c c b c a c . . .8Reuse Distance as a Signal Defined on each element of a memory-access trace It is the number of distinct elementsaccessed between this and the previousaccess to the same elementTime: 1 2 3 4 5 6 7 8 9 10 . . .Accesses: b a b b c c b c a c . . .Reuse distance: ∞ ∞ 1 0 ∞ 0 1 1 2 1 . . .9Reuse Distance SignalReuse distance Time One execution of Tomcatv from Spec959Reuse Distance SignalReuse distance Time One execution of Tomcatv from Spec9510Basic Problem & Solution A search problem: to find a handful of phasechanging points from billions of points Solution: to apply signal processingtechniques to find disruptive behavior11Variable-length Sampling Sample only long reuse distances ofrepresentative memory locations for example, 30 thousand samples out ofbillions of memory accesses12Wavelet Analysis The wavelet transformgives temporal-frequency information. FFT, in comparison, gives onlyfrequency information12Wavelet Analysis The wavelet transformgives temporal-frequency information. FFT, in comparison, gives onlyfrequency information13Locality Phase DetectionProgram binary13Locality Phase Detectionvariable-lengthsamplingProgram binary13Locality Phase Detectionvariable-lengthsamplingProgram binaryEach sampledvariable yieldsa signal13Locality Phase Detectionvariable-lengthsamplingwaveletProgram binaryEach sampledvariable yieldsa signal13Locality Phase Detectionvariable-lengthsamplingwaveletProgram binaryEach sampledvariable yieldsa signalFinddisruptions oneach signal aspossible phaseboundaries13Locality Phase Detectionvariable-lengthsamplingwaveletcombinetimeProgram binaryEach sampledvariable yieldsa signalFinddisruptions oneach signal aspossible phaseboundaries13Locality Phase Detectionvariable-lengthsamplingwaveletcombinetimeProgram binaryEach sampledvariable yieldsa signalFinddisruptions oneach signal aspossible phaseboundariesCandidate phaseboundaries13Locality Phase Detectionvariable-lengthsamplingwaveletcombinetimetimeoptimalphase partition(described in paper)Program binaryEach sampledvariable yieldsa signalFinddisruptions oneach signal aspossible phaseboundariesCandidate phaseboundaries13Locality Phase Detectionvariable-lengthsamplingwaveletcombinetimetimeoptimalphase partition(described in paper)Program binaryEach sampledvariable yieldsa signalFinddisruptions oneach signal aspossible phaseboundariesCandidate phaseboundariesActual phaseboundaries14Phase HierarchyReuse distance Time One execution of Tomcatv from Spec9515Phase HierarchyReuse distance Time One execution of Tomcatv from Spec9516Cache Miss Rate hardware independent defined on each point,no need for windows a distance at eachaccess hardware dependent interval-based,defined on windows hit or miss at eachaccessVs.Reuse Distance17Outline Locality-based phase detection find repetitive memory behavior Program phase marking mark phases for all inputs Run-time prediction predict machine-dependent behavior Evaluation Summary18Phase Marking Objective: to find a basic block that uniquelymarks each phase boundary.Basic block tracePhases from locality analysisProgramwithphasemarkersFrequency-based filtering19Run-Time Prediction When an instrumented program runs, wemeasure the behavior of the first fewinstances of a phase and use them asprediction for the later instances We assume instances of a phase behavethe same in one execution20Outline Locality-based phase detection Program phase marking Run-time prediction Evaluation Phase behavior consistency Prediction accuracy Example uses Summary21Cache Miss Rates of Tomcatv (5250 instances of 7 phases)21Cache Miss Rates of Tomcatv (5250 instances of 7 phases)750 instancesof Phase 2,whose lengthare about 2.5Minstructions22TOMCATV Miss Rate Distribution (2493 Intervals)22TOMCATV Miss Rate Distribution (2493 Intervals)2.97%executioninside this BBVphase boundingbox23Prediction Accuracy Example benchmark: Mesh dynamic mesh structure
or
We will never post anything without your permission.
Don't have an account? Sign up