UCSD CSE 231 - Pin: Building Customized Program Analysis Tools

Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation
C.K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace, V.J. Reddi, K. Hazelwood
Presented by: Michael Laurenzano
San Diego Supercomputer Center, Performance Modeling and Characterization Lab (PMaC)

What is Program Instrumentation?
• Inserting extra code into an application to observe its behavior
  – Example: Cache Simulation

    for (int i = 0; i < LENGTH; i++) {
        CacheSim(&A[i]); A[i] = (double)i;
        CacheSim(&B[i]); B[i] = (double)i;
        CacheSim(&C[i]); C[i] = (double)i;
    }

Uses of Program Instrumentation
• Code Profiles
  – Basic block/Instruction count
  – Operation results
• Microarchitectural study
  – Branch outcomes
  – Memory addresses
• Bug checking
  – Memory leaks
  – Uninitialized data

Pin System Layout
[Figure: Pin system layout, with the following callouts]
• The code being analyzed
• Tells us where and how to perform analysis
• Combines application and pintool code to create instrumented code
• Stores the instrumented code created by the JIT
• Controls execution, maintains data structures, tracks program state

Simplified Instrumentation
• Transfer control to VM at an application control transfer
• Look for instrumented version of branch target in code cache
  – If found: execute instrumented code
  – If not: compile the code, insert into code cache, execute new code
• Repeat

Trace Linking
• Transfer control directly between traces
  – Branch target must be known statically
  – Target trace must be present in code cache
[Figure: Sequence 1 and Sequence 2 (Trace 1, Trace 2) shown under Regular Execution, Pin w/o Trace Linking (via the Virtual Machine), and Pin w/ Trace Linking]

Trace Linking (Indirect)
• "Unknown" targets are usually somewhat predictable
  – Function typically returns to a few locations (few call sites)
  – Indirect jump usually goes to a few locations
• Try several predicted targets to see if we can avoid VM intervention
  – Short target lists are maintained for each indirect branch
  – If we exhaust this list, use the VM

Function Cloning
• Most common indirect control transfer is a function return
• Create a function instance for each call site
  – Return address is then unique and known for each function instance
  – Turns this indirect control transfer into a direct control transfer
  – Code bloat!
• Implemented by keeping a call stack for each instrumented instruction sequence
  – Keep last 4 in call stack
  – Call stack represented as a 64-bit integer

Register Bindings
• Register re-allocation occurs so that Pin can use registers
  – The register bindings can be different from one trace to the next
• When compiling, keep register bindings from the previous trace if possible
• When linking traces, modify the register bindings before going to the next trace
  – Usually only a few registers are mismatched in practice

Optimization – Inlined Analysis Routines
[Figure: Without Inlining (Application, Bridge Routine, Analysis Routine) versus With Inlining (Application, Bridge Code, Analysis Code, Bridge Code, Application)]
• 2 fewer calls and 2 fewer returns
• Other optimizations: constant folding, code relocation

Optimization – eflags Register Liveness
• The x86 eflags register is treated as a bit-vector containing state information
  – This register can be modified as a side-effect of some instructions
• eflags might not be live when we reach analysis routine
  – If this is the case, we do not need to save/restore it

Optimization – Call Scheduling
• User can specify that the routine be put anywhere in the particular scope
  – Anywhere in instruction, basic block, function, program, etc.
• Pin can schedule the call according to best performance
  – Perhaps at a point where few registers need to be saved
  – How well will this actually work?

Basic Pin Overhead
[Figure]

Effectiveness of Optimizations
[Figure]
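
The loop on the Simplified Instrumentation slide (look up the branch target in the code cache, compile on a miss, execute, repeat) can be sketched as a toy dispatcher. This is only an illustration, not Pin's implementation: `ToyVM`, `Trace`, the `jit` callback, and the convention that a target of 0 means "program exit" are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>

using Addr = std::uint64_t;

// Hypothetical stand-in for a compiled, instrumented trace: running it
// returns the next branch target. A target of 0 means "program exit"
// (a convention of this sketch only).
using Trace = std::function<Addr()>;

struct ToyVM {
    std::unordered_map<Addr, Trace> code_cache;  // branch target -> instrumented trace
    std::function<Trace(Addr)> jit;              // hypothetical JIT compiler callback
    int compiles = 0;                            // number of JIT invocations
    int dispatches = 0;                          // number of times the VM regained control

    void run(Addr entry) {
        assert(jit != nullptr && "a JIT callback must be installed before run()");
        Addr pc = entry;
        while (pc != 0) {
            ++dispatches;                        // control transfers to the VM
            auto it = code_cache.find(pc);       // look for an instrumented version
            if (it == code_cache.end()) {        // not found: compile and insert
                ++compiles;
                it = code_cache.emplace(pc, jit(pc)).first;
            }
            pc = it->second();                   // execute, obtain the next target
        }
    }
};
```

In this sketch every control transfer goes back through `run`, which is the cost that trace linking is meant to remove; a loop body executed many times is compiled once but dispatched on every iteration.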
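
The Trace Linking (Indirect) slide describes keeping a short list of predicted targets per indirect branch and falling back to the VM when the list is exhausted. A minimal sketch of that idea follows. The type `IndirectBranchSite`, the list length of 4, and the policy of appending a missed target while there is room are assumptions for illustration; the slides do not specify them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Addr = std::uint64_t;

// One record per indirect branch (hypothetical layout).
struct IndirectBranchSite {
    static constexpr std::size_t kMaxTargets = 4;  // assumed list length
    std::array<Addr, kMaxTargets> predicted{};     // short list of predicted targets
    std::size_t count = 0;                         // entries currently in use
    std::uint64_t linked_hits = 0;                 // resolved without the VM
    std::uint64_t vm_fallbacks = 0;                // list exhausted, VM consulted

    // Returns true when the actual target matches a prediction, meaning
    // control could pass straight to the linked trace.
    bool resolve(Addr target) {
        assert(count <= kMaxTargets);
        for (std::size_t i = 0; i < count; ++i) {
            if (predicted[i] == target) {
                ++linked_hits;
                return true;
            }
        }
        ++vm_fallbacks;                            // no prediction matched
        if (count < kMaxTargets) {
            predicted[count++] = target;           // remember it for next time
        }
        return false;
    }
};
```

A function return reached from two call sites would miss once per site and hit afterwards, which matches the slide's observation that such targets are "usually somewhat predictable".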
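
The Function Cloning slide says the call stack keeps the last 4 entries and is represented as a 64-bit integer. One way this could work is to pack four 16-bit call-site identifiers into the integer and key each compiled clone on the pair (function address, packed call stack). The 16-bit encoding and the names `push_call`, `pop_call`, `top_site`, and `CloneTable` are assumptions of this sketch, not details from the slides.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

using Addr = std::uint64_t;
using CallSig = std::uint64_t;  // four 16-bit call-site ids, newest in the low bits

// Entering a call: shift in the new site id; the oldest of four falls off the top.
inline CallSig push_call(CallSig sig, std::uint16_t site_id) {
    return (sig << 16) | site_id;
}

// Returning: drop the newest id. The slot freed at the top is zero-filled,
// so history older than four calls is not recovered.
inline CallSig pop_call(CallSig sig) { return sig >> 16; }

// The call site that the current function instance will return to.
inline std::uint16_t top_site(CallSig sig) {
    return static_cast<std::uint16_t>(sig & 0xFFFF);
}

// Hypothetical clone table: one function instance per (address, call stack).
struct CloneTable {
    std::map<std::pair<Addr, CallSig>, int> clones;

    int clone_for(Addr function, CallSig sig) {
        assert(function != 0 && "function address must be non-null");
        const int next_id = static_cast<int>(clones.size());
        // emplace leaves an existing entry untouched and returns it.
        return clones.emplace(std::make_pair(function, sig), next_id).first->second;
    }
};
```

Because each clone is tied to one call-stack value, its return target is a known constant, which is what turns the return into a direct transfer. The table also makes the slide's "Code bloat!" warning concrete: the number of clones grows with the number of distinct call stacks that reach a function.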

