Dynamic Binary TranslationLecture OutlineMotivation: preventing buffer overrunsPreventing buffer overrun attacksDynamic buffer overrun preventionA different ideaCommercially interestingWhat is Binary Translation?When does the translation happen?Why? Translation Allows Program ModificationApplications, in more detailDynamic Program ModifiersIn more detailSlide 14HP Dynamo-RIOSystem I: Basic InterpreterTrick I: Adding a Code CacheExample Basic Block FragmentRuntime System with Code CacheLinking a Basic Block FragmentTrick II: LinkingPerformance Effect of Basic Block Cache with direct branch linkingIndirect Branch HandlingIndirect Branch LinkingTrick III: Efficient Indirect Branch HandlingPerformance Effect of indirect branch linkingTrick IV: Picking TracesPicking TracesPicking hot tracesAlternative 1: Edge profilingAlternative 2: Bit-tracing path profilingAlternative 3: Next Executing Tail (NET)NET (continued)Spec2000 Performance on Windows (w/o trace optimizations)Spec2000 Performance on Linux (w/o trace optimizations)Performance on Desktop ApplicationsPerformance BreakdownTrace optimizationsMaintaining Control (in the real world)Ras Bodik CS 164 Lecture 24 1Dynamic Binary TranslationLecture 24acknowledgement: E. Duesterwald (IBM), S. Amarasinghe (MIT)Ras Bodik CS 164 Lecture 242Lecture Outline•Binary Translation: Why, What, and When.•Why: Guarding against buffer overruns•What, when: overview of two dynamic translators:–Dynamo-RIO by HP, MIT–CodeMorph by Transmeta•Techniques used in dynamic translators–Path profilingRas Bodik CS 164 Lecture 243Motivation: preventing buffer overrunsRecall the typical buffer overrun attack:1. program calls a method foo()2. foo() copies a string into an on-stack array:–string supplied by the user–user’s malicious code copied into foo’s array –foo’s return address overwritten to point to user code3. foo() returns –unknowingly jumping to the user codeRas Bodik CS 164 Lecture 244Preventing buffer overrun attacksTwo general approaches:•static (compile-time): analyze the program –find all array writes that may outside array bounds –program proven safe before you run it•dynamic (run-time): analyze the execution–make sure no write outside an array happens–execution proven safe (enough to achieve security)Ras Bodik CS 164 Lecture 245Dynamic buffer overrun preventionthe idea, again:•prevent writes outside the intended array–as is done in Java–harder in C: must add “size” to each array•done in CCured, a Berkeley projectRas Bodik CS 164 Lecture 246A different ideaperhaps less safe, but easier to implement:–goal: detect that return address was overwritten.instrument the program so that –it keeps an extra copy of the return address:1. store aside the return address when function called (store it in an inaccessible shadow stack)2. when returning, check that the return address in AR matches the stored one; 3. if mismatch, terminate programRas Bodik CS 164 Lecture 247Commercially interesting•Similar idea behind the product by determina.com•key problem: –reducing overhead of instrumentation•what’s instrumentation, anyway?–adding statements to an existing program–in our case, to x86 executables•Determina uses binary translationRas Bodik CS 164 Lecture 248What is Binary Translation?•Translating a program in one binary format to another, for example:–MIPS x86 (to port programs across platforms)•We can view “binary format” liberally:–Java bytecode x86 (to avoid interpretation)–x86 x86 (to optimize the executable)Ras Bodik CS 164 Lecture 249When does the translation happen?•Static (off-line): before the program is run–Pros: no serious translation-time constraints •Dynamic (on-line): while the program is running–Pros: •access to complete program (program is fully linked)•access to program state (including values of data struct’s)•can adapt to changes in program behavior•Note: Pros(dynamic) = Cons(static)Ras Bodik CS 164 Lecture 2410Why? Translation Allows Program ModificationProgram Compiler Linker Loader Runtime SystemStatic Dynamic• Instrumenters• Load time optimizers • Shared library mechanism• Debuggers• Interpreters• Just-In-Time Compilers• Dynamic Optimizers• Profilers• Dynamic Checkers• instrumenters• Etc.Ras Bodik CS 164 Lecture 2411Applications, in more detail•profilers: –add instrumentation instructions to count basic block execution counts (e.g., gprof)•load-time optimizers:–remove caller/callee save instructions (callers/callees known after DLLs are linked)–replace long jumps with short jumps (code position known after linking)•dynamic checkers–finding memory access bugs (e.g., Rational Purify)Ras Bodik CS 164 Lecture 2412Dynamic Program ModifiersRunning ProgramDynamic Program Modifier:Observe/Manipulate Every Instruction in the Running ProgramHardware PlatformRas Bodik CS 164 Lecture 2413In more detailcommon setupCPUOSDLLapplicationCodeMorphOSDLLapplicationCPU=VLIWCodeMorph(Transmeta)Dynamo-RIO (HP, MIT)CPU=x86DLLapplicationDynamoOSRas Bodik CS 164 Lecture 2414Dynamic Program ModifiersRequirements::Ability to intercept execution at arbitrary pointsObserve executing instructionsModify executing instructionsTransparency - modified program is not specially preparedEfficiency - amortize overhead and achieve near-native performanceRobustnessMaintain full control and capture all code - sampling is not an option (there are security applications)Ras Bodik CS 164 Lecture 2415HP Dynamo-RIO•Building a dynamic program modifier•Trick I: adding a code cache•Trick II: linking•Trick III: efficient indirect branch handling•Trick IV: picking traces•Dynamo-RIO performance•Run-time trace optimizationsRas Bodik CS 164 Lecture 2416next VPCInstruction InterpreterSystem I: Basic Interpreter decodefetch next instruction executeexception handling update VPC Intercept execution Observe & modify executing instructions Transparency Efficiency? - up to several 100 X slowdownRas Bodik CS 164 Lecture 2417context switchBASIC BLOCK CACHEnon-control-flow instructionsTrick I: Adding a Code Cachenext VPCfetch block at VPClookup VPC emitblockexception handlingexecuteblockRas Bodik CS 164 Lecture 2418add %eax, %ecxcmp $4, %eaxjle $0x40106fadd %eax, %ecxcmp $4, %eaxjle <stub1>jmp <stub2>mov %eax, eax-slot # spill eaxmov &dstub1, %eax # store ptr to stub
View Full Document