Slide 1MotivationMotivationMotivationMotivationMotivationMotivationMotivationPossible SolutionsLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningLeak PruningState Diagram for Leak PruningOBSERVE StateOBSERVE StateSELECT StatePRUNE StateIntercepting Accesses to Pruned ReferencesEvaluationApplication + Collection OverheadGarbage Collection OverheadCompilation OverheadTolerating LeaksSlide 32Slide 33Slide 34DiscussionLeak PruningMichael Bond Kathryn McKinleyThe University of Texas at AustinPresented by Na MengMost of the slides are from Mike’s original talk. Many thanks go to the authors.MotivationMemory bugsMemory corruption: dangling pointers, double frees, buffer overflowsMemory leaks▪Lost objects: unreachable but not freed▪Useless objects: reachable but not used2Managed languagesManaged languagesMotivationMemory leaks are a real problemManaged languages do not eliminate them3ReachableUnreachableMotivationMemory leaks are a real problemManaged languages do not eliminate them4LiveLiveReachableDeadMotivationMemory leaks are a real problemManaged languages do not eliminate them5LiveLiveReachableDeadMotivationMemory leaks are a real problemManaged languages do not eliminate them6LiveLiveReachableDeadMotivationMemory leaks are a real problemManaged languages do not eliminate themSlow & crash real programs7LiveLiveDeadMotivationMemory leaks are a real problemManaged languages do not eliminate themSlow & crash real programsFixing leaks is hardLeaks take time to materializeFailure far from cause8Leaks exist in production softwareLeak PruningLeak PruningPossible SolutionsPrecisely determine liveness of objectsLiveness is in general undecidableApproximately treat stale objects as dead9Leak Pruning10LiveLiveReachableDeadLeak Pruning11LiveLiveReachableDeadLeak Pruning12LiveLiveReachableDeadLeak Pruning13LiveLiveDeadThrow OOM errorThrow OOM errorOut of memory!Out of memory!ReachableLeak Pruning14LiveLiveDeadOut of memory!Out of memory!Throw OOM errorThrow OOM errorReclaim some objectsReclaim some objectsReachableLeak PruningReclaim predicted dead objects15ReclaimedLiveLiveLeak PruningReclaim predicted dead objects16ReclaimedLiveLiveaabbLeak PruningReclaim predicted dead objectsPoison references to reclaimed objects17LiveLiveaa?Leak PruningReclaim predicted dead objectsPoison references to reclaimed objects18LiveLiveaaLeak PruningReclaim predicted dead objectsPoison references to reclaimed objects19LiveLiveaaXThrow InternalError with OOMError attachedThrow InternalError with OOMError attachedLeak PruningReclaim predicted dead objectsPoison references to reclaimed objects20LiveLiveaaXThrow InternalError with OOMError attachedThrow InternalError with OOMError attachedWorst case: defers fatal errorsWorst case: defers fatal errorsBest case: keeps leaky programs running indefinitelyBest case: keeps leaky programs running indefinitelyPreserves semanticsPreserves semanticsState Diagram for Leak Pruning21INACTIVEOBSERVESELECTPRUNEHeap not nearly full<50% Heap filled>50%Heap not fullHeap nearly full>90%Heap still nearly fullOBSERVE StateTracking stalenesso.staleCounter increments from k to k + 1 after 2k garbage collectionsRead barrier22o.staleCounterHeader001b = a.f; //Application codeif (b & 0x1){ // Read barrier //out-of-line code path t = b; // Save ref b &= ~0x1; // Clear lowest bit a.f = b;[iff a.f == t] // Atomic b.staleCounter = 0x0; //Atomic}How does staleCounter’s increment work?OBSERVE StateMaintaining edge table23a0a0b11b11srcclass tgtclassmaxStaleUsebytesUsedb22b22A B20b = a.f; //Application codeif (b & 0x1){ // Read barrier //out-of-line code path t = b; // Save ref b &= ~0x1; // Clear lowest bit a.f = b;[iff a.f == t] // Atomic b.staleCounter = 0x0; //Atomic} if (b.staleCounter > 1){//set maxStaleUse edgeTable[a.class->b.class].maxStaleUse = max(edgeTable[a.class->b.class].maxStaleUse, b.staleCounter);} if (b.staleCounter > 1){//set maxStaleUse edgeTable[a.class->b.class].maxStaleUse = max(edgeTable[a.class->b.class].maxStaleUse, b.staleCounter);}SELECT StateTransitive ClosurePhase I: in-use transitive closurePhase II: stale transitive closure24srcclass tgtclassmaxStaleUsebytesUsedB C 0E C 2a10a10b10b10c13c13rootse10e10b20b20c23c23d13d13d23d23b30b30c33c33▪Enque candidate ref if tgt.staleCounter > 2 + ref.maxStaleUse▪Compute the bytes reachable from each stale candidate6080PRUNE StateIn-use transitive closureCollector poisons each reference25a10a10b10b10c13c13rootse10e10b20b20c23c23d13d13d23d23b30b30c33c33srcclass tgtclassmaxStaleUsebytesUsedB C 0 80E C 2???Intercepting Accesses to Pruned References26a10a10b10b10rootse10e10b20b20b30b30c33c33Read barrier checks for poisoned referencesb = a.f; //Application codeif (b & 0x1){ // Read barrier //out-of-line code path if (b.staleCounter > 1){ edgeTable[a.class->b.class].maxStaleUse = max(edgeTable[a.class->b.class].maxStaleUse, b.staleCounter);} if (b & 0x2){//Check if poisoned InternalError error = new InternalError(); err.initCause(avertedOutofMemoryError); throw err; if (b & 0x2){//Check if poisoned InternalError error = new InternalError(); err.initCause(avertedOutofMemoryError); throw err;XThrow InternalError ???Evaluation27Leaking pruning added to Jikes RVM 2.9.2http://www.jikesrvm.org/Research+ArchiveGenerational Mark-Sweep in MMTkPerformance stress testNon-leaking programs: Dacapo & SPECReplay compilationLeak tolerance testLeaking programsApplication + Collection Overhead28SELECT State5% overhead on Pentium 43% overhead on Core 2Why overhead is negative for some benchmarks ?Garbage Collection OverheadOBSERVE State 5%SELECT State 14%29Compilation OverheadInsert read barrier17% on average, 34% at mostNegligible compared with overall execution time30Tolerating Leaks31Leak Leak pruning’s effectEclipse “Diff” Tolerates until 24-hr limit (>200X longer)ListLeak Tolerates until 24-hr limit (>25,000X longer)SwapLeak Tolerates until 24-hr limit (>2,200X longer)Eclipse “Copy-Paste” Most dead but some live (81X
View Full Document