The Science of Garbage CollectionAxis: Reference Counting and TracingSynthesis 1: Duality of RC and TracingRC versus TracingRC versus Tracing (continued)Big Picture 1: FixpointsSynthesis 2: HybridizationBig Picture 2: HybridsSynthesis 3: Three Big Design QuestionsQuestions1 / 11“A unified theory of garbage collection,” Bacon, Cheng, andRajan, OOPSLA ’04Presented by Donald NguyenMay 4, 2009The Science of Garbage Collection2 / 11■ Algorithms:◆ Baker’s Real-time, Generational, Older-first, Beltway, Semi-space,Mark-sweep, Mark-sweep-compact, Mark-copy, Immix, DeferredReference Counting, Ulterior Reference Counting, Cycle detection,Concurrent mark-sweep, Metronome, Boehm collector, Hoard■ Themes:◆ Tracing, Reference Counting, Generational hypothesis, increments,belts, concurrent collectors, real-time collectors, compaction, cycledetection, correctness, memory leaks, static analysis for garb agecollection, locality■ And. . .Axis: Reference Counting and Tracing3 / 11■ Naive reference counting and tracing are qualitatively differentTracing Reference CountingCollection style Batch IncrementalCost per mutationNone HighThroughputHigh LowPause timesLong ShortReal time?No YesCollects cycles?Yes No■ But, we want the best of both worlds!!!!1111!◆Why?◆ How?Synthesis 1: Duality of RC and Tracing4 / 11■ Consider instead generalizations of the algorithms:◆ Reference Counting+: decrements are batched instead of appliedimmediately◆ Tracing+: reconstruct reference counts instead of marked bits■ RC increments count whenever a pointer is installed; begins collectionwith a list of subtractions to apply■ Tracing starts with count of zero and begins collection with a list ofadditions to apply (the rootset)■ RC and Tracing are duals:◆ RC over dead data; Tracing over live data◆ For RC, initial reference count (at collection time) is anover-approx imation; for Tracing, an under-approximation◆ Over-approximation implies incremental RCRC versus Tracing5 / 11scan-by-tracing(W ): scan-by-counting(W ):while W 6= ∅: while W 6= ∅:w ← W .remove() w ← W .remove()ρ(w) ← ρ(w) + 1 ρ(w) ← ρ(w) − 1if ρ(w) = 1: if ρ(w) = 0W ← W·∪[v : (w, v) ∈ E] W ← W·∪[v : (w, v) ∈ E]sweep-for-tracing(): sweep-for-counting():for each v ∈ V : for each v ∈ V :if ρ(v) = 0: if ρ(v) = 0V F ← V F ∪ v V F ← V F ∪ vρ(v) ← 0new(x): new(x):ρ(x) ← 0 ρ(x) ← 0RC versus Tracing (continued)6 / 11dec(x):W ← W·∪ [x]inc(x):ρ(x) ← ρ(x) + 1assign(a, p):l ← haihai ← pdec(l)inc(p)Big Picture 1: Fixpoints7 / 11■ RC and Tracing compute the greatest and least fixpoint, respectively, ofthe following:ρ(x) = |[x : x ∈ R]| + |[(w, x) : (w, x) ∈ E ∧ ρ(w) > 0]| (1)VF= [v ∈ V : ρ(v) = 0] (2)Synthesis 2: Hybridization8 / 11■ Many algorithms are a hybrid of RC and Tracing applied over (potentiallylogical) segments of the heap◆ Remembered sets, write barriers, and “train”-style collectors are typesof reference counting■ Unidirectional write barriers are bidirectional reference countswhere the reverse edge represents an aggregate “may reference”relationship◆ Cycle detection and backup tracing are a form of segmentation■ If you can partition objects into a segment such that the incomingcount to the segment is zero, you can collect it independently■ RC gives incremental collection with potentially high overhead and theinability to collect cycles■ Tracing gives high-throughput at the cost of incremental collectionBig Picture 2: Hybrids9 / 11Figure 1: Generational CollectorFigure 2: Beltway CollectorSynthesis 3: Three Big Design Questions10 / 111. How to segment memory?■Pros and cons of different strategies?2. How to traverse each partition?■Ditto.3. When to apply different time-space trade-offs? E.g., semi-space versussliding compaction, pointer reversal versus stack traversal.■What other time-space mechanisms are there?Questions11 / 11■ What ideas do not fit in this abstraction?■ What other design points would be interesting to consider?■ The paper goes into examples of collection algorithms outside the contextof memory management (i.e., log-structured file systems, partial tracing).What other examples are there and how do those contexts differ from thatof memory
View Full Document