CSc 520Principles of Programming Languages39: Garbage Collection — Mark and SweepChristian CollbergDepartment of Computer ScienceUniversity of [email protected] 2005 Christian CollbergApril 22, 2005Garbage Collection Issues1 Finding the Object GraphFinding the roots: The dynamic objects in a program form a graph. Most GC algorithms need to traversethis graph. The roots of the graph can be in1. global variables2. registers3. local variables/formal parameters on the stack.Hence, the compiler must communicate to the GC which registers/variables contain roots.2 Finding the Object Graph. . .Finding internal pointers: Structured variables (arrays, records, objects) may contain internal point-ers. These must be known to the GC so that it can traverse the graph. Hence, the compiler mustcommunicate to the GC the type of each dynamic object and the internal structure of each type.Finding the beginning of objects: What happens if the only pointer to an object points somewhere inthe middle of the object? We must either be able to find the beginning of the object, or make sure thecompiler does not generate such code.13 Finding the Object Graph. . .AC%r8ExecutionStackforARPforARQforARRtemplateGGlobalsPtrs: [12]Size:96Template for PPtrs: [0]Size:4Template for MAINSize:32Ptrs: [4,12]Template for T1Ptrs: [8]Size: 24Template for T2HeapO: templateO: template8:12:O: template8:8:4 Pointer Maps• The internal structure of activation records & structured variables is described by run-time templates.• Every run-time object has an extra word that points to a type descriptor (or Temaplate), a structuredescribing which words in the object are pointers. This map is constructed at compile-time and storedstatically in the data segment of the executable.5 Pointer Maps. . .• When the GC is invoked, registers may also contain valid pointers. The compiler must therefore alsogenerate (for every point where the GC may be called) a pointer map that describes which registershold live pointers at this point. For this reason, we usually only allow the GC to run at certain points,often the points where new is called.• We must also provide pointer maps for every function call point. A function P may call Q which callsnew which invokes the GC. We need to know which words in P ’s activation record that at this pointcontain live pointers.6 Pointer Maps. . .• How does the GC look up which pointer map belongs to a particular call to procedure P at a particularaddress a? The pointer maps are indexed by the return address of P ! So, to traverse the stack ofactivation records, the GC looks at each frame, extracts the return address, finds the pointer map forthat address, and extracts each pointer according to the map.The Mark-and-Sweep Algorithm27 Algorithm: Mark and Sweep• The basic idea behind Mark-and-Sweep is to traverse and mark all the cells that can be reached fromthe root cells.• A root cell is any pointer on the stack or in global memory which points to objects on the heap.• Once all the live cells (those which are pointed to by a global variable or some other live cells) havebeen marked, we scan through the heap and separate the live data from the garbage.– If we are dealing with equal size objects only (this is the case in LISP, for example) the we scanthe heap and link all the unmarked objects onto the free list. At the same time we can unmarkthe live cells.8 Algorithm: Mark and Sweep. . .• – If we have cells of different sizes, just linking the freed objects together may result in heap frag-mentation. Instead we need to compact the heap, by collecting live cells together in a contiguousmemory area on the heap and doing the same with the garbage cells in another area.9 Algorithm: Mark and Sweep. . .Marking Phase:1. Mark all objects unmarked.2. Find all roots, i.e. heap pointers in stack, regs & globals.3. Mark reachable blocks using a depth first search starting at the roots.(a) DFS may run out of stack space!(b) Use non-recursive (Deutsch-Schorr-Waite) DFS.Scanning Phase:same-size-cells Scan heap and put un-marked (non-reachable) cells back on free-list.different-size-cells Compact the heap to prevent fragmentation.10 Marking Phase• A straight-forward implementation of mark and sweep may run into memory problems itself! A depth-first-search makes use of a stack, and the size of the stack will be the same as the depth of the objectgraph.• Remember that the stack and the heap share the same memory space, and may even grow towardseachother.• So, if we’re out of luck we might run into this situation:– the heap is full (otherwise we wouldn’t be gc:ing!),– the object graph is deep,– we run out of stack space during the marking phase.We’re now out of memory alltogether. Difficult to recover from!311 Marking Phase. . .• Fortunately, there is a smart algorithm for marking in constant space, called the Deutsch-Schorr-Waitealgorithm. Actually, it was developed simultaneously by Peter Deutsch and by Herbert Schorr andW. M. Waite.• The basic idea is to store the DFS stack in the object graph itself. When a new node (object) isencountered1. we set the “marked”-bit to 1,2. the node (object) is made to point to the previous node,3. two global variables current and previous are updated.current points to the current cell, previous to the previously visited cell.12 Marking: “Look Ma, No Stack!”• Usepointer reversal to encode the DFS stack in the object graph itself.• When the DFS reaches a new cell, change a pointer in the cell to point back to the DFS parent cell.When we can go no deeper, return, following the back links, restoring the links.(1)(2)/(3)/(1)M(2)(3)//BBMMcurrentpreviousB = Back PointerM = Marked(4)(5)//(4)(5)//M13 Marking: “Look Ma, No Stack!”. . .LOOPCASE 1: current’s fields are not Donei := next field of currentthat’s not Done;next := current↑ .fi;IF next↑ isn’t marked THENcurrent↑ .fi:= previous;previous := current;current := next;ENDIF;CASE 2: current’s fields are Donenext := current;current := previous;i := next field of currentthat’s not Done;previous := current↑ .fi;current↑ .fi:= next;ENDLOOP414 Marking: “Look Ma, No Stack!”. . .f2Done:2Marked:2√Done:2√f1f1f2Done:2f1f2f3Done:2Done:2Done:2Marked:2√Done:2√Marked:2√f1f2Done:2Done:2Marked:2nextcurrentpreviousf1f2f3Done:2Done:2Done:2Marked:2√previousf1f2Done:2Done:2Marked:2√current⇓ Case 115 Marking: “Look Ma, No Stack!”. .
View Full Document