This page describes a two-space garbage collector that can deal with cycles.
In Starlark, this pattern is used both when doing a real garbage collection, and when freezing. For both cases, it starts out with a memory block, which has pointers referring to things inside it, and ends up with a new memory block with equivalent pointers inside it. However, only pointers reachable from outside the original memory block are available in the new memory block. The garbage collector can deal with cyclic data structures and the time spent is proportional to the amount of live data in the heap (memory that is dropped is not even visited).
Given a heap with the following layout:
X := Data("world")
Y := Data("hello", X, Y)
Z := Data("universe")
All of X
, Y
and Z
are memory locations. The Y
memory location has both
some data of its own ("hello"
) and two pointers (X
and Y
itself).
The pointers from outside the heap into the heap are known as roots.
Assuming, in the above example, that Y
is the only root, then, since Y
is
used from outside, Y
must be moved to the new memory block. Consequently, the
data X
needs to be copied, but Z
can be dropped.
Following are the required steps for using a garbage collector:
-
To copy
Y
, allocate a value in the new heapA
with a sentinel value in it (that that sentinel is called aBlackhole
). Then, turnY
into aForward(A)
pointer, so that if anyone else in this cycle tries to collectY
they immediately "forward" to the new value and the data fromY
is grabbed so its pointers can be traversed. That results in the following:X := Data("world") Y := Forward(A) Z := Data("universe") A := Blackhole
With
Data("hello", X, Y)
as the current item being processed. -
Walk the pointers of the current value, performing a garbage collection on each of them. To copy
Y
, it can be seen thatY
points at aForward(A)
node, so there's no need to do anything. To copyX
, follow the process starting at step 1, but forX
(which ends up atB
). Performing that move leads to the following:X := Forward(B) Y := Forward(A) Z := Data("universe") A := Blackhole B := Data("world")
-
Replace all the pointers with the forwarded value, and write it back over the
Blackhole
inA
. This gives the following:X := Forward(B) Y := Forward(A) Z := Data("universe") A := Data("hello", B, A) B := Data("world")
-
Adjust any roots pointing at
Y
to point atA
and throw away the original heap, which produces the following:A := Data("hello", B, A) B := Data("world")
These above four steps successfully garbage collects a cyclic data structure, while preserving the cycles and getting rid of the unused data.