Could anyone point me to a good source on how to implement garbage collection?
There’s a lot of advanced material about garbage collection out there. The Garbage Collection Handbook is great. But I found there was precious little basic introductory information so I wrote some articles about it. Prototyping a mark-sweep garbage collector describes a minimal mark-sweep GC written in F#. The Very Concurrent Garbage Collector describes a more advanced concurrent collector. HLVM is a virtual machine I wrote that includes a stop-the-world collector that handles threading.
The simplest way to implement a garbage collector is:
-
Make sure you can collate the global roots. These are the local and global variables that contain references into the heap. For local variables, push them on to a shadow stack for the duration of their scope.
-
Make sure you can traverse the heap, e.g. every value in the heap is an object that implements a
Visitmethod that returns all of the references from that object. -
Keep the set of all allocated values.
-
Allocate by calling
mallocand inserting the pointer into the set of all allocated values. -
When the total size of all allocated values exceeds a quota, kick off the mark and then sweep phases. This recursively traverses the heap accumulating the set of all reachable values.
-
The set difference of the allocated values minus the reachable values is the set of unreachable values. Iterate over them calling
freeand removing them from the set of allocated values. -
Set the quota to twice the total size of all allocated values.