Post by Nic FerrierI am in the middle of building a scheme implementation that uses the
Boehm collector for GC.
The idea is that the C code that implements the scheme just uses
Boehm's malloc to allocate memory for the objects.
I made an incremental, precise, non-relocating, three-color collector,
in about 400 lines of profusely commented C code. I'm pleased with it;
it handles soft pointers gracefully and it's reasonably unintrusive.
Actually, I guess technically it's a five-color collector. Remember
the basic three-color algorithm? You start by putting your root items
into a queue and calling it the "tovisit" queue. Everything that
survived the previous round of GC is live data, and goes in the
"unvisited" queue for the current round. Then one at a time, you
take items off the tovisit queue, look at their pointers, and if
they point at anything in the unvisited queue, you move the thing
they point at to the "tovisit" queue. Then you move the things
whose pointers you've been iterating over to the "visited" queue
to get them out of the way. When you come to the end of the
tovisit queue, anything that is still in the "unvisited" queue
is garbage so you reap it and start over by moving root items
into a new "tovisit" queue. So, classically speaking, there are
three colors at any moment; visited, unvisited, and tovisit.
I added two extra colors. The first is "root," and I put root
elements in it instead of back in the visited queue after visiting
them. This means when I'm ready for the new generation, the
rootlist becomes the tovisit list and I don't have to worry
about separating them out. The second extra color is the "stale"
list, where garbage is held for one generation before being
reaped. When I'm taking things off the "tovisit" list, I can
check its soft pointers for references into the "stale" list
and set them to NULL. This way I can guarantee that soft pointers
in live data will be set to NULL before the things they point at
get reaped.
I think soft pointers are important. The problem I came up with
that I couldn't get around without them was symbols. If you don't
have soft pointers, you can't reap any symbol. Ever. Or at least
I couldn't figure out how.
The problem is that you have to have some way of finding pointers
to the symbols so you can guarantee eq? behavior and resolve
variable references; but if you store ordinary pointers to them
in a structure the GC knows about, then none of them can ever be
reaped. The solution is a root-element table of soft pointers
to symbols. If there's nothing that refers to a symbol, it goes
stale, then soft pointers to it get set NULL, then it gets
reaped.
Bear