This is the mail archive of the
mailing list for the GCC project.
Re: GCC/JIT and precise garbage collection support?
- From: David Malcolm <dmalcolm at redhat dot com>
- To: Basile Starynkevitch <basile at starynkevitch dot net>
- Cc: jit at gcc dot gnu dot org, gcc at gcc dot gnu dot org
- Date: Thu, 09 Jul 2015 21:53:44 -0400
- Subject: Re: GCC/JIT and precise garbage collection support?
- References: <559EF2F1 dot 6000000 at starynkevitch dot net>
On Fri, 2015-07-10 at 00:17 +0200, Basile Starynkevitch wrote:
> Hello All,
> (this is triggered by a question on the Ocaml mailing list asking about
> SystemZ backend in Ocaml; SystemZ is today a backend for GCC & probably
> We might want to better support good garbage collection schemes in GCC,
> particularly in GCCJIT. This is a thing that LLVM is known to be weak
> at, and we might aim to do much better. If we did, good frontends for
> good functional languages (e.g. F#, Ocaml, Haskell) could profit in the
> future.
FWIW PyPy (an implementation of Python) defaults to using true GC, and
could benefit from GC support in GCC; currently PyPy has a nasty hack
for locating on-stack GC roots, by compiling to assembler, then carving
up the assembler with regexes to build GC metadata.
(IIRC this is the --gcrootfinder=asmgcc option here:
> A good GC is very probably a precise (sometimes generational copying) GC
> with write barriers
> (read the http://gchandbook.org/ for more, or at least the wikipage
> about garbage collection). So a good GC changes pointer values.
> So we need to know where pointer values are located in the call stack
> (of the GCCJIT generated code), provide a mechanism for finding them,
> and probably provide some write barrier machinery.
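
For readers unfamiliar with the term, here is a minimal sketch of one common kind of write barrier (card marking), written as plain C. Nothing here is an existing GCC or libgccjit facility: the names `gc_write_barrier`, `card_is_dirty`, `heap` and `card_table`, and the 512-byte card size, are all assumptions made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: a fixed array of pointer-sized words. */
#define HEAP_WORDS 8192
#define HEAP_BYTES (HEAP_WORDS * sizeof(void *))
#define CARD_SHIFT 9   /* 512-byte cards; an arbitrary choice */
#define N_CARDS    ((HEAP_BYTES >> CARD_SHIFT) + 1)

static void *heap[HEAP_WORDS];
static unsigned char card_table[N_CARDS];

/* Store `value` into `*field`; if the field lives in the toy heap,
   mark the card covering it as dirty so that a later (generational)
   collection only has to rescan dirty cards. */
static void gc_write_barrier(void **field, void *value)
{
    uintptr_t off = (uintptr_t)field - (uintptr_t)heap;

    *field = value;
    /* Unsigned wrap-around makes addresses below `heap` fail this test. */
    if (off < HEAP_BYTES)
        card_table[off >> CARD_SHIFT] = 1;
}

/* Report whether the card covering `addr` has been marked dirty. */
static int card_is_dirty(const void *addr)
{
    uintptr_t off = (uintptr_t)addr - (uintptr_t)heap;

    return off < HEAP_BYTES && card_table[off >> CARD_SHIFT] != 0;
}
```

Generated code would call something like `gc_write_barrier(&obj->field, new_value)` instead of emitting a plain store; a real collector would also clear the cards after scanning them.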
> In my incomplete understanding, this requires cooperation between GCC
> backend and middle-end; it perhaps means, at the GIMPLE level, that we
> mark some trees for local variables as being required to be spilled (by
> the backend) at some well defined location in the call frame, and be
> able to query that location (i.e. its offset).
> Perhaps a possible approach might be to add, at the C front-end level,
> an extra variable attribute telling that the variable should be spilled
> always at the same offset in the call frame, to have some machinery to
> query the value of that fixed offset, and to also have a GCC builtin
> which flushes all the registers into the call frame?
> This is just food for thought and still fuzzy in my head. Comments are
> welcome (including things like we should not care at all about GC).
> Notice that if we had such support for garbage collection, the (dying)
> Java front-end could be resurrected to provide a faster GC than Boehm
> GC. And GCC based compilers for languages like Go or D which have
> garbage collection could also profit. (even MELT might take advantage of
This all sounds like a lot of work.
I think a simpler first step might be to have some kind of option to
support tracking on-stack roots; presumably some kind of late RTL pass
that writes out a stack map: const data describing what GC-pointers are
live, at each %pc range, assuming we already have enough metadata to let
a collector walk the stack frames of a thread (presumably we already
have that for e.g. backtraces). This assumes we have enough type
information at the RTL phase to be able to distinguish GC types at
different places in the frame; otherwise we would have to punt and be
imprecise.
Though that doesn't solve GC ptrs in registers.
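
To make the stack map idea a bit more concrete, here is a rough sketch in C of what such const data and its lookup might look like. This is only an illustration under assumptions: the entry layout, the `MAX_ROOTS` limit, frame-base-relative byte offsets, and the names `stack_map_entry`, `stack_map_lookup`, `visit_frame_roots` and `count_live_root` are all hypothetical, not anything GCC emits today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ROOTS 4   /* hypothetical per-entry limit */

/* One entry per pc range: which frame slots hold live GC pointers. */
struct stack_map_entry {
    uintptr_t pc_start;             /* inclusive */
    uintptr_t pc_end;               /* exclusive */
    size_t    n_roots;              /* number of live GC pointers */
    ptrdiff_t offsets[MAX_ROOTS];   /* byte offsets from the frame base */
};

/* Binary search; entries must be sorted by pc_start and not overlap. */
static const struct stack_map_entry *
stack_map_lookup(const struct stack_map_entry *map, size_t n, uintptr_t pc)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (pc < map[mid].pc_start)
            hi = mid;
        else if (pc >= map[mid].pc_end)
            lo = mid + 1;
        else
            return &map[mid];
    }
    return NULL;   /* no GC pointers recorded for this pc */
}

typedef void (*root_visitor)(void **slot, void *ctx);

/* Call `visit` on each live root slot of one frame; returns the count. */
static size_t
visit_frame_roots(const struct stack_map_entry *map, size_t n,
                  uintptr_t pc, char *frame_base,
                  root_visitor visit, void *ctx)
{
    const struct stack_map_entry *e = stack_map_lookup(map, n, pc);

    if (e == NULL)
        return 0;
    assert(e->n_roots <= MAX_ROOTS);
    for (size_t i = 0; i < e->n_roots; i++)
        visit((void **)(frame_base + e->offsets[i]), ctx);
    return e->n_roots;
}

/* Example visitor: counts the roots that are currently non-NULL. */
static void count_live_root(void **slot, void *ctx)
{
    if (*slot != NULL)
        ++*(size_t *)ctx;
}
```

A collector would call `visit_frame_roots` once per frame while walking the stack, passing that frame's return address as `pc`. Because the visitor receives the address of each slot, a moving collector could also update the pointer in place. Registers are not covered by this sketch.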
That said, fwiw I'm already fully tasked with things for GCC 6.