This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] libgccjit.so: an embeddable JIT-compiler based on GCC
- From: Xinliang David Li <davidxl at google dot com>
- To: David Malcolm <dmalcolm at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 7 Oct 2013 15:12:39 -0700
- Subject: Re: [PATCH] libgccjit.so: an embeddable JIT-compiler based on GCC
- References: <1380763968 dot 3628 dot 47 dot camel at surprise>
On Wed, Oct 2, 2013 at 6:32 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> This is very much a proof-of-concept/work-in-progress at this stage, but
> attached is a patch to GCC which aims to provide an embeddable
> JIT-compilation API, using GCC as the backend: libgccjit.so.
>
> This shared library can then be dynamically-linked into bytecode
> interpreters and other such programs that want to generate machine code
> "on the fly" at run-time.
>
> The idea is that GCC is configured with a special --enable-host-shared
> option, which leads to it being built as position-independent code. You
> would configure it with host==target, given that the generated machine
> code will be executed within the same process (the whole point of JIT).
>
> libgccjit.so is built against libbackend.a. To the rest of GCC, it
> looks like a "frontend" (in the "gcc/jit" subdir), but the parsing hook
> just runs a callback provided by client code. You can see a diagram of
> how it all fits together within the patch (see gcc/jit/notes.txt). The
> jit "frontend" requires --enable-host-shared, so it is off by default,
> so you need to configure with:
> --enable-host-shared --enable-languages=jit
> to get the jit (and see caveats below).
>
> The "main" function is in the client code. It uses a pure C API to call
> into libgccjit.so, registering a code creation hook:
>
> gcc_jit_context *ctxt;
> gcc_jit_result *result;
>
> ctxt = gcc_jit_context_acquire ();
>
> gcc_jit_context_set_code_factory (ctxt,
> some_code_making_callback, user_data);
>
> /* This actually calls into GCC and runs the build, all
> in a mutex for now, returning a result object. */
> result = gcc_jit_context_compile (ctxt);
> /* result is actually a wrapper around a DSO */
>
> /* Now that we have result, we're done with ctxt: */
> gcc_jit_context_release (ctxt);
>
> /* Look up a generated function by name, getting a void* back
> from the result object (pointing to the machine code), and
> cast it to the appropriate type for the function: */
> some_fn_type some_fn = (some_fn_type)gcc_jit_result_get_code (result,
> "some_fn");
>
> /* We can now call the machine code: */
> int val = some_fn (3, 4);
>
> /* Presumably we'd call it more than once.
> Once we're done with the code, this unloads the built DSO: */
> gcc_jit_result_release (result);
>
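To make the control flow above concrete, here is a minimal self-contained
mock of the acquire / set-callback / compile / release lifecycle. It does
not use libgccjit at all: every name prefixed with demo_ is hypothetical,
and "compiling" here only runs the client's callback and counts what it
created. It is meant to show who calls whom, nothing more.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are libgccjit names.  */
typedef struct demo_context demo_context;
typedef void (*demo_code_factory) (demo_context *ctxt, void *user_data);

struct demo_context
{
  demo_code_factory factory;  /* client hook run during "compile" */
  void *user_data;            /* opaque pointer handed back to the hook */
  int functions_created;      /* stand-in for the code being built */
};

demo_context *
demo_context_acquire (void)
{
  return calloc (1, sizeof (demo_context));
}

void
demo_context_set_code_factory (demo_context *ctxt,
                               demo_code_factory factory,
                               void *user_data)
{
  ctxt->factory = factory;
  ctxt->user_data = user_data;
}

/* Called by the client hook to record one function in the context.  */
void
demo_context_add_function (demo_context *ctxt)
{
  ctxt->functions_created++;
}

/* Runs the client hook and returns how many functions it created,
   or -1 if no hook was registered.  */
int
demo_context_compile (demo_context *ctxt)
{
  if (!ctxt->factory)
    return -1;
  ctxt->functions_created = 0;
  ctxt->factory (ctxt, ctxt->user_data);
  return ctxt->functions_created;
}

void
demo_context_release (demo_context *ctxt)
{
  free (ctxt);
}

/* Example client hook: user_data points at the number of functions
   the client wants to create.  */
void
demo_make_code (demo_context *ctxt, void *user_data)
{
  int wanted = *(int *) user_data;
  for (int i = 0; i < wanted; i++)
    demo_context_add_function (ctxt);
}

/* Drives the whole lifecycle once and returns the compile result.  */
int
demo_run (int wanted)
{
  demo_context *ctxt = demo_context_acquire ();
  assert (ctxt != NULL);
  demo_context_set_code_factory (ctxt, demo_make_code, &wanted);
  int created = demo_context_compile (ctxt);
  demo_context_release (ctxt);
  return created;
}
```

The library never needs to know what user_data points at; it only passes
the pointer back, which is what keeps the hook usable from plain C.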
> There are some major kludges in there, but it does work: it can
> successfully build code in-process 1000 times in a row [1], albeit with
> a slow memory leak, with all optimization turned off. Upon turning on
> optimizations I run into crashes owing to not properly purging all state
> within the compiler - so this is a great motivation for doing more
> state-cleanup work. I've also hacked timevars to run in "cumulative"
> mode, accumulating all timings across all iterations.
>
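The state-purging problem is easiest to see with function-local statics:
they survive across in-process compiles, and nothing outside the function
can reset them. Below is a minimal sketch of the kind of fix the ChangeLog
lists for cgraphunit.c (move the static to file scope, add a finalize
function). The demo_ names are hypothetical and this is not GCC's actual
code.

```c
#include <assert.h>

/* Before: the counter is hidden inside the function, so a second
   in-process "compile" starts from stale state.  */
int
demo_analyze_leaky (void)
{
  static int first_analyzed = 0;
  return ++first_analyzed;
}

/* After: the same state lives at file scope, where a finalize
   function can reach it.  */
static int demo_first_analyzed;

int
demo_analyze (void)
{
  return ++demo_first_analyzed;
}

/* Reset everything this file owns; meant to run between compiles.  */
void
demo_c_finalize (void)
{
  demo_first_analyzed = 0;
}

/* Simulates one compile: analyze twice, clean up, and return the
   count seen by the second analysis.  */
int
demo_compile_once (void)
{
  demo_analyze ();
  int last = demo_analyze ();
  demo_c_finalize ();
  assert (demo_first_analyzed == 0);
  return last;
}
```

With the finalize call in place every simulated compile sees the same
counts, whereas the leaky variant keeps counting up across calls.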
> The library API hides GCC's internals, and tries to be much more
> typesafe than GCC's, giving something rather like Andrew MacLeod's
> proposed changes - client code does not see "tree", instead dealing with
> types, rvalues, lvalues, jump labels, etc. It is pure C, given the
> horror stories I have heard about people dealing with C++ ABIs. FWIW I
> also have the beginnings of Python bindings for the library (doing the
> interface as pure C makes language-bindings easier), though that would
> probably live in a separate repository (so not part of this patch).
>
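As an illustration of the type-safety point: one way a pure C API can hide
a single untyped node kind (like "tree") is to wrap it in distinct struct
types, so that passing an rvalue where a type is expected is diagnosed by
the C compiler instead of failing at run time. This is only a sketch with
hypothetical demo_ names; it is not taken from libgccjit.h, where the
struct bodies would presumably not be visible to clients at all.

```c
#include <assert.h>
#include <stdlib.h>

/* Internal representation: one untyped node, in the spirit of "tree".  */
enum demo_node_kind { DEMO_TYPE, DEMO_RVALUE };
struct demo_node { enum demo_node_kind kind; int payload; };

/* Public wrappers: distinct C types around the same node.  */
typedef struct demo_type { struct demo_node node; } demo_type;
typedef struct demo_rvalue
{
  struct demo_node node;
  demo_type *type;
} demo_rvalue;

static demo_type demo_int_type = { { DEMO_TYPE, 0 } };

demo_type *
demo_get_int_type (void)
{
  return &demo_int_type;
}

/* Only accepts a demo_type *; handing it a demo_rvalue * is an
   incompatible-pointer diagnostic at compile time.  */
demo_rvalue *
demo_new_rvalue (demo_type *type, int value)
{
  demo_rvalue *rv = malloc (sizeof *rv);
  if (!rv)
    return NULL;
  rv->node.kind = DEMO_RVALUE;
  rv->node.payload = value;
  rv->type = type;
  return rv;
}

int
demo_rvalue_get_value (const demo_rvalue *rv)
{
  return rv->node.payload;
}

demo_type *
demo_rvalue_get_type (const demo_rvalue *rv)
{
  return rv->type;
}

/* Builds an int rvalue, reads it back, and frees it.  */
int
demo_round_trip (int value)
{
  demo_rvalue *rv = demo_new_rvalue (demo_get_int_type (), value);
  assert (rv != NULL);
  assert (demo_rvalue_get_type (rv) == demo_get_int_type ());
  int result = demo_rvalue_get_value (rv);
  free (rv);
  return result;
}
```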
> The API deliberately uses C terminology, given that it's likely that the
> user will want to be plugging the JIT-generated code into a C/C++
> program (or library).
>
> I've been able to successfully use this API to add JIT-compilation to a
> toy bytecode interpreter:
> https://github.com/davidmalcolm/jittest
> (where regvm.cc uses this API to compile a bytecode function into
> machine code).
>
> There's a DejaGnu-based test suite, which I can invoke via:
> make check-parallel-jit RUNTESTFLAGS=""
> (potentially with some -v verbosity options in RUNTESTFLAGS), giving
> # of expected passes 144
> and no failures on this box.
>
> Various caveats:
> * Currently it only supports a small subset of C-like code.
> * The API is still in flux: I'm not convinced by the label-placement
> approach; I suspect having an explicit "block" type may be easier for
> users to deal with.
> * The patch is against r202664, which is a little out-of-date
> (2013-09-17), but I'm interested in feedback rather than perfection at
> this stage.
> * I'm running into configure/Makefile issues with
> --enable-host-shared, where CFLAGS contains -fPIC, but only on
> invocations of leaf Makefiles, not on recursive "make" - so it works if
> you cd into $builddir/gcc and make (and so on for libcpp etc), but not
> from the top-level builddir. Hence building the thing is currently
> unreliable (but again, I'm interested in feedback rather than
> perfection). Help with configure/Makefiles would be appreciated!
> * There are some grotesque kludges in internal-api.c, especially in
> how we go from .s assembler files to a DSO (grep for "gross hack" ;) )
> * There are some changes to the rest of GCC that are needed by the JIT
> code. Some of this is state removal. Some of the changes are gross,
> some are probably reasonable.
> * Only tested so far on Fedora and RHEL x86_64 boxes.
>
> Hopefully this is of interest to other GCC people.
>
> Shall I get this into a "jit" branch? I greatly prefer git to svn, so
> I'd probably do:
> http://gcc.gnu.org/wiki/GitMirror#Git-only_branches
> assuming that this allows a sane path to (I hope) eventual merger.
>
> Thoughts?
Neat.
Thinking further ahead, it might be better to leave '_jit_' out of the API
names -- the APIs can be used by any frontend, including alternate
ones for C/C++. The APIs can also be used by other consumers, such as
a bitcode writer.
thanks,
David
> Dave
>
> Current Changelog.jit follows inline:
> /
> * configure.ac: Add --enable-host-shared
> * configure: Regenerate.
>
> gcc/
> * Makefile.in (LIBIBERTY): Use pic build of libiberty.a if
> configured with --enable-host-shared.
> (BUILD_LIBIBERTY): Likewise.
> * cgraph.c (cgraph_c_finalize): New.
> * cgraph.h (symtab_c_finalize): New declaration.
> (cgraph_c_finalize): Likewise.
> (cgraphunit_c_finalize): Likewise.
> (cgraphbuild_c_finalize): Likewise.
> (ipa_c_finalize): Likewise.
> (predict_c_finalize): Likewise.
> (varpool_c_finalize): Likewise.
> * cgraphbuild.c (cgraphbuild_c_finalize): New.
> * cgraphunit.c (first_analyzed): Move from analyze_functions
> to file-scope.
> (first_analyzed_var): Likewise.
> (analyze_functions): Move static variables into file-scope.
> (cgraphunit_c_finalize): New.
> * configure.ac: Add --enable-host-shared, adding -fPIC.
> * configure: Regenerate.
> * dwarf2out.c (dwarf2out_c_finalize): New.
> * dwarf2out.h (dwarf2out_c_finalize): Declare.
> * ggc-page.c (init_ggc): Make idempotent.
> * ipa-pure-const.c (function_insertion_hook_holder): Move to be
> a field of class pass_ipa_pure_const.
> (node_duplication_hook_holder): Likewise.
> (node_removal_hook_holder): Likewise.
> (register_hooks): Convert to method...
> (pass_ipa_pure_const::register_hooks): ...here, converting
> static variable init_p into...
> (pass_ipa_pure_const::init_p): ...new field.
> (pure_const_generate_summary): Update invocation of
> register_hooks to invoke as a method of current_pass.
> (pure_const_read_summary): Likewise.
> (propagate): Convert to...
> (pass_ipa_pure_const::execute): ...method.
> * ipa.c (ipa_c_finalize): New.
> * main.c (main): Update usage of toplev_main.
> * params.c (global_init_params): Make idempotent.
> * passes.c (execute_ipa_summary_passes): Set current_pass.
> * predict.c (predict_c_finalize): New.
> * stringpool.c (init_stringpool): Clean up if we're called more
> than once.
> * symtab.c (symtab_c_finalize): New.
> * timevar.c (timevar_init): Ignore repeated calls.
> * timevar.def (TV_CLIENT_CALLBACK): Add.
> (TV_ASSEMBLE): Add.
> (TV_LINK): Add.
> (TV_LOAD): Add.
> * toplev.c (do_compile): Add parameter (const toplev_options *);
> use it to avoid starting/stopping/reporting timevar TV_TOTAL
> for the case where toplev_main does not encompass all timevars.
> (toplev_main): Add parameter (const toplev_options *); pass it
> to do_compile.
> (toplev_finalize): New.
> * toplev.h (struct toplev_options): New.
> (toplev_main): Add parameter (const toplev_options *).
> (toplev_finalize): New.
> * varpool.c (varpool_c_finalize): New.
>
> gcc/jit/
> * Make-lang.in: New.
> * TODO.rst: New.
> * config-lang.in: New.
> * dummy-frontend.c: New.
> * internal-api.c: New.
> * internal-api.h: New.
> * libgccjit.c: New.
> * libgccjit.h: New.
> * libgccjit.map: New.
> * notes.txt: New.
>
> gcc/testsuite/
> * jit.dg: New subdirectory.
> * jit.dg/harness.h: New.
> * jit.dg/jit.exp: New.
> * jit.dg/test-accessing-struct.c: New.
> * jit.dg/test-calling-external-function.c: New.
> * jit.dg/test-dot-product.c: New.
> * jit.dg/test-factorial.c: New.
> * jit.dg/test-failure.c: New.
> * jit.dg/test-fibonacci.c: New.
> * jit.dg/test-hello-world.c: New.
> * jit.dg/test-string-literal.c: New.
> * jit.dg/test-sum-of-squares.c: New.
>
> libbacktrace/
> * configure.ac: Add --enable-host-shared.
> * configure: Regenerate.
>
> libcpp/
> * configure.ac: Add --enable-host-shared.
> * configure: Regenerate.
>
> libdecnumber/
> * configure.ac: Add --enable-host-shared.
> * configure: Regenerate.
>
> libiberty/
> * configure.ac: If --enable-host-shared, use -fPIC.
> * configure: Regenerate.
>
> zlib/
> * configure.ac: Add --enable-host-shared.
> * configure: Regenerate.
>
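Several of the gcc/ entries above ("Make idempotent", "Ignore repeated
calls") follow the same pattern: an init routine that used to assume it
runs once per process now has to tolerate being called on every in-process
compile. A minimal sketch of that guard, combined with timings that
accumulate across compiles, might look like the following. All demo_ names
are hypothetical; this is not the code in timevar.c.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_timevar_initialized;
static int demo_init_count;    /* how many times real setup ran */
static long demo_total_ticks;  /* cumulative across all compiles */

/* Safe to call on every compile: only the first call does any work.  */
void
demo_timevar_init (void)
{
  if (demo_timevar_initialized)
    return;
  demo_timevar_initialized = true;
  demo_init_count++;
  demo_total_ticks = 0;
}

void
demo_timevar_add (long ticks)
{
  demo_total_ticks += ticks;
}

/* Simulates one compile that took TICKS; returns the running total.  */
long
demo_compile (long ticks)
{
  demo_timevar_init ();
  assert (demo_timevar_initialized);
  demo_timevar_add (ticks);
  return demo_total_ticks;
}
```

Because the guard skips the reset on later calls, the total keeps growing
from one compile to the next, which is the "cumulative" behaviour described
for timevars earlier in the mail.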