This is the mail archive of the mailing list for the GCC project.


New GCC plugin: gcc-python-plugin

I've been working on a new plugin for GCC, which supports embedding
Python within GCC, exposing GCC's internal data structures as Python
objects and classes.

The plugin links against libpython, and (I hope) allows you to invoke
arbitrary Python scripts from inside a compile.  My aim is to allow
people to write GCC "plugins" as Python scripts, and to make it much
easier to prototype new GCC features (Python is great for doing this
kind of thing).

The plugin is Free Software, licensed under the GPLv3 (or later).

The code can be seen here:;a=summary

and the website for the plugin is the Trac instance here:

The documentation is in the "docs" subdirectory (using sphinx).  You can
see a pre-built HTML version of the docs here:

It's still at the "experimental proof-of-concept" stage; expect crashes
and tracebacks.  (I'm new to the insides of GCC, and I may have
misunderstood things.  I'm entirely ignoring the garbage collector, and
I've also used a few entrypoints that aren't yet exposed in the plugin.)

It's already possible to use this to add additional compiler
errors/warnings, e.g. domain-specific checks, or static analysis.

One of my goals for this is to "teach" GCC about the common mistakes
people make when writing extensions for CPython [1], but it could be
used for other things, e.g.:
  - to teach GCC about GTK's reference-counting semantics
  - to check locking in the Linux kernel
  - to check signal-safety in APIs, etc.
  - rapid prototyping

Other ideas include visualizations of code structure.   There are handy
methods for plotting control flow graphs (using graphviz), showing the
source code interleaved with GCC's internal representation, such as the
one here:

It could also be used to build a more general static-analysis tool.

The CPython API checker has the beginnings of this:

Example output:

test.c: In function ‘leaky’:
test.c:21:10: error: leak of PyObject* reference acquired at call to
PyList_New at test.c:21 [-fpermissive]
  test.c:22: taking False path at     if (!list)
    test.c:24: reaching here     item = PyLong_FromLong(42);
  test.c:27: taking True path at     if (!item)
  test.c:21: returning NULL

There are numerous caveats right now (e.g. how I deal with loops is
really dubious).  It's disabled for now within the source tree (I need
to fix my selftests to pass again...).  It could perhaps be generalized
to detect e.g. {malloc, FILE*, fd} leaks, array bounds violations, int
overflow, etc., but obviously that's a far bigger task.

So far, I'm just doing a limited form of "abstract interpretation" (or,
at least, my understanding of that term): dealing with explicit finite
prefixes of traces of execution, tracking abstract values (e.g.
NULL-ptr vs non-NULL-ptr), and stopping when the trace loops.  Stopping
there is just an easy way to guarantee termination, not a good one, but
I hope it's good enough for my use-case.  Plus it ought to make it
easier to generate highly-readable error messages.
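
To make the idea concrete, here is a minimal Python sketch of that kind
of trace exploration.  It is not the plugin's code and does not use the
plugin's API: the toy control flow graph, the block names and the
explore() helper are all hypothetical, made up for illustration.  It
tracks a single abstract value for a pointer "p" (NULL vs non-NULL),
follows every path through the graph, and ends a trace as soon as it
would revisit a block.

```python
# Abstract values for a single pointer `p`.
NULL, NONNULL, UNKNOWN = "NULL", "non-NULL", "unknown"

# Hypothetical toy CFG.  Each block maps to a list of
# (successor, value assumed for `p` on that edge); None means the edge
# tells us nothing new about `p`.
CFG = {
    "entry": [("check", None)],
    "check": [("use", NONNULL), ("fail", NULL)],  # models: if (p) ... else ...
    "use":   [("check", None), ("exit", None)],   # back edge to "check", or leave
    "fail":  [("exit", None)],
    "exit":  [],
}


def explore(cfg, start="entry"):
    """Enumerate finite trace prefixes through cfg.

    Returns a list of (trace, abstract value of p, reason) tuples, where
    reason is "exit" (ran off the end) or "loop" (trace would revisit a
    block, so we stop there to guarantee termination).
    """
    finished = []
    stack = [([start], UNKNOWN)]
    while stack:
        trace, value = stack.pop()
        successors = cfg[trace[-1]]
        if not successors:
            finished.append((trace, value, "exit"))
            continue
        for succ, assumed in successors:
            # Skip edges that contradict what this trace already knows.
            if assumed is not None and value not in (UNKNOWN, assumed):
                continue
            if succ in trace:
                # The trace loops: record the prefix and stop extending it.
                finished.append((trace + [succ], value, "loop"))
                continue
            new_value = assumed if assumed is not None else value
            stack.append((trace + [succ], new_value))
    return finished


for trace, value, reason in explore(CFG):
    print(" -> ".join(trace), "| p is", value, "| stopped:", reason)
```

Because every finished trace is an explicit list of blocks, a checker
built this way can report the exact path that led to a problem, much
like the "taking False path at ..." lines in the example output above.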

Thanks to Red Hat for allowing me to devote a substantial chunk of
$DAYJOB to this over the last couple of months.

I hope this will be helpful to both the GCC and Python communities.


[1] see
