RFC: elimination of global state relating to passes

Richard Biener richard.guenther@gmail.com
Thu Apr 4 10:45:00 GMT 2013


On Wed, Apr 3, 2013 at 6:13 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> I'm working on my first gcc contribution, but it's a large patch, and I
> wanted to sound things out on this list.
>
> I want to eliminate/minimize global state within gcc, since I think
> doing so is a key part of making gcc more modular.
>
> Currently there's a lot of global state associated with passes:
> * the pass tree itself: a single global tree of passes, with callbacks
> ("gate" and "execute")
> * globals within individual pass .c files
>
> My plan is to instead have the passes be dynamically created instances
> of pass subclasses, where each pass has a custom subclass of the
> appropriate pass class:
>
> So, for example, mudflap would become something like:
>
>    class pass_mudflap_2 : public gimple_opt_pass
>    {
>      public:
>       bool gate() { return gate_mudflap(); }
>       unsigned int execute() { return execute_mudflap_function_ops(); }
>    };
>
> where these subclasses are hidden inside the .c files (in this case
> inside gcc/tree-mudflap.c).
>
> All that's exposed to the headers would then be a factory function:
>
>   gimple_opt_pass *
>   make_pass_mudflap_2 (context &ctxt)
>   {
>      return new pass_mudflap_2 (ctxt);
>   }
>
> Globals within a .c file that are specific to a particular pass may then
> be movable to instance data of the pass subclass (on a case-by-case
> basis).
>
> Each pass also has a reference back to a new "context" class, shared by
> all the passes: this is currently empty, but I see it being useful as a
> place to add any global state that isn't part of the pass itself: given
> that this patch is touching hundreds of files I'd rather add it now,
> rather than having to go back and add it later.
>
> I started doing this by hand, however, there are hundreds of passes, so
> to avoid lots of error-prone typing I've written a refactoring script.
>
> You can see the test suite for the refactoring script here:
> https://github.com/davidmalcolm/gcc-refactoring-scripts/blob/master/test_refactor.py
> which should give a better idea of the before/after when applying the
> refactoring (and the use of a test suite hopefully increases the
> credibility that the resulting patch doesn't change the effective
> behavior of the code).
>
> When run, the refactoring script generates this
> 107 files changed, 5008 insertions(+), 4214 deletions(-)
>
> There's also a hand-written part of the patch, which I'm attaching the
> most pertinent parts of i.e. gcc/tree-pass.h

Sounds like a good plan - though I doubt moving "global" pass state
into pass class members will be easy for refactoring (all functions
refering to that data would need to become member functions).

Eventually having both mechanisms in place for a transition time
and convert each pass separately is easier to review.

> It doesn't quite compile yet, gcc/passes.c needs work - in particular
> the code to construct and manage the tree of passes (I think I can do
> this, but it may need moving the bulk of init_optimization_passes() into
> a new passes.def file so that I can include it again elsewhere with a
> different defintion of NEXT_PASS).   There's also a new "pipeline" class
> representing the pass hierarchy, which stores the pointers to the pass
> instances, so that it's easy to access them from gdb when debugging.

One of my silly plans was to make the tree of passes dynamic - thus
no longer static in source as it is right now.  Mainly to allow easier
experiments with re-ordering of passes but also to allow "dynamic"
pass scheduling, like do { pass1, pass2, ... } until nothing changes.
Basically have a tiny pass scripting language ;)  Your patches would
make that a lot easier.

One complication is that with dynamic scheduling dumpfile names
suddenly are no longer predictable which is a problem for the testsuite.

> One wart in my plan is that a lot of the existing callbacks are checked
> against NULL and if non-NULL, then extra things happen before/after the
> call.
>
> I don't know of a portable way to do this for a C++ virtual function, so
> each callback becomes *two* vtable entries:
>   bool has_FOO()   // equivalent to (pass->FOO != NULL) in old code
> and
>   impl_FOO()       // equivalent to (pass->FOO ()) in old code

Ick.  It might be easier to figure out what semantics the NULL check
dependent code has and implement that in a more sensible way.

> I'm currently working off of this git repo:
>   git://gcc.gnu.org/git/gcc.git
> in the hope of getting this into 4.9 during stage 1.
>
> Hope this is constructive

For sure.

I originally thought of using namespaces to fend off pass-specific globals,
but eventually moving it all to a pass specific class would work, too.

Note that there are some passes that have global data that is allocated
and initialized once per compilation, so the pass class objects will
probably not be short-lived (like construct / destruct per pass invocation
which would be the cleanest design IMHO, at least for pass local data
lifetime management ...).  Thus eventually the 'pass' object should be
split into a global and per invocation object?

Richard.

> Dave



More information about the Gcc-patches mailing list