This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 000/236] Introduce rtx subclasses


On 08/06/14 11:19, David Malcolm wrote:

The aim of the patch series is to improve the type-safety and
readability of the backend by introducing subclasses of rtx (actually
rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.

That way we can document directly in the code the various places that
manipulate insn chains vs other kinds of rtx node.
Right. And by catching this stuff at compile time developers working on RTL bits should become at least a tiny bit more efficient.


The class hierarchy looks like this (using indentation to show
inheritance, and indicating the invariants):

class rtx_def;
   class rtx_expr_list;           /* GET_CODE (X) == EXPR_LIST */
   class rtx_insn_list;           /* GET_CODE (X) == INSN_LIST */
   class rtx_sequence;            /* GET_CODE (X) == SEQUENCE */
   class rtx_insn;                /* INSN_CHAIN_CODE_P (GET_CODE (X)) */
     class rtx_real_insn;         /* INSN_P (X) */
       class rtx_debug_insn;      /* DEBUG_INSN_P (X) */
       class rtx_nonjump_insn;    /* NONJUMP_INSN_P (X) */
       class rtx_jump_insn;       /* JUMP_P (X) */
       class rtx_call_insn;       /* CALL_P (X) */
     class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
     class rtx_barrier;           /* BARRIER_P (X) */
     class rtx_code_label;        /* LABEL_P (X) */
     class rtx_note;              /* NOTE_P (X) */
So I think the only real question here is do we want to distinguish between rtx_real_insn and the others?

When I see "rtx_real_insn" my first thought is something that's going to turn into an instruction. But that's not true in this hierarchy due to rtx_debug_insn.

So throughout the work-to-date, did you find much use for rtx_real_insn?


The patch series also contains some cleanups using inline methods:

   * being able to get rid of this boilerplate everywhere that jump tables
     are handled:

       if (GET_CODE (PATTERN (table)) == ADDR_VEC)
         vec = XVEC (PATTERN (table), 0);
       else
         vec = XVEC (PATTERN (table), 1);

    in favor of a helper method (inlined):

       vec = table->get_labels ();
Right. I suspect there's other cleanups like this all over the place that we could do and probably should in the interest of readability.





   * having a subclass for EXPR_LIST allows for replacing this kind of
     thing:

       for (x = forced_labels; x; x = XEXP (x, 1))
         if (XEXP (x, 0))
            set_label_offsets (XEXP (x, 0), NULL_RTX, 1);

     with the following, which captures that it's an EXPR_LIST, and makes
     it clearer that we're simply walking a singly-linked list:

       for (rtx_expr_list *x = forced_labels; x; x = x->next ())
         if (x->element ())
           set_label_offsets (x->element (), NULL_RTX, 1);
And would could argue here that we really just want to utilize some standard list container. I'm not sure what we're getting by rolling our own. When we discussed this a few weeks ago, I probably should have pushed a bit harder on that. But I think we can/should table that discussion. What you've done is a clear step forward and I think should be reviewed on those merits. If someone wants to go another step forward and use standard containers, that can be dealt with independently.



There are some surface details to the patches:

   * class names.  The subclass names correspond to the lower_case name
     from rtl.def, with an "rtx_" prefix.  "rtx_insn" and "rtx_real_insn"
     don't correspond to concrete node kinds, and hence I had to invent
     the names.  (In an earlier version of the patches these were
     "rtx_base_insn" and "rtx_insn" respectively, but the former occurred
     much more than the latter and so it seemed better to use the shorter
     spelling for the common case).
I can live with the existing class names.


   * there's a NULL_RTX define in rtl.h.   In an earlier version of the
     patch I added a NULL_INSN define, but in this version I simply use
     NULL, and I'm in two minds about whether a NULL_INSN is desirable
     (would we have a NULL_FOO for all of the subclasses?).  I like having
     a strong distinction between arbitrary RTL nodes vs instructions,
     so maybe there's a case for NULL_INSN, but not for the subclasses?
No strong opinion here. I think we added NULL_TREE/NULL_RTX. I could possibly see extending that to the overall concept of "insn chain things", but I think doing one for each subclass would probably be overkill.


   * I added an "rtx_real_insn" subclass for the INSN_P predicate, adding
     the idea of a PATTERN, a basic_block, and a location - but I hardly
     use this anywhere.  That said, it seems to be a real concept in the
     code, so I added it.
So DEBUG_INSN has a pattern and that's how it got into rtx_real_insn. Hmmm. Would agree it's a real concept in the code, but as much as I hate bikeshedding on names, this is one that just doesn't fit well. If you didn't use it much, maybe it's fixable without an enormous amount of renaming work.


     So in the following patches the pointerness is explicit: the patches
     refer to:
        rtx_insn *insn;
     rather than just:
        rtx_insn insn;
     and so one can write:
        const rtx_insn *insn
     and the "constness" applies to the insn, not to the pointer.
Seems like the right direction to me.



   * Should as_a <rtx_insn *> accept a NULL pointer?  It's possible to make
     the is_a_helper cope with NULL, but this adds an extra conditional.
     I instead added an as_a_nullable<> cast, so that you can explicitly
     add a check against NULL before checking the code of the rtx node.
     But maybe it's cleaner to simply have is_a_helper<rtx_insn *> cope
     with NULL?
I'd think not.



Some deeper questions:

   * should rtx_insn eventually be a separate class from rtx, separating
     insn chain nodes from rtx nodes?  I don't know if that's a worthwhile
     longterm goal, but this patch series gets us somewhere closer to
     being able to achieve that.  (actually getting there would be a much
     more invasive set of patches).
I think so and it generally matches what we've done with gimple. One could even argue that the insn chain itself really should just be a standard container class, but I think that's much further out.


To keep this reviewable, and to try to mitigate bitrot, I've chopped it up
into numerous relatively small patches.  The aim is that at every patch,
the code correctly builds on all supported configurations.  That said,
there's a complicated dependency graph of types in gcc's code, so to
tame that, the patch series is divided into 6 phases:
So from a staging standpoint I think you should install as we go rather than waiting for the entire set to be approved. My worry is that given the size, by the time we get it reviewed & updated, there'll have been more bitrot and you end up iterating quite a bit just because it takes time to fixup details in your patchkit and the code is continually changing under you.



   * phase 1 adds "scaffolding": in various places, strengthen the return
     types from internal APIs and macros so as to promise an rtx_insn *
     rather than a plain rtx.  For example, the DEP_PRO/DEP_CON macros
     in sched-int.h which lookup fields within struct _dep become inline
     functions that return rtx_insn * (using checked casts).  In this way,
     stronger type information can be used by subsequent patches whilst
     avoiding the chicken-and-egg issue since writes to the fields can
     still be arbitrary rtx nodes, according to the type system at least.
Right.


   * phase 2: an alphabetical tour of the backend: a series of patches,
     from alias.c through web.c, each patch touching one file,
     strengthening the types within it as much as we can at that point.
     The patches sometimes make use of the alphabetic ordering in order to
     use APIs that have already been strengthened to work on rtx_insn.
OK. Good to know that patch later patch in this part may depend on the API guarantees from an earlier patch.


   * phase 3: similar to phase 2, but for the various config
     subdirectories.
Right. I'd think in phase 3, each backend ought to be independent. So various backend changes ought to be able to go in, even if there's something contentious in a different config directory.

   * phase 4: tears down the scaffolding, replacing checked casts as much
     as possible by strengthening core fields of core types.  For example,
     we eventually reach the point in sched-int.h where the fields "pro"
     and "con" within struct _dep can become rtx_insn *, and hence
     DEP_PRO/DEP_CON can be converted back to plain macros, without needing
     the checked cast.
Right.



   * phase 5: all of the above was for instructions; this phase adds three
     subclasses for other node kinds.  I experimented with subclasses for
     various node kinds; these three seemed most appropriate: EXPR_LIST,
     INSN_LIST, SEQUENCE.  I kept these as a separate phase as Jeff asked
     me to separate them from the instruction patches, to avoid
     complicating things, but I think these three are also a worthwhile
     cleanup with relatively little complexity.
Ok.  Wished we'd talked more about this, but it's not the end of the world.


   * phase 6: (new since my Cauldron talk): this phase freely uses
     EXPR_LIST, INSN_LIST, SEQUENCE to do cleanups that were awkward
     without them.  In particular, by the end of this phase, NEXT_INSN()
     and PREV_INSN() require rtx_insn * rather than plain rtx.
Woo Wooo!

So out of curiosity, any thoughts on what other big things are out there that need to be fixed. I'm keen to keep this stuff moving as much as possible.


The patches (and my control for testing) has been on top of r211061, which
being from 2014-05-29 is now two months old, but hopefully is still
reviewable; naturally I'll perform bootstrap and regression testing for
each patch in turn before committing.
It's still reviewable, though you'll likely find some minor changes will be needed just due to churn. The iterator/for_each_rtx work comes immediately to mind, but I'm sure there'll be others.

A part of me really wonders if the 6 phases above should have been submitted independently. ie, once the scaffolding was done why not go ahead and get that reviewed & installed, then move on to phase2 patches. I bring it up more for future work of a similar nature.

I realize that you had to do a fair amount of the later work to ensure the scaffolding was right and so that we could see what the end result would likely look like. But something feels like it could be staged better.

Anyway...



Performance
===========

I tested the performance with --enable-checking=release using
two large files (kdecore.cc, bigcode.c), comparing a control build
to a patched build.

There were no significant differences in compilation time:
OK. A bit of a surprise, but then again perhaps not because we don't enable RTL checking by default. And I suspect a goodly amount of the RTL checking overhead is in the operands, not on the chain itself.




As for the performance of a regular build i.e. with as_a<>
checks *enabled*; looking at the wallclock time taken for a bootstrap and
regression test, for my s390 builds (with -j3) I saw:

s390 control:
   "make" time: 68 mins
   "make check" time: 122 mins
   total time: 190 mins

s390 experiment:
   "make" time: 70 mins
   "make check" time: 126 mins
   total time: 196 mins

showing a 3% increase, presumably due to the as_a and as_a_nullable
checks.
You want to be real careful benchmarking on ppc/s390 hardware as they're partitioned and what goes on in a different partition can affect your partition.



i.e. a release build shows no change in performance; a debug build shows
a 3% increase in time taken to bootstrap and regression test.  I believe
the debug build could be sped up with further patches to eliminate the
checked casts.
Yea. As I've mentioned, my vision is to try and push out the casts as much as reasonably possible.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]