This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH 000/236] Introduce rtx subclasses

From: Jeff Law <law at redhat dot com>
To: David Malcolm <dmalcolm at redhat dot com>, gcc-patches at gcc dot gnu dot org
Date: Tue, 12 Aug 2014 14:39:44 -0600
Subject: Re: [PATCH 000/236] Introduce rtx subclasses
Authentication-results: sourceware.org; auth=none
References: <1407345815-14551-1-git-send-email-dmalcolm at redhat dot com>

On 08/06/14 11:19, David Malcolm wrote:


The aim of the patch series is to improve the type-safety and
readability of the backend by introducing subclasses of rtx (actually
rtx_def) for *instructions*, and also for EXPR_LIST, INSN_LIST, SEQUENCE.

That way we can document directly in the code the various places that
manipulate insn chains vs other kinds of rtx node.

Right. And by catching this stuff at compile time developers working onRTL bits should become at least a tiny bit more efficient.


The class hierarchy looks like this (using indentation to show
inheritance, and indicating the invariants):

class rtx_def;
   class rtx_expr_list;           /* GET_CODE (X) == EXPR_LIST */
   class rtx_insn_list;           /* GET_CODE (X) == INSN_LIST */
   class rtx_sequence;            /* GET_CODE (X) == SEQUENCE */
   class rtx_insn;                /* INSN_CHAIN_CODE_P (GET_CODE (X)) */
     class rtx_real_insn;         /* INSN_P (X) */
       class rtx_debug_insn;      /* DEBUG_INSN_P (X) */
       class rtx_nonjump_insn;    /* NONJUMP_INSN_P (X) */
       class rtx_jump_insn;       /* JUMP_P (X) */
       class rtx_call_insn;       /* CALL_P (X) */
     class rtx_jump_table_data;   /* JUMP_TABLE_DATA_P (X) */
     class rtx_barrier;           /* BARRIER_P (X) */
     class rtx_code_label;        /* LABEL_P (X) */
     class rtx_note;              /* NOTE_P (X) */

So I think the only real question here is do we want to distinguishbetween rtx_real_insn and the others?

When I see "rtx_real_insn" my first thought is something that's going toturn into an instruction. But that's not true in this hierarchy due tortx_debug_insn.


So throughout the work-to-date, did you find much use for rtx_real_insn?


The patch series also contains some cleanups using inline methods:

   * being able to get rid of this boilerplate everywhere that jump tables
     are handled:

       if (GET_CODE (PATTERN (table)) == ADDR_VEC)
         vec = XVEC (PATTERN (table), 0);
       else
         vec = XVEC (PATTERN (table), 1);

    in favor of a helper method (inlined):

       vec = table->get_labels ();

Right. I suspect there's other cleanups like this all over the placethat we could do and probably should in the interest of readability.


   * having a subclass for EXPR_LIST allows for replacing this kind of
     thing:

       for (x = forced_labels; x; x = XEXP (x, 1))
         if (XEXP (x, 0))
            set_label_offsets (XEXP (x, 0), NULL_RTX, 1);

     with the following, which captures that it's an EXPR_LIST, and makes
     it clearer that we're simply walking a singly-linked list:

       for (rtx_expr_list *x = forced_labels; x; x = x->next ())
         if (x->element ())
           set_label_offsets (x->element (), NULL_RTX, 1);

And would could argue here that we really just want to utilize somestandard list container. I'm not sure what we're getting by rolling ourown. When we discussed this a few weeks ago, I probably should havepushed a bit harder on that. But I think we can/should table thatdiscussion. What you've done is a clear step forward and I think shouldbe reviewed on those merits. If someone wants to go another stepforward and use standard containers, that can be dealt with independently.


There are some surface details to the patches:

   * class names.  The subclass names correspond to the lower_case name
     from rtl.def, with an "rtx_" prefix.  "rtx_insn" and "rtx_real_insn"
     don't correspond to concrete node kinds, and hence I had to invent
     the names.  (In an earlier version of the patches these were
     "rtx_base_insn" and "rtx_insn" respectively, but the former occurred
     much more than the latter and so it seemed better to use the shorter
     spelling for the common case).

I can live with the existing class names.


   * there's a NULL_RTX define in rtl.h.   In an earlier version of the
     patch I added a NULL_INSN define, but in this version I simply use
     NULL, and I'm in two minds about whether a NULL_INSN is desirable
     (would we have a NULL_FOO for all of the subclasses?).  I like having
     a strong distinction between arbitrary RTL nodes vs instructions,
     so maybe there's a case for NULL_INSN, but not for the subclasses?

No strong opinion here. I think we added NULL_TREE/NULL_RTX. I couldpossibly see extending that to the overall concept of "insn chainthings", but I think doing one for each subclass would probably be overkill.


   * I added an "rtx_real_insn" subclass for the INSN_P predicate, adding
     the idea of a PATTERN, a basic_block, and a location - but I hardly
     use this anywhere.  That said, it seems to be a real concept in the
     code, so I added it.

So DEBUG_INSN has a pattern and that's how it got into rtx_real_insn.Hmmm. Would agree it's a real concept in the code, but as much as Ihate bikeshedding on names, this is one that just doesn't fit well. Ifyou didn't use it much, maybe it's fixable without an enormous amount ofrenaming work.


     So in the following patches the pointerness is explicit: the patches
     refer to:
        rtx_insn *insn;
     rather than just:
        rtx_insn insn;
     and so one can write:
        const rtx_insn *insn
     and the "constness" applies to the insn, not to the pointer.

Seems like the right direction to me.


   * Should as_a <rtx_insn *> accept a NULL pointer?  It's possible to make
     the is_a_helper cope with NULL, but this adds an extra conditional.
     I instead added an as_a_nullable<> cast, so that you can explicitly
     add a check against NULL before checking the code of the rtx node.
     But maybe it's cleaner to simply have is_a_helper<rtx_insn *> cope
     with NULL?

I'd think not.


Some deeper questions:

   * should rtx_insn eventually be a separate class from rtx, separating
     insn chain nodes from rtx nodes?  I don't know if that's a worthwhile
     longterm goal, but this patch series gets us somewhere closer to
     being able to achieve that.  (actually getting there would be a much
     more invasive set of patches).

I think so and it generally matches what we've done with gimple. Onecould even argue that the insn chain itself really should just be astandard container class, but I think that's much further out.


To keep this reviewable, and to try to mitigate bitrot, I've chopped it up
into numerous relatively small patches.  The aim is that at every patch,
the code correctly builds on all supported configurations.  That said,
there's a complicated dependency graph of types in gcc's code, so to
tame that, the patch series is divided into 6 phases:

So from a staging standpoint I think you should install as we go ratherthan waiting for the entire set to be approved. My worry is that giventhe size, by the time we get it reviewed & updated, there'll have beenmore bitrot and you end up iterating quite a bit just because it takestime to fixup details in your patchkit and the code is continuallychanging under you.


   * phase 1 adds "scaffolding": in various places, strengthen the return
     types from internal APIs and macros so as to promise an rtx_insn *
     rather than a plain rtx.  For example, the DEP_PRO/DEP_CON macros
     in sched-int.h which lookup fields within struct _dep become inline
     functions that return rtx_insn * (using checked casts).  In this way,
     stronger type information can be used by subsequent patches whilst
     avoiding the chicken-and-egg issue since writes to the fields can
     still be arbitrary rtx nodes, according to the type system at least.

Right.


   * phase 2: an alphabetical tour of the backend: a series of patches,
     from alias.c through web.c, each patch touching one file,
     strengthening the types within it as much as we can at that point.
     The patches sometimes make use of the alphabetic ordering in order to
     use APIs that have already been strengthened to work on rtx_insn.

OK. Good to know that patch later patch in this part may depend on theAPI guarantees from an earlier patch.


   * phase 3: similar to phase 2, but for the various config
     subdirectories.

Right. I'd think in phase 3, each backend ought to be independent. Sovarious backend changes ought to be able to go in, even if there'ssomething contentious in a different config directory.

   * phase 4: tears down the scaffolding, replacing checked casts as much
     as possible by strengthening core fields of core types.  For example,
     we eventually reach the point in sched-int.h where the fields "pro"
     and "con" within struct _dep can become rtx_insn *, and hence
     DEP_PRO/DEP_CON can be converted back to plain macros, without needing
     the checked cast.

Right.


   * phase 5: all of the above was for instructions; this phase adds three
     subclasses for other node kinds.  I experimented with subclasses for
     various node kinds; these three seemed most appropriate: EXPR_LIST,
     INSN_LIST, SEQUENCE.  I kept these as a separate phase as Jeff asked
     me to separate them from the instruction patches, to avoid
     complicating things, but I think these three are also a worthwhile
     cleanup with relatively little complexity.

Ok.  Wished we'd talked more about this, but it's not the end of the world.


   * phase 6: (new since my Cauldron talk): this phase freely uses
     EXPR_LIST, INSN_LIST, SEQUENCE to do cleanups that were awkward
     without them.  In particular, by the end of this phase, NEXT_INSN()
     and PREV_INSN() require rtx_insn * rather than plain rtx.

Woo Wooo!

So out of curiosity, any thoughts on what other big things are out therethat need to be fixed. I'm keen to keep this stuff moving as much aspossible.


The patches (and my control for testing) has been on top of r211061, which
being from 2014-05-29 is now two months old, but hopefully is still
reviewable; naturally I'll perform bootstrap and regression testing for
each patch in turn before committing.

It's still reviewable, though you'll likely find some minor changes willbe needed just due to churn. The iterator/for_each_rtx work comesimmediately to mind, but I'm sure there'll be others.

A part of me really wonders if the 6 phases above should have beensubmitted independently. ie, once the scaffolding was done why not goahead and get that reviewed & installed, then move on to phase2 patches.I bring it up more for future work of a similar nature.

I realize that you had to do a fair amount of the later work to ensurethe scaffolding was right and so that we could see what the end resultwould likely look like. But something feels like it could be staged better.


Anyway...


Performance
===========

I tested the performance with --enable-checking=release using
two large files (kdecore.cc, bigcode.c), comparing a control build
to a patched build.

There were no significant differences in compilation time:

OK. A bit of a surprise, but then again perhaps not because we don'tenable RTL checking by default. And I suspect a goodly amount of theRTL checking overhead is in the operands, not on the chain itself.


As for the performance of a regular build i.e. with as_a<>
checks *enabled*; looking at the wallclock time taken for a bootstrap and
regression test, for my s390 builds (with -j3) I saw:

s390 control:
   "make" time: 68 mins
   "make check" time: 122 mins
   total time: 190 mins

s390 experiment:
   "make" time: 70 mins
   "make check" time: 126 mins
   total time: 196 mins

showing a 3% increase, presumably due to the as_a and as_a_nullable
checks.

You want to be real careful benchmarking on ppc/s390 hardware as they'repartitioned and what goes on in a different partition can affect yourpartition.


i.e. a release build shows no change in performance; a debug build shows
a 3% increase in time taken to bootstrap and regression test.  I believe
the debug build could be sped up with further patches to eliminate the
checked casts.

Yea. As I've mentioned, my vision is to try and push out the casts asmuch as reasonably possible.

Follow-Ups:
- Re: [PATCH 000/236] Introduce rtx subclasses
  - From: David Malcolm

References:
- [PATCH 000/236] Introduce rtx subclasses
  - From: David Malcolm

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]