This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [gimple-classes, committed 4/6] tree-ssa-tail-merge.c: Use gassign

On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> > To be constructive here - the above case is from within a
> > GIMPLE_ASSIGN case label
> > and thus I'd have expected
> > 
> >     case GIMPLE_ASSIGN:
> >       {
> >         gassign *a1 = as_a <gassign *> (s1);
> >         gassign *a2 = as_a <gassign *> (s2);
> >       lhs1 = gimple_assign_lhs (a1);
> >       lhs2 = gimple_assign_lhs (a2);
> >       if (TREE_CODE (lhs1) != SSA_NAME
> >           && TREE_CODE (lhs2) != SSA_NAME)
> >         return (operand_equal_p (lhs1, lhs2, 0)
> >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> >                                                  gimple_assign_rhs1 (a2)));
> >       else if (TREE_CODE (lhs1) == SSA_NAME
> >                && TREE_CODE (lhs2) == SSA_NAME)
> >         return vn_valueize (lhs1) == vn_valueize (lhs2);
> >       return false;
> >       }
> > 
> > instead.  That's the kind of changes I have expected and have approved of.
> But even that looks like just adding extra work for all developers, with no
> gain.  You only have to add extra code and extra temporaries, in switches
> typically also have to add {} because of the temporaries and thus extra
> indentation level, and it doesn't simplify anything in the code.

The branch attempts to use the C++ typesystem to capture information
about the kinds of gimple statement we expect, both:
  (A) so that the compiler can detect type errors, and
  (B) as a comprehension aid to the human reader of the code

The ideal here is when function params and struct field can be
strengthened from "gimple" to a subclass ptr.  This captures the
knowledge that every use of a function or within a struct has a given
gimple code.

Examples of this for the initial patchkit were:

* the "call_stmt" field of a cgraph_edge becoming a gcall *,
  rather than a plain gimple.

* various variables in tree-into-ssa.c change from just
  vec<gimple> to being vec<gphi *>, capturing the "phi-ness" of
  the contents as a compile-time check

* tree-inline.h's struct copy_body_data, the field "debug_stmts"
  can be "concretized" from a vec<gimple> to a vec<gdebug *>.

A more recent example, from:;a=commitdiff;h=5bd16d92b9e928b5a5a7aebd571d92f72dd517f8
The fields "arr_ref_first" and "arr_ref_last" of
tree-switch-conversion.c's struct switch_conv_info can be strengthened
from gimple to gassign *: they are always GIMPLE_ASSIGN.

I applied cleanups to do my initial patchkit, which Jeff approved (with
some provisos), and which became the first 92 commits on the branch:

"[gimple-classes, committed 00/92] Initial slew of commits":

followed by a merger from trunk into the branch:
"[gimple-classes] Merge trunk r216157-r216746 into branch":

With those commits, I was able to convert 180 accessors to taking a
concrete subclass, with 158 left taking a gimple or
const_gimple i.e. about half of them.
(My script to analyze this is

I got it into my head that it was desirable to convert *all*
gimple accessors to this form, and to eliminate the GIMPLE_CHECK
macros (given that gcc development community seems to dislike
partial transitions).

I've been attempting this full conversion - convert all of the
gimple_ accessors, to require an appropriate gimple subclass
ptr, in particular focusing on the gimple_assign_ ones, but it's a *lot*
of extra work, and much more invasive than the patches
that Jeff conditionally approved.

I now suspect that it's going too far - in the initial patchkit I was
doing the clean, obvious ones, but now I'm left with the awkward ones
that would require me to uglify the code to "fix".

If it's OK to only convert some of them, then I'd rather just do that.

The type-strengthening is rarely as neat as being able to simply convert
a param or field type.  Some examples:

Functions passed a gsi
Sometimes functions are passed a gsi, where it can be known that the gsi
currently references a stmt of known kind (although that isn't
necessarily obvious from reading the body of the function):

Example from tree-ssa-strlen.c:

handle_char_store (gimple_stmt_iterator *gsi)
   int idx = -1;
   strinfo si = NULL;
-  gimple stmt = gsi_stmt (*gsi);
+  gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi));
   tree ssaname = NULL_TREE, lhs = gimple_assign_lhs (stmt);
   if (TREE_CODE (lhs) == MEM_REF


Only acting on one kind of gimple

Some functions accept any kind of gimple, but only act on e.g. a
GIMPLE_ASSIGN, immediately returning if they got a different kind.

So I make this kind of change, where:

  foo (gimple stmt, other params)
    if (!is_gimple_assign (stmt))

    use_various gimple_assign_accessors (stmt);


  foo (gimple gs, other params)
    gassign *stmt = dyn_cast <gassign *> (gs);
    if (!stmt)

    use_various gimple_assign_accessors (stmt);

renaming the param to "gs" to avoid a mass rename of "stmt".
Example tree-ssa-sink.c (statement_sink_location):;a=commitdiff;h=da19cf71e540f52a924d0985efdacd9e03684a6e

Suites where we know the type of a statement

If we have:

   if (is_gimple_assign (stmt))
        /* 20 lines of logic, with numerous gimple_assign_ accessors
           on "stmt". */ 

then I've been converting it to:

   if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
        /* rename the "stmt" to "assign_stmt", wrapping lines as
           needed. */ 

Is "assign_stmt" too long here?  It's handy to be able to distinguish
the meaning of the stmt e.g. "use_stmt", "def_stmt" can have synonyms
"use_assign" and "def_assign" for the regions where they're known to be

The above is overkill for a 1-liner e.g.:

   if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
     expr = gimple_get_lhs (stmt);

It could be converted to:

   if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
     expr = gimple_assign_get_lhs (assign_stmt);

or to:

   if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
     expr = gimple_assign_get_lhs (as_a <gassign *> (stmt));

or left as-is if we don't want gimple_assign_get_lhs to require a
gassign *.

Casting functions

Taking RTL's single_set as inspiration, I converted the predicates:
to return a gassign * rather than a bool, so that they can be used as
casting functions:;a=commitdiff;h=06a4829d92f383a0fc1e9d488f1634d3764a0171

(also burning a register, since without LTO or some attribute the
compiler can't know the invariant that if non-NULL, then the return
value == the param (albeit with a cast) - do we have a function
attribute for that?).

Example of use, from tree-ssa-pre.c

@@ -4040,6 +4041,7 @@ eliminate_dom_walker::before_dom_children (basic_block b)
       tree sprime = NULL_TREE;
       gimple stmt = gsi_stmt (gsi);
       tree lhs = gimple_get_lhs (stmt);
+      gassign *assign_stmt;
       if (lhs && TREE_CODE (lhs) == SSA_NAME
          && !gimple_has_volatile_ops (stmt)
          /* See PR43491.  Do not replace a global register variable when
@@ -4049,10 +4051,10 @@ eliminate_dom_walker::before_dom_children (basic_block b)
             ???  The fix isn't effective here.  This should instead
             be ensured by not value-numbering them the same but treating
             them like volatiles?  */
-         && !(gimple_assign_single_p (stmt)
-              && (TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
-                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))
-                  && is_global_var (gimple_assign_rhs1 (stmt)))))
+         && !((assign_stmt = gimple_assign_single_p (stmt))
+              && (TREE_CODE (gimple_assign_rhs1 (assign_stmt)) == VAR_DECL
+                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (assign_stmt))
+                  && is_global_var (gimple_assign_rhs1 (assign_stmt)))))

(followed by lots of renaming of "stmt" to "assign_stmt" within the
guarded scope; this is from:;a=commitdiff;h=18277e45b407ddd9f15012b48caf7403354ebaec

More awkward cases
e.g. code that can work on both GIMPLE_CALL and GIMPLE_ASSIGN; once
we've dealt with GIMPLE_CALL, we can add:

  gassign *assign_stmt = as_a <gassign *> (stmt);

and have the rest of the code work on "assign_stmt".


Current status
I currently have 261 accessors converted, with 75 to go i.e.
about 75%, as compared to the 50% from the original patchkit.
I believe that the additional 50->75% work is relatively clean, but I'm
hitting a wall where the final 25% is requiring
much more invasive work, of the kind objected to earlier in this thread.
I'm having to do uglier and uglier patches.

How to proceed?

I like the initial work I did, the:
  "[gimple-classes, committed 00/92] Initial slew of commits":
work, but I agree that the work on the branch after that is ugly in
places, and the volume of the work may resemble a DoS attack on the
reviewers - sorry.  (to answer a question on IRC, those followup commits
*were* handwritten, not autogenerated).

Is it acceptable to have the partial transition of strengthening the
types of only 50% or 75% of the accessors?

I'd like to apply those first commits to trunk (applying necessary
merger work), but I know we're approaching the close of stage 1 for gcc

I can try to identify a subset of patches I think are likely to be more
acceptable if this work is still viable, and maybe put them on a
different git branch.  I hope I can get close to the 75% mark without
too much ugliness and verbosity.



(fwiw, I'm getting rather sick of refactoring, and keen to focus on
user-visible work for gcc 6)

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]