This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug tree-optimization/80155] [7 regression] Performance regression with code hoisting enabled


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #15 from Thomas Preud'homme <thopre01 at gcc dot gnu.org> ---
(In reply to Thomas Preud'homme from comment #14)
> (In reply to rguenther@suse.de from comment #13)
> > On Thu, 23 Mar 2017, thopre01 at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155
> > > 
> > > --- Comment #12 from Thomas Preud'homme <thopre01 at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #9)
> > > > Ah, the patches do not fix the testcase because the testcase is _not_ the
> > > > PRE-creates-IV case.  It's indeed simply hoisting/PRE at work transforming
> > > > 
> > > >   # a_14 = PHI <a_10(, a_5(D)>
> > > >   if (!b)
> > > >     a_8 = a_14 + 1;
> > > > 
> > > >   # a_2 = PHI <a_14(10), a_8(4)>
> > > >   a_10 = a_2 + 1;
> > > >   ... = *(a_2 + 1);
> > > > 
> > > > to
> > > > 
> > > >   # a_14 = PHI <prephimp_12, a_5(D)>
> > > >   _4 = a_14 + 1;
> > > >   if (b)
> > > >     _3 = _4 + 1;
> > > > 
> > > >   # a_2 = PHI <a_14, _4>
> > > >   # prephitmp_12 = PHI <_4, _3>
> > > >   ... = *(a_2 + 1);
> > > > 
> > > > increasing register pressure mainly because nothing figures that a_2 + 1
> > > > in the dereference can be replaced by prephitmp_12 ...
> > > > 
> > > > So this is a missed SLSR opportunity or, in this simple form, a missed
> > > > PRE/CSE opportunity.  Fixing that with the following (otherwise untested)
> > > > restores good code generation for the testcase:
> > > > 
> > > > Index: gcc/tree-ssa-pre.c
> > > > ===================================================================
> > > > --- gcc/tree-ssa-pre.c  (revision 246414)
> > > > +++ gcc/tree-ssa-pre.c  (working copy)
> > > > @@ -4636,6 +4610,35 @@ eliminate_dom_walker::before_dom_childre
> > > >             }
> > > >         }
> > > >  
> > > > +      if (gimple_has_ops (stmt))
> > > > +       for (unsigned i = 0; i < gimple_num_ops (stmt); ++i)
> > > > +         {
> > > > +           tree op = gimple_op (stmt, i);
> > > > +           if (op)
> > > > +             op = get_base_address (op);
> > > > +           if (op
> > > > +               && TREE_CODE (op) == MEM_REF
> > > > +               && ! integer_zerop (TREE_OPERAND (op, 1)))
> > > > +             {
> > > > +               tree ops[2];
> > > > +               vn_nary_op_t nary;
> > > > +               ops[0] = TREE_OPERAND (op, 0);
> > > > +               ops[1] = TREE_OPERAND (op, 1);
> > > > +               tree res = vn_nary_op_lookup_pieces (2, POINTER_PLUS_EXPR,
> > > > +                                                    TREE_TYPE (ops[0]),
> > > > +                                                    ops, &nary);
> > > > +               if (res && TREE_CODE (res) == SSA_NAME)
> > > > +                 res = eliminate_avail (res);
> > > > +               if (res)
> > > > +                 {
> > > > +                   TREE_OPERAND (op, 0) = res;
> > > > +                   TREE_OPERAND (op, 1)
> > > > +                     = build_int_cst (TREE_TYPE (TREE_OPERAND (op, 1)), 0);
> > > > +                   gimple_set_modified (stmt, true);
> > 
> > Add
> > 
> >                     if (TREE_CODE (res) == SSA_NAME
> >                         && !is_gimple_debug (stmt))
> >                       gimple_set_plf (SSA_NAME_DEF_STMT (res),
> >                                       NECESSARY, true);
> > 
> > here.
> > 
> > > > +                 }
> > > > +             }
> > > > +         }
> > > > +
> > > >        if (gimple_modified_p (stmt))
> > > >         {
> > > >           /* If a formerly non-invariant ADDR_EXPR is turned into an
> > > > 
> > > > note that in general optimzing
> > > > 
> > > >    q = p + 1;
> > > >      = ...*(p + 1);
> > > > 
> > > > "back" to *q will be undone by forwprop/stmt folding later but in this
> > > > case the feeding stmt is a PHI node and not a pointer-plus.  It still
> > > > means that the change might be a bit too disruptive at this point
> > > > (we could restricit it a bit by only handling the case where we don't
> > > > replace with a pointer-plus result).
> > > 
> > > Thanks for your work on this! Sadly GCC ICEs with this patch:
> > > 
> > > 0xd36f53 update_dep_bb
> > >         gcc/tree-ssa-tail-merge.c:411
> > > 0xd38f54 stmt_update_dep_bb
> > >         gcc/tree-ssa-tail-merge.c:429
> > > 0xd38f54 same_succ_hash
> > >         gcc/tree-ssa-tail-merge.c:452
> > > 0xd38f54 find_same_succ_bb
> > >         gcc/tree-ssa-tail-merge.c:717
> > > 0xd39927 find_same_succ
> > >         gcc/tree-ssa-tail-merge.c:748
> > > 0xd39927 init_worklist
> > >         gcc/tree-ssa-tail-merge.c:767
> > > 0xd39927 tail_merge_optimize(unsigned int)
> > >         gcc/tree-ssa-tail-merge.c:1726
> > > 0xce2d6a execute
> > >         gcc/tree-ssa-pre.c:5164
> > > 
> > >
> 
> There's progress. Performance is improved but not as much as disabling code
> hoisting. I'll try to reduce the testcase again with that patch to see if I
> can find a testcase that expose all issues.

Funnily this led back to the Cortex-M0+ reduced testcase. With the patch in
comment #13 applied we can still see a difference in the push (one register
pushed Vs 0).

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]