[tree-tailcall] Check if function returns it's argument
Richard Biener
rguenther@suse.de
Tue Dec 6 10:16:00 GMT 2016
On Tue, 6 Dec 2016, Richard Biener wrote:
> On Mon, 5 Dec 2016, Jeff Law wrote:
>
> > On 12/02/2016 01:33 AM, Richard Biener wrote:
> > > > The LHS on the assignment makes it easier to identify when a tail call is
> > > > possible. It's not needed for correctness. Not having the LHS on the
> > > > assignment just means we won't get an optimized tail call.
> > > >
> > > > Under what circumstances would the LHS possibly be removed? We know the
> > > > return statement references the LHS, so it's not going to be something
> > > > that
> > > > DCE will do.
> > >
> > > Well, I thought Prathamesh added the optimization to copy-propagate
> > > the lhs from the returned argument. So we'd have both transforms here.
> > That seems like a mistake -- the fact that we can copy propagate the LHS from
> > the returned argument is interesting, but in practice I've found it to not be
> > useful to do so.
> >
> > The problem is it makes the value look live across a the call and we're then
> > dependent upon the register allocator to know the trick about the returned
> > argument value and apply it consistently -- which it does not last I checked.
> >
> > I think we're better off leaving the call in the form of LHS = call () if the
> > return value is used. That's going to be more palatable to tail calling.
>
> Yes, that's something I also raised earlier in the thread. Note that
> any kind of value-numbering probably wants to know the equivalence
> for simplifications but in the end wants to disable propagating the
> copy (in fact it should propagate the return value from the point of
> the call). I suppose I know how to implement that in FRE/PRE given it has
> separate value-numbering and elimination phases. Something for GCC 8.
The following does that (it shows we don't handle separating LHS
and overall stmt effect very well). It optimizes a testcase like
void *foo (void *p, int c, __SIZE_TYPE__ n)
{
void *q = __builtin_memset (p, c, n);
if (q == p)
return p;
return q;
}
to
foo (void * p, int c, long unsigned int n)
{
void * q;
<bb 2> [0.0%]:
q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D));
return q_7;
in early FRE.
Index: gcc/tree-ssa-pre.c
===================================================================
--- gcc/tree-ssa-pre.c (revision 243282)
+++ gcc/tree-ssa-pre.c (working copy)
@@ -4065,10 +4068,11 @@ eliminate_avail (tree op)
tree valnum = VN_INFO (op)->valnum;
if (TREE_CODE (valnum) == SSA_NAME)
{
+ if (el_avail.length () > SSA_NAME_VERSION (valnum)
+ && el_avail[SSA_NAME_VERSION (valnum)])
+ return el_avail[SSA_NAME_VERSION (valnum)];
if (SSA_NAME_IS_DEFAULT_DEF (valnum))
return valnum;
- if (el_avail.length () > SSA_NAME_VERSION (valnum))
- return el_avail[SSA_NAME_VERSION (valnum)];
}
else if (is_gimple_min_invariant (valnum))
return valnum;
@@ -4263,6 +4275,14 @@ eliminate_dom_walker::before_dom_childre
eliminate_push_avail (sprime);
}
+ /* While we might have value numbered a call return value to
+ one of its arguments the call itself might not be solely
+ represented by its return value. Thus do not ignore
+ side-effects indicated by a VARYING vdef. */
+ if (gimple_vdef (stmt)
+ && VN_INFO (gimple_vdef (stmt))->valnum == gimple_vdef
(stmt))
+ sprime = NULL_TREE;
+
/* If this now constitutes a copy duplicate points-to
and range info appropriately. This is especially
important for inserted code. See tree-ssa-copy.c
Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c (revision 243282)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -3951,6 +4061,25 @@ visit_use (tree use)
changed = defs_to_varying (call_stmt);
goto done;
}
+
+ int rflags = gimple_call_return_flags (call_stmt);
+ if (rflags & ERF_RETURNS_ARG)
+ {
+ unsigned argnum = rflags & ERF_RETURN_ARG_MASK;
+ if (argnum < gimple_call_num_args (call_stmt))
+ {
+ tree arg = gimple_call_arg (call_stmt, argnum);
+ if (TREE_CODE (arg) == SSA_NAME
+ || is_gimple_min_invariant (arg))
+ {
+ changed = visit_copy (lhs, arg);
+ if (gimple_vdef (call_stmt))
+ changed |= set_ssa_val_to (gimple_vdef
(call_stmt),
+ gimple_vdef
(call_stmt));
+ goto done;
+ }
+ }
+ }
}
/* Pick up flags from a devirtualization target. */
> > > Of course as always the user could have written the code in this way.
> > >
> > > If the LHS is not required for correctness then I don't think we need
> > > to put it there - Pratamesh verified the call is tail-called already
> > > if marked by the tailcall pass, even if the LHS is not present.
> > But if the function returns the value from the tail call, then going through
> > an LHS is the right thing to do. Using the magic "argX will be the return
> > value" seems clever, but actually hurts in practice.
>
> So we do want the reverse transform (for the case of returning by
> reference that's going to be tricky if not impossible due to the
> IL hygiene we enforce).
>
> Richard.
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
More information about the Gcc-patches
mailing list