[tree-tailcall] Check if function returns its argument

Jeff Law law@redhat.com
Wed Dec 7 21:20:00 GMT 2016


On 12/06/2016 03:16 AM, Richard Biener wrote:
> On Tue, 6 Dec 2016, Richard Biener wrote:
>
>> On Mon, 5 Dec 2016, Jeff Law wrote:
>>
>>> On 12/02/2016 01:33 AM, Richard Biener wrote:
>>>>> The LHS on the assignment makes it easier to identify when a tail call is
>>>>> possible.  It's not needed for correctness.  Not having the LHS on the
>>>>> assignment just means we won't get an optimized tail call.
>>>>>
>>>>> Under what circumstances would the LHS possibly be removed?  We know the
>>>>> return statement references the LHS, so it's not going to be something
>>>>> that DCE will do.
>>>>
>>>> Well, I thought Prathamesh added the optimization to copy-propagate
>>>> the lhs from the returned argument.  So we'd have both transforms here.
>>> That seems like a mistake -- the fact that we can copy propagate the LHS from
>>> the returned argument is interesting, but in practice I've found it to not be
>>> useful to do so.
>>>
>>> The problem is it makes the value look live across the call and we're then
>>> dependent upon the register allocator to know the trick about the returned
>>> argument value and apply it consistently -- which it does not last I checked.
>>>
>>> I think we're better off leaving the call in the form of LHS = call () if the
>>> return value is used.  That's going to be more palatable to tail calling.
>>
>> Yes, that's something I also raised earlier in the thread.  Note that
>> any kind of value-numbering probably wants to know the equivalence
>> for simplifications but in the end wants to disable propagating the
>> copy (in fact it should propagate the return value from the point of
>> the call).  I suppose I know how to implement that in FRE/PRE given it has
>> separate value-numbering and elimination phases.  Something for GCC 8.
>
> The following does that (it shows we don't handle separating LHS
> and overall stmt effect very well).  It optimizes a testcase like
>
> void *foo (void *p, int c, __SIZE_TYPE__ n)
> {
>   void *q = __builtin_memset (p, c, n);
>   if (q == p)
>     return p;
>   return q;
> }
>
> to
>
> foo (void * p, int c, long unsigned int n)
> {
>   void * q;
>
>   <bb 2> [0.0%]:
>   q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D));
>   return q_7;
>
> in early FRE.
Yea.  Not sure how often something like that would happen in practice, 
but using the equivalence to simplify rather than for propagation seems 
like the way to go.
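
To make the distinction concrete, here is a minimal sketch in C of the two
source-level shapes under discussion.  The function names are made up for
illustration.  The snippet only shows that both shapes return the same
pointer; it does not demonstrate what any particular GCC version emits for
either one.

```c
#include <string.h>

/* Shape 1: the call keeps its LHS and the return uses that LHS.
   This mirrors the simplified form in the FRE dump quoted above
   (q = memset (...); return q;).  */
void *fill_keep_lhs (void *p, int c, size_t n)
{
  void *q = memset (p, c, n);
  return q;
}

/* Shape 2: what copy-propagating the returned argument would give.
   The result of the call is ignored and p is returned instead, so p
   has to stay live across the call.  */
void *fill_propagated (void *p, int c, size_t n)
{
  memset (p, c, n);
  return p;
}
```

Since memset returns its first argument, both functions hand back the
pointer they were given.  The equivalence q == p is what lets the
comparison in the testcase fold away; the point made above is that it
should be used for that simplification without rewriting shape 1 into
shape 2.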

I keep thinking about doing something similar in DOM, but haven't gotten 
around to seeing what the fallout would be.

jeff


