[tree-tailcall] Check if function returns its argument
Jeff Law
law@redhat.com
Wed Dec 7 21:20:00 GMT 2016
On 12/06/2016 03:16 AM, Richard Biener wrote:
> On Tue, 6 Dec 2016, Richard Biener wrote:
>
>> On Mon, 5 Dec 2016, Jeff Law wrote:
>>
>>> On 12/02/2016 01:33 AM, Richard Biener wrote:
>>>>> The LHS on the assignment makes it easier to identify when a tail call is
>>>>> possible. It's not needed for correctness. Not having the LHS on the
>>>>> assignment just means we won't get an optimized tail call.
>>>>>
>>>>> Under what circumstances would the LHS possibly be removed? We know the
>>>>> return statement references the LHS, so it's not going to be something
>>>>> that DCE will do.
>>>>
>>>> Well, I thought Prathamesh added the optimization to copy-propagate
>>>> the lhs from the returned argument. So we'd have both transforms here.
>>> That seems like a mistake -- the fact that we can copy propagate the LHS from
>>> the returned argument is interesting, but in practice I've found it to not be
>>> useful to do so.
>>>
>>> The problem is it makes the value look live across the call and we're then
>>> dependent upon the register allocator to know the trick about the returned
>>> argument value and apply it consistently -- which it does not last I checked.
>>>
>>> I think we're better off leaving the call in the form of LHS = call () if the
>>> return value is used. That's going to be more palatable to tail calling.
>>
>> Yes, that's something I also raised earlier in the thread. Note that
>> any kind of value-numbering probably wants to know the equivalence
>> for simplifications but in the end wants to disable propagating the
>> copy (in fact it should propagate the return value from the point of
>> the call). I suppose I know how to implement that in FRE/PRE given it has
>> separate value-numbering and elimination phases. Something for GCC 8.
>
> The following does that (it shows we don't handle separating LHS
> and overall stmt effect very well). It optimizes a testcase like
>
> void *foo (void *p, int c, __SIZE_TYPE__ n)
> {
>   void *q = __builtin_memset (p, c, n);
>   if (q == p)
>     return p;
>   return q;
> }
>
> to
>
> foo (void * p, int c, long unsigned int n)
> {
>   void * q;
>
>   <bb 2> [0.0%]:
>   q_7 = __builtin_memset (p_3(D), c_4(D), n_5(D));
>   return q_7;
>
> in early FRE.
Yea. Not sure how often something like that would happen in practice,
but using the equivalence to simplify rather than for propagation seems
like the way to go.
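
To make the distinction concrete, here is a minimal sketch of the two source
forms being discussed. It is not taken from the patch; the function names
clear_propagated and clear_lhs are hypothetical, and whether a given GCC
version emits a tail call for either form depends on the version and flags.

```c
#include <stddef.h>
#include <string.h>

/* Form 1: the "memset returns its first argument" equivalence has been
   propagated, so the function returns p directly.  p is used after the
   call, which makes it look live across the call.  */
void *clear_propagated (void *p, size_t n)
{
  memset (p, 0, n);
  return p;
}

/* Form 2: LHS = call (); return LHS.  Nothing but the call's own result
   is needed afterwards, which is the shape tail-call detection
   recognizes most easily.  */
void *clear_lhs (void *p, size_t n)
{
  void *q = memset (p, 0, n);
  return q;
}
```

Both functions return the same pointer; the only difference is which value
the return statement names, and so what has to stay live across the call.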
I keep thinking about doing something similar in DOM, but haven't gotten
around to seeing what the fallout would be.
jeff