This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Extend tree-call-cdce to calls whose result is used
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Michael Matz <matz at suse dot de>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 13 Nov 2015 13:12:10 +0000
- Subject: Re: Extend tree-call-cdce to calls whose result is used
- Authentication-results: sourceware.org; auth=none
- References: <874mgyorwm dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot LSU dot 2 dot 20 dot 1511091509260 dot 11029 at wotan dot suse dot de> <87r3jznqvg dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot LSU dot 2 dot 20 dot 1511091652400 dot 11029 at wotan dot suse dot de> <87oaf3m4p3 dot fsf at e105548-lin dot cambridge dot arm dot com> <alpine dot LSU dot 2 dot 20 dot 1511092122060 dot 11029 at wotan dot suse dot de> <CAFiYyc1A2hbvFa0OYmLzRRmDbzaB6+o9iRfdWX_POW-aHJF1DA at mail dot gmail dot com>
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Nov 9, 2015 at 10:03 PM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Mon, 9 Nov 2015, Richard Sandiford wrote:
>>
>>> +static bool
>>> +can_use_internal_fn (gcall *call)
>>> +{
>>> + /* Only replace calls that set errno. */
>>> + if (!gimple_vdef (call))
>>> + return false;
>>
>> Oh, I managed to confuse this in my head while reading the patch. So,
>> hmm, you don't actually replace the builtin with an internal function
>> (without the condition) under no-errno-math? Does something else do that?
>> Because otherwise that seems an unnecessary restriction?
>>
>>> >> r229916 fixed that for the non-EH case.
>>> >
>>> > Ah, missed it. Even the EH case shouldn't be difficult. If the
>>> > original dominator of the EH destination was the call block it moves,
>>> > otherwise it remains unchanged.
>>>
>>> The target of the edge is easy in itself, I agree, but that isn't
>>> necessarily the only affected block, if the EH handler doesn't
>>> exit or rethrow.
>>
>> You're worried the non-EH and the EH regions merge again, right? Like so:
>>
>> before change:
>>
>> BB1: throwing-call
>> fallthru/ \EH
>> BB2 BBeh
>> | /....\ (stuff in EH-region)
>> | / some path out of EH region
>> | /------------------/
>> BB3
>>
>> Here, BB3 must at least be dominated by BB1 (the throwing block), or by
>> something further up (when there are other side-entries to the path
>> BB2->BB3 or into the EH region). When further up, nothing changes, when
>> it's BB1, then it's afterwards dominated by the BB containing the
>> condition. So everything with idom==BB1 gets idom=Bcond, except for BBeh,
>> which gets idom=Bcall. Depending on how you split BB1, either Bcond or
>> BBcall might still be BB1 and doesn't lead to changes in the dom tree.
>>
>>> > Currently we have quite some of such passes (reassoc, forwprop,
>>> > lower_vector_ssa, cse_reciprocals, cse_sincos (sigh!), optimize_bswap
>>> > and others), but they are all handling only special situations in one
>>> > way or the other. pass_fold_builtins is another one, but it seems
>>> > most related to what you want (replacing a call with something else),
>>> > so I thought that'd be the natural choice.
>>>
>>> Well, to be pedantic, it's not really replacing the call. Except for
>>> the special case of targets that support direct assignments to errno,
>>> it keeps the original call but ensures that it isn't usually executed.
>>> From that point of view it doesn't really seem like a fold.
>>>
>>> But I suppose that's just naming again :-). And it's easily solved with
>>> s/fold/rewrite/.
>>
>> Exactly, in my mind pass_fold_builtin (like many of the others I
>> mentioned) doesn't do folding but rewriting :)
>
> So I am replying here to the issue of where to do the transform call_cdce
> does and the one Richard wants to add. For example we "lower"
> posix_memalign as early as GIMPLE lowering (that's before CFG construction).
> We also lower sincos to cexpi during GENERIC folding (or if that is dropped
> either GIMPLE lowering or GIMPLE folding during gimplification would be
> appropriate).
>
> Now, with offloading we have to avoid creating target dependencies before
> LTO stream-out (thus no IFN replacements before that - not sure if
> Richards patches have an issue there already).
No, this patch was the earliest point at which we converted to internal
functions. The idea was to make code treat ECF_PURE built-in functions
and internal functions as being basically equivalent. There's therefore
not much benefit to doing a straight replacement of one with the other
during either GENERIC or gimple. Instead the series only used internal
functions for things that built-in functions couldn't do, specifically:
- the case used in this patch, to optimise part of a non-pure built-in
function using a pure equivalent.
- vector versions of built-in functions.
The cfgexpand patch makes sure that pure built-in functions are expanded
like internal functions where possible.
> Which would leave us with a lowering stage early in the main
> optimization pipeline - I think fold_builtins pass is way too late but
> any "folding" pass will do (like forwprop or backprop where the latter
> might be better because it might end up computing FP "ranges" to
> improve the initial lowering code).
This isn't at all related to what backprop is doing though.
backprop is about optimising definitions based on information
about all uses.
Does fold_builtins need to be as late as it is?
> Of course call_cdce is as good as long as it still exists.
Does this meann that you're not against the patch in principle
(i.e. keeping call_cdce for now and extending it in the way that
this patch does)?
Thanks,
Richard