This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Extend tree-call-cdce to calls whose result is used


On Mon, Nov 9, 2015 at 10:03 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 9 Nov 2015, Richard Sandiford wrote:
>
>> +static bool
>> +can_use_internal_fn (gcall *call)
>> +{
>> +  /* Only replace calls that set errno.  */
>> +  if (!gimple_vdef (call))
>> +    return false;
>
> Oh, I managed to confuse this in my head while reading the patch.  So,
> hmm, you don't actually replace the builtin with an internal function
> (without the condition) under no-errno-math?  Does something else do that?
> Because otherwise that seems an unnecessary restriction?
>
>> >> r229916 fixed that for the non-EH case.
>> >
>> > Ah, missed it.  Even the EH case shouldn't be difficult.  If the
>> > original dominator of the EH destination was the call block it moves,
>> > otherwise it remains unchanged.
>>
>> The target of the edge is easy in itself, I agree, but that isn't
>> necessarily the only affected block, if the EH handler doesn't
>> exit or rethrow.
>
> You're worried the non-EH and the EH regions merge again, right?  Like so:
>
> before change:
>
>         BB1: throwing-call
>      fallthru/           \EH
>     BB2                   BBeh
>      |                   /....\ (stuff in EH-region)
>      |                     /    some path out of EH region
>      | /------------------/
>     BB3
>
> Here, BB3 must at least be dominated by BB1 (the throwing block), or by
> something further up (when there are other side-entries to the path
> BB2->BB3 or into the EH region).  When further up, nothing changes, when
> it's BB1, then it's afterwards dominated by the BB containing the
> condition.  So everything with idom==BB1 gets idom=Bcond, except for BBeh,
> which gets idom=Bcall.  Depending on how you split BB1, either Bcond or
> BBcall might still be BB1 and doesn't lead to changes in the dom tree.
>
>> > Currently we have quite some of such passes (reassoc, forwprop,
>> > lower_vector_ssa, cse_reciprocals, cse_sincos (sigh!), optimize_bswap
>> > and others), but they are all handling only special situations in one
>> > way or the other.  pass_fold_builtins is another one, but it seems
>> > most related to what you want (replacing a call with something else),
>> > so I thought that'd be the natural choice.
>>
>> Well, to be pedantic, it's not really replacing the call.  Except for
>> the special case of targets that support direct assignments to errno,
>> it keeps the original call but ensures that it isn't usually executed.
>> From that point of view it doesn't really seem like a fold.
>>
>> But I suppose that's just naming again :-).  And it's easily solved with
>> s/fold/rewrite/.
>
> Exactly, in my mind pass_fold_builtin (like many of the others I
> mentioned) doesn't do folding but rewriting :)

So I am replying here to the issue of where to do the transform call_cdce
does and the one Richard wants to add.  For example we "lower"
posix_memalign as early as GIMPLE lowering (that's before CFG construction).
We also lower sincos to cexpi during GENERIC folding (or if that is dropped
either GIMPLE lowering or GIMPLE folding during gimplification would be
appropriate).

Now, with offloading we have to avoid creating target dependencies before
LTO stream-out (thus no IFN replacements before that - not sure if
Richards patches have an issue there already).  Which would leave us
with a lowering stage early in the main optimization pipeline - I think
fold_builtins pass is way too late but any "folding" pass will do
(like forwprop or backprop where the latter might be better because
it might end up computing FP "ranges" to improve the initial lowering code).

Of course call_cdce is as good as long as it still exists.

>> > call_cdce is also such a pass, but I think it's simply not the
>> > appropriate one (only in so far as its source file contains the helper
>> > routines you need), and in addition I think it shouldn't exist at all
>> > (and wouldn't need to if it had been part of DCE from the start, or if
>> > you implemented the conditionalizing as part of another pass).  Hey,
>> > you could be one to remove a pass! ;-)
>>
>> It still seems a bit artificial to me to say that the transformation
>> with a null lhs is "DCE enough" to go in the main DCE pass (even though
>> like I say it doesn't actually eliminate any code from the IR, it just
>> adds more code) and should be kept in a separate pass from the one that
>> does the transformation on a non-null lhs.
>
> Oh, I agree, I might not have been clear: I'm not arguing that the normal
> DCE should now be changed to do the conditionalizing when it removes an
> call LHS; I was saying that it _would_ have been good instead of adding
> the call_cdce pass in the past, when it was for DCE purposes only.

Yes, I also argued that.

> But
> now your proposal is on the plate, namely doing the conditionalizing also
> with an LHS.  So that conditionalizing should take place in some rewriting
> pass (and ideally not call_cdce), no matter the LHS, and normal DCE not be
> changed (it will still remove LHSs of non-removable calls, just that those
> then are sometimes under a condition, when DCE runs after the rewriting).

Indeed.  How difficult is to integrate all this with backprop?

Thanks,
Richard.

>
> Ciao,
> Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]