This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Eliminates phi on branch for CMP result
Hi,
Thanks Richard, Segher, Andrew and all.
Segher Boessenkool <segher@kernel.crashing.org> writes:
> Let me try to answer some of this...
>
> On Tue, May 07, 2019 at 03:31:27PM +0200, Richard Biener wrote:
>> On Mon, 6 May 2019, Jiufu Guo wrote:
>> > This patch implements the optimization in PR77820. The optimization
>> > eliminates phi and phi's basic block, if the phi is used only by
>> > condition branch, and the phi's incoming value in the result of a
>> > CMP result.
>
>> I'm not sure I like a new pass here ;) The transform is basically
>> tail-duplicating the PHI block because the exit conditional can
>> be "simplfied" - that's something jump threading generally does
>> though it relies on "simplified" being a little bit more simplified
>> than above.
>
Adding a new pass is just because it may be relatively easy to extend
and maintain.
Adding this micor-optimization into other passes is also a good
sugguestion.
- Similar with jump threading, this new transform is replacing jump
old destination with new destination. While for this case, the new
destination can not be determined at compile time.
- And as Andrew said: forwprop/backprop are similar, but this case is in
opposite: it is spread common code(phi+gcond) into different preds.
- And there is phiopt pass which focus on making 'phi' stmt better.
it also do some similar thing:
bb0:
if (a != b) goto bb2; else goto bb1;
bb1:
bb2:
x = PHI <a (bb1), b (bb0), ...>;
tranform to:
bb0:
bb2:
x = PHI <b (bb0), ...>;
If I avoid to add new pass, any suggustions about which exiting pass
(jumpthreading/forwprop/phiopt/...) would be more suitable to extend to
support this new transform?
> Right, but where in the pipeline does this fit in?
>
>> I suspect this transform was implemented because of some benchmark?
>
> Something in SPEC... 2006 iirc... Will need to dig it up, I forgot
> the details.
>
>> I suspect the performance benefit is because of better branch
>> prediction by not mangling both conditional branches into one?
>
> No, it is that previously a condition was moved to a GPR, and then compared
> again. See PR77820. This is expensive, and serial, too.
>
This transform focusing on eliminating additional GPR saving and
additional compare, then it would helps a lot if the comparasion in in
hot path.
>> The transform is also somewhat similar to tail-duplication done
>> in path splitting or tracer.
>
> Yes.
>
>> The pass itself does quite strict pattern-matching but I wonder
>> if more cases should be handled this way.
>
> Maybe. Probably. But which?
Currently this pass handles the case if there is only one PHI and the
phi is used by gcond. If there are other suitable cases, we can just
extend this tranform.
>
>> Any specific reason for the pass placement between PRE and sinking?
>> tracer and path splitting run much later, jump threading runs
>> all over the place.
>
> Dunno. Jiufu, does the pass placement matter much?
This pass would just need be inserted after ssa, and before the last dce
and forwprop which could help to eliminate temp conversion from bool to
int:
_1 = a_8(D) < b_9(D);
t_14 = (int) _1;
if (_1 != 0)
Putting it after PRE is just because some case is better be handled by PREx,
like below case: if (x) t = a < b; else t = a < b; if (t) xx.
While actually, this pass is ok to insert at other placement.
>
>
> Segher