[PATCH/RFC] Simplify wrapped RTL op

Tue Aug 27 10:50:00 GMT 2019

On Tue, Aug 27, 2019 at 11:12:32AM +0200, Robin Dapp wrote:
> as announced in the wrapped-binop gimple patch mail, on s390 we still
> emit odd code in front of loops:

>    aghi    %r1,-8
>    srlg    %r1,%r1,3
>    aghi    %r1,1

The code is emitted this way because %r1 might be 0, in which case the
subtraction wraps around.
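
To make that concrete, here is a small sketch in C.  It assumes the
three insns operate on an unsigned 64-bit value with wrapping
arithmetic (aghi as a 64-bit add, srlg as a 64-bit logical shift); the
function names are made up for illustration and are not from the patch.

```c
#include <stdint.h>

/* Models the quoted sequence: ((n - 8) >> 3) + 1.
   Unsigned arithmetic, so n - 8 wraps when n < 8.  */
static uint64_t
count_as_emitted (uint64_t n)
{
  return ((n - 8) >> 3) + 1;
}

/* The form one would like to simplify to: n >> 3.  */
static uint64_t
count_simplified (uint64_t n)
{
  return n >> 3;
}
```

For n >= 8 the two agree (n = 16 gives 2 either way).  For n = 0 the
first yields 2^61 and the second yields 0, so the simplification is
only valid if we know that n - 8 does not wrap.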

We see this same problem on Power; there are quite a few PRs about it.

[ ... ]

> helps immediately, yet overflow/range information is not considered.

Yeah, and it has to be.

> Do
> we somehow guarantee that the niter-related we created until doloop do
> not overflow?  I did not note something when looking through the code.
> Granted, the simplification seems oddly specific and is probably not
> useful for a wide range of targets and situations.

You're at least the third target, and it's pretty annoying, and it tends
to cost more than two insns (because things can often be simplified
further after this).  It won't do super much for execution time: there
is a loop after this after all, and a handful of insns executed once
can't be all that expensive relatively.

> Another approach would be to store "niter+1" (== n) when niter (== n-1)
> is calculated and, when we need to do the increment, use the niter+1
> that we already have without needing to simplify (n - 8) >> 3 + 1.
> 
> Any comments on this?
> 
> The patch above bootstraps and test suite is without regressions on s390
> fwiw.

When something similar was tried before there were regressions for
rs6000.  I'll find the PR later.

I was hoping that now that ivopts learns about doloops, this can be
handled better as well.  Ideally the doloop pass can move closer to
expand, and do much less analysis and work, since all the heavy lifting
has been done already.

Segher