[PATCH 1/2] Seperate -funroll-loops for GIMPLE unroller and RTL unroller

Tue May 26 04:58:09 GMT 2020

David Edelsohn <dje.gcc@gmail.com> writes:

> On Mon, May 25, 2020 at 1:58 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
>>
>> On May 25, 2020 7:40:00 PM GMT+02:00, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>> >On Mon, May 25, 2020 at 02:14:02PM +0200, Richard Biener wrote:
>> >> On Mon, May 25, 2020 at 1:10 PM guojiufu <guojiufu@linux.ibm.com>
>> >wrote:
>> >> Since a new flag is not needed to fix the regression please avoid
>> >> adding -fcomplete-unroll-loops.
>> >>
>> >> For -frtl-unroll-loops you should be able to use
>> >
>> >Erm.  That *is* a new command-line option (the internal flags I do not
>> >care about so much: new implementation details are *good*).  And a new
>> >name that is a mistake in my opinion, for many reasons (users do not
>> >know and should not have to care about "rtl"; the name is not
>> >descriptive; it is useless churn, it is not the same name as we have
>> >had for decades now; it is adding a new option for a future where we
>> >will do most unrolling at gimple level, a future we do not know will
>> >ever exist, and we do not know what that will look like anyway; it is
>> >an extra level of indirection (in the name)).
>> >
>> >We should not have an -frtl-unroll-loops if we do not have a
>> >-ftree-unroll-loops (or whatever).
>> >
>> >Unrolling early is not a good idea in general (the problems with the
>> >very trivial complete unrolling case just underline that).  But we
>> >*should* know which code we expect to unroll later, for better costing.
>> >Adding names like "rtl-unroll-loops" only stands in the way of getting
>> >a better design here.
>>
>> You folks made ppc specific hacks instead of a better design. Those
>> now stand in the way as well. But sure, simply do not expose the
>> flag to the users, use
>> Var(flag_rtl_unroll_loops). My other points still stand.
>>
>> Feel free to ignore the regression part on the branch and come up
>> with a great design. But don't expect to backport that then.
>
> I completely agree.

Thanks a lot for all your comments, suggestions, and tips in the
discussion.  Thank Richar, Segher, David, Hanza, and all!

I may have an explanation about the intention of this work.

We know that loop unrolling is a complex and tickly thing.  It could
help some kinds of code in a great manner.  Sometimes there are side
effects.  For different types of loop and different platforms, it may
result in different effects.
It would makes sense to tune the loop unrolling accordingly.  And so, to
help and tune loop unrolling is what we want to do.

Currently, we have loop unroller at GIMPLE part (cunroll/cunrolli) and
RTL part.  There are some options (like -funroll-loops) and --params to
control unrollers.

Through target hook, it would be helpful for different platforms to tune
unroller: checking the type of loops, check optimization level.
Existing hooks may help with something, like turn --params.

Adding separate flags(or options) may be helpful to control different
behaviors independently.  This is one reason for the patch which
introduces internal undocumented options.

One previous patch, r10-4525, is tunning for ppc at -O2. Which
implements an existing hook for rs6000 to check simple loops for RTL
unroller. For cunroll, it just enables it even increasing size at -O2
directly, without check the type of the loops.  And then the
side/negative effects of cunroll are also visible at -O2 besides
positive effects.  In PR95018, the side effect is shown on complex loop
(early exit, and more peeling).
One idea is for cunroll to tune it to avoid side effects. And if the
heuristic is suitable, it would be helpful for other usage, like -O3 and
-funroll-loops.

Thanks for any comments!

Jiufu

>
> This path is digging a deeper and deeper hole.
>
> - David