[PATCH] Don't peel extra copy of loop in unroller for loops with exit at end

Andrew Pinski pinskia@gmail.com
Sat Oct 15 03:28:00 GMT 2016

On Thu, Sep 22, 2016 at 12:10 PM, Pat Haugen
<pthaugen@linux.vnet.ibm.com> wrote:
> I noticed the loop unroller peels an extra copy of the loop before it enters the switch block code to round the iteration count to a multiple of the unroll factor. This peeled copy is only needed for the case where the exit test is at the beginning of the loop since in that case it inserts the test for zero peel iterations before that peeled copy.
> This patch bumps the iteration count by 1 for loops with the exit at the end so that it represents the number of times the loop body is executed, which removes the need to always execute that first peeled copy. With this change, when the number of executions of the loop is an exact multiple of the unroll factor, the code jumps to the unrolled loop immediately instead of executing all the switch code and peeled copies of the loop and then falling into the unrolled loop. This change also reduces code size by removing a peeled copy of the loop.
> Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?
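
For readers following along, the scheme described above can be sketched in hand-written C. This is only an illustration of the idea, not the RTL that loop-unroll.c generates: the function name run_unrolled, the unroll factor of 4, and the counter that stands in for the loop body are all assumptions made for the example. The input niter is taken to be the number of times the back edge is taken in a loop whose exit test is at the end, so the body runs niter + 1 times.

```c
#include <assert.h>
#include <limits.h>

#define UNROLL 4  /* assumed unroll factor, chosen only for illustration */

/* Hypothetical model of a runtime-unrolled loop with its exit test at the
   end.  Returns how many times the "body" ran and stores the number of
   peeled copies that were executed in *peeled_out. */
static unsigned run_unrolled(unsigned niter, unsigned *peeled_out)
{
    assert(niter < UINT_MAX);            /* niter + 1 must not wrap */

    unsigned body_execs = niter + 1;     /* bump: count body executions */
    unsigned peeled = body_execs % UNROLL;
    unsigned executed = 0;

    /* Switch block: enter the chain of peeled copies at the right depth.
       When body_execs is an exact multiple of UNROLL, no copy runs. */
    switch (peeled) {
    case 3: executed++;                  /* fall through */
    case 2: executed++;                  /* fall through */
    case 1: executed++;                  /* fall through */
    case 0: break;
    }

    /* Unrolled loop: the remaining count is a multiple of UNROLL. */
    for (unsigned left = body_execs - peeled; left != 0; left -= UNROLL)
        executed += UNROLL;              /* UNROLL copies of the body */

    *peeled_out = peeled;
    return executed;
}
```

In this model, niter = 7 gives 8 body executions, so no peeled copy runs and control goes straight to the unrolled loop, while niter = 9 gives 10 executions, of which 2 come from peeled copies.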

Either this patch or the change for

PR rtl-optimization/68212
* cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
frequency when computing scale factor for peeled copies.
* loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
values for switch/peel blocks/edges.

caused a ~2.7-3.5% regression in CoreMark with -funroll-all-loops.


> 2016-09-22  Pat Haugen  <pthaugen@us.ibm.com>
>         * loop-unroll.c (unroll_loop_runtime_iterations): Condition initial
>         loop peel to loops with exit test at the beginning.
