[PATCH 0/4 GCC11] IVOPTs consider step cost for different forms when unrolling

Kewen.Lin linkw@linux.ibm.com
Thu Jan 16 09:41:00 GMT 2020


Hi,

As discussed in the thread
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
(original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html),
I'm working on teaching IVOPTs to consider D-form group access during
unrolling.  The difference between D-form and other forms during unrolling
is that D-form lets us put the stride into the displacement field and so
avoid additional step increments, e.g.:

With X-form (uf step increments):
  ...
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  ...

With D-form (one step increment for each base):
  ...
  LD A = baseA, OFF
  LD B = baseB, OFF
  ST C = baseC, OFF
  LD A = baseA, OFF+stride
  LD B = baseB, OFF+stride
  ST C = baseC, OFF+stride
  LD A = baseA, OFF+2*stride
  LD B = baseB, OFF+2*stride
  ST C = baseC, OFF+2*stride
  ...
  baseA += stride * uf
  baseB += stride * uf
  baseC += stride * uf

Suppose the loop is unrolled 8 times: we then need 3 step updates with
D-form vs. 8 step updates with X-form.  Here we only need to check that the
stride meets the D-form field requirement, since if OFF doesn't meet it, we
can construct baseA' as baseA + OFF.
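
To make the counting concrete, here is a minimal C sketch of the two shapes
for c[i] = a[i] + b[i].  It is only an illustration, not code from the patch
set: the function names, the unroll factor of 4 and the steps counter are
made up for the example, and the compiler is of course free to pick either
addressing form for this source.

```c
#include <assert.h>
#include <stddef.h>

enum { UF = 4 };  /* illustrative unroll factor, not taken from the patches */

/* X-form shape: every access is base + x, so the shared index x has to be
   stepped after each copy of the loop body.  Returns the number of step
   updates executed.  n must be a multiple of UF.  */
size_t
sum_xform (const long *a, const long *b, long *c, size_t n)
{
  size_t steps = 0;
  assert (n % UF == 0);
  for (size_t x = 0; x < n; )
    {
      c[x] = a[x] + b[x]; x++; steps++;
      c[x] = a[x] + b[x]; x++; steps++;
      c[x] = a[x] + b[x]; x++; steps++;
      c[x] = a[x] + b[x]; x++; steps++;
    }
  return steps;
}

/* D-form shape: the copies use constant displacements 0 .. UF-1 from each
   base, and each of the three bases is stepped once per unrolled
   iteration.  Returns the number of step updates executed.  */
size_t
sum_dform (const long *a, const long *b, long *c, size_t n)
{
  size_t steps = 0;
  const long *end = a + n;
  assert (n % UF == 0);
  while (a < end)
    {
      c[0] = a[0] + b[0];
      c[1] = a[1] + b[1];
      c[2] = a[2] + b[2];
      c[3] = a[3] + b[3];
      a += UF; b += UF; c += UF;
      steps += 3;
    }
  return steps;
}
```

Per unrolled iteration this gives UF updates for the X-form shape and 3 for
the D-form shape, so the gap widens as the unroll factor grows.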

This patch set consists of four parts:
     
  [PATCH 1/4 GCC11] Add middle-end unroll factor estimation

     Add unroll factor estimation in the middle-end.  It is mainly based on
     the current RTL unroll factor determination in function
     decide_unrolling and its callees.  As Richard B. suggested, we could
     probably force the unroll factor with this and avoid duplicate unroll
     factor calculation, but I think that needs more benchmarking work and
     should be handled separately.

  [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p 

     Add a target hook to determine whether the current memory access, with
     the given mode, stride and other flags, has D-form support available.
     
  [PATCH 3/4 GCC11] IVOPTs Consider cost_step on different forms during unrolling

     Teach IVOPTs to identify address-type iv groups that prefer D-form,
     and set the dform_p flag on the iv cands derived from them.  Taking the
     unroll factor into account, increase the iv cost by
     (uf - 1) * cost_step if the cand is not a dform iv cand.
     
  [PATCH 4/4 GCC11] rs6000: P9 D-form test cases

     Add some test cases, mainly copied from Kelvin's patch.
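
As a rough sketch of how parts 2 and 3 fit together, assuming a signed 16-bit
displacement field as in Power D-form instructions: the helper and struct
below are hypothetical stand-ins, not the hook signature or the IVOPTs data
structures from the patches, and real targets may add alignment rules on top
of the range check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the target hook: accept a stride when the
   largest displacement that unrolling adds, stride * (uf - 1), fits in a
   signed 16-bit field.  Overflow of the product is ignored in this
   sketch.  */
static bool
stride_fits_dform (int64_t stride, unsigned uf)
{
  assert (uf >= 1);
  int64_t max_disp = stride * (int64_t) (uf - 1);
  return max_disp >= INT16_MIN && max_disp <= INT16_MAX;
}

/* Hypothetical, simplified view of an iv candidate's cost.  */
struct cand_cost
{
  unsigned base_cost;  /* cost before considering unrolling */
  unsigned cost_step;  /* cost of one step increment */
  bool dform_p;        /* candidate can use D-form displacements */
};

/* A non-D-form candidate pays for the (uf - 1) extra step increments that
   unrolling introduces; a D-form candidate does not.  */
static unsigned
cand_cost_with_unroll (const struct cand_cost *c, unsigned uf)
{
  assert (uf >= 1);
  if (c->dform_p)
    return c->base_cost;
  return c->base_cost + (uf - 1) * c->cost_step;
}
```

With uf = 1 both kinds of candidate cost the same, so the adjustment only
matters for loops that are expected to be unrolled.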

Bootstrapped and regression tested on powerpc64le-linux-gnu.
I'll be on leave for two weeks soon, so please expect late responses.
Thanks a lot in advance!

BR,
Kewen

------------

 gcc/cfgloop.h                                       |   3 +
 gcc/config/rs6000/rs6000.c                          |  56 ++++++++++++++++-
 gcc/doc/tm.texi                                     |  14 +++++
 gcc/doc/tm.texi.in                                  |   4 ++
 gcc/target.def                                      |  21 ++++++-
 gcc/testsuite/gcc.target/powerpc/p9-dform-0.c       |  43 +++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-dform-1.c       |  55 +++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/p9-dform-2.c       |  12 ++++
 gcc/testsuite/gcc.target/powerpc/p9-dform-3.c       |  15 +++++
 gcc/testsuite/gcc.target/powerpc/p9-dform-4.c       |  12 ++++
 gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h |  34 +++++++++++
 gcc/tree-ssa-loop-ivopts.c                          |  84 +++++++++++++++++++++++++-
 gcc/tree-ssa-loop-manip.c                           | 254 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/tree-ssa-loop-manip.h                           |   3 +-
 gcc/tree-ssa-loop.c                                 |  33 ++++++++++
 gcc/tree-ssa-loop.h                                 |   2 +
 16 files changed, 640 insertions(+), 5 deletions(-)


