[Bug testsuite/92464] [10 regression] r278033 breaks gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c

linkw at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Nov 12 08:23:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92464

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2019-11-12
     Ever confirmed|0                           |1

--- Comment #1 from Kewen Lin <linkw at gcc dot gnu.org> ---
Before the regressed commit, the cost view looks like:
  0x13135eb0 ic[i_35] 2 times vector_stmt costs 2 in prologue
  0x13135eb0 ic[i_35] 1 times vector_stmt costs 1 in prologue
  0x13135eb0 ic[i_35] 1 times vector_load costs 1 in body
  0x13135eb0 ic[i_35] 1 times vec_perm costs 3 in body
  0x13135eb0 _5 1 times vector_store costs 1 in body
  .c:21:3: note:  not using a fully-masked loop.
  cost model: prologue peel iters set to vf/2.
  cost model: epilogue peel iters set to vf/2 because peeling for alignment is unknown.
  0x13135eb0 <unknown> 1 times cond_branch_taken costs 3 in prologue
  0x13135eb0 <unknown> 1 times cond_branch_not_taken costs 1 in prologue
  0x13135eb0 <unknown> 1 times cond_branch_taken costs 3 in epilogue
  0x13135eb0 <unknown> 1 times cond_branch_not_taken costs 1 in epilogue
  0x13135eb0 ic[i_35] 2 times scalar_load costs 2 in prologue
  0x13135eb0 ic[i_35] 2 times scalar_load costs 2 in epilogue
  0x13135eb0 _5 2 times scalar_store costs 2 in prologue
  0x13135eb0 _5 2 times scalar_store costs 2 in epilogue
  .c:21:3: note:  Cost model analysis:
    Vector inside of loop cost: 5
    Vector prologue cost: 11
    Vector epilogue cost: 8
    Scalar iteration cost: 2
    Scalar outside cost: 0
    Vector outside cost: 19
    prologue iterations: 2
    epilogue iterations: 2
    Calculated minimum iters for profitability: 19

With the commit, the cost view is changed to:
  0x13135eb0 ic[i_35] 2 times vector_stmt costs 2 in prologue
  0x13135eb0 ic[i_35] 1 times vector_stmt costs 1 in prologue
  0x13135eb0 ic[i_35] 1 times vector_load costs 2 in body
  0x13135eb0 ic[i_35] 1 times vec_perm costs 3 in body
  0x13135eb0 _5 1 times vector_store costs 1 in body
  .c:21:3: note:  not using a fully-masked loop.
  cost model: prologue peel iters set to vf/2.
  cost model: epilogue peel iters set to vf/2 because peeling for alignment is unknown.
  0x13135eb0 <unknown> 1 times cond_branch_taken costs 3 in prologue
  0x13135eb0 <unknown> 1 times cond_branch_not_taken costs 1 in prologue
  0x13135eb0 <unknown> 1 times cond_branch_taken costs 3 in epilogue
  0x13135eb0 <unknown> 1 times cond_branch_not_taken costs 1 in epilogue
  0x13135eb0 ic[i_35] 2 times scalar_load costs 4 in prologue
  0x13135eb0 ic[i_35] 2 times scalar_load costs 4 in epilogue
  0x13135eb0 _5 2 times scalar_store costs 2 in prologue
  0x13135eb0 _5 2 times scalar_store costs 2 in epilogue
  .c:21:3: note:  Cost model analysis:
    Vector inside of loop cost: 6
    Vector prologue cost: 13
    Vector epilogue cost: 10
    Scalar iteration cost: 3
    Scalar outside cost: 0
    Vector outside cost: 23
    prologue iterations: 2
    epilogue iterations: 2
    Calculated minimum iters for profitability: 12

The cost changes are expected: both scalar and vector loads now cost more. As a
result, the minimum profitable iteration count becomes smaller (12 instead of
19).
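
As a rough cross-check of the two "Calculated minimum iters for profitability"
values, the sketch below recomputes them from the summary numbers in the dumps.
It rests on two assumptions that are not stated in this comment: the formula is
a paraphrase of the integer arithmetic in the vectorizer's
vect_estimate_min_profitable_iters as recalled for this GCC version, and the
vectorization factor is taken to be 4, inferred from "peel iters set to vf/2"
together with "prologue iterations: 2". The function and parameter names are
illustrative only.

```python
def min_profitable_iters(vec_inside, vec_outside, scalar_iter,
                         scalar_outside, peel_prologue, peel_epilogue, vf):
    """Approximate the vectorizer's break-even iteration count.

    Assumption: this mirrors the integer arithmetic GCC is believed to use;
    it is a sketch, not a copy of the GCC source.
    """
    # Extra cost outside the loop, scaled by VF, minus the vector body cost
    # attributed to the peeled prologue/epilogue iterations.
    numerator = ((vec_outside - scalar_outside) * vf
                 - vec_inside * peel_prologue
                 - vec_inside * peel_epilogue)
    if numerator <= 0:
        return 0

    # Per-vector-iteration saving: VF scalar iterations versus one vector one.
    saving = scalar_iter * vf - vec_inside
    if saving <= 0:
        # Vectorization never pays off here; out of scope for this sketch.
        raise ValueError("vector body is not cheaper than VF scalar iterations")

    iters = numerator // saving  # truncating integer division

    # Bump by one if the scalar loop is still no more expensive at this count.
    if scalar_iter * vf * iters <= vec_inside * iters + (vec_outside - scalar_outside) * vf:
        iters += 1
    return iters


VF = 4  # assumed, see the note above

# Summary values taken from the two dumps in this comment.
before = min_profitable_iters(vec_inside=5, vec_outside=19, scalar_iter=2,
                              scalar_outside=0, peel_prologue=2,
                              peel_epilogue=2, vf=VF)
after = min_profitable_iters(vec_inside=6, vec_outside=23, scalar_iter=3,
                             scalar_outside=0, peel_prologue=2,
                             peel_epilogue=2, vf=VF)
print(before, after)  # prints: 19 12
```

Under these assumptions, the higher scalar iteration cost (3 instead of 2)
doubles the per-vector-iteration saving from 3 to 6, which outweighs the larger
vector outside cost and lowers the threshold.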

I ran both the before and after executables with 100000 invocations, 10 times
each. The measured times are very close, and the average time is 65.23s for
both. This means the cost adjustment doesn't make this case worse.

One possible fix is to adjust the test case's iteration count to 11, which is
lower than the current minimum profitable iteration count (12).

