[Bug target/99807] [11 Regression] ICE in vect_slp_analyze_node_operations_1, at tree-vect-slp.c:3727

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Mar 29 08:35:13 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99807

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> Just to say we currently require a ("random") SLP_TREE_REPRESENTATIVE even on
> VEC_PERM_EXPR SLP nodes.  With the testcase the chosen one is no longer
> explicitly referenced and thus it does not get marked by SLP.  Note this will
> also cause it to turn hybrid - so it's probably one of the cases where the
> SLP marking as patterns "helped".

We can fix the ICE by delaying the assert but the other issue still shows
in costing (which also only walks participating stmts):

t.f90:5:30: note: Cost model analysis:
0x35e56c0 _25 1 times scalar_store costs 1 in body
0x35e56c0 _26 1 times scalar_store costs 1 in body
0x35e56c0 _11 + _23 1 times scalar_stmt costs 1 in body
0x35e56c0 _12 + _24 1 times scalar_stmt costs 1 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[0]> 1 times scalar_load costs 1 in body
0x35e56c0 IMAGPART_EXPR <(*z_8(D))[0]> 1 times scalar_load costs 1 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[1]> 1 times scalar_load costs 1 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[0]> 1 times unaligned_load (misalign -1) costs 1 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[0]> 1 times vec_perm costs 2 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[1]> 1 times unaligned_load (misalign -1) costs 1 in body
0x35e56c0 <unknown> 1 times vec_perm costs 2 in body
0x35e56c0 <unknown> 1 times vec_construct costs 2 in prologue
0x35e56c0 .COMPLEX_FMA (_25, _25, _25) 1 times vector_stmt costs 1 in body
0x35e56c0 <unknown> 1 times vec_construct costs 2 in prologue
0x35e56c0 _25 1 times unaligned_store (misalign -1) costs 1 in body
0x35e56c0 REALPART_EXPR <(*z_8(D))[1]> 1 times vec_to_scalar costs 2 in epilogue
0x35e56c0 REALPART_EXPR <(*z_8(D))[1]> 1 times vec_to_scalar costs 2 in epilogue
t.f90:5:30: note: Cost model analysis for part in loop 0:
  Vector cost: 16
  Scalar cost: 7
t.f90:5:30: missed: not vectorized: vectorization is not profitable.

So we're not costing all the scalar stmts covered by the .COMPLEX_FMA
expression, because scalar costing only looks at SLP_TREE_SCALAR_STMTS
(it would not cost multiple stmts covering a scalar pattern either).
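
To make the numbers in the dump easier to check, here is a small Python
sketch that just re-adds the entries listed above.  It is not vectorizer
code: the `profitable` rule (vectorize only when the vector total is
strictly below the scalar total) and the helper `uncosted_stmts_to_flip`
are assumptions made for illustration, and the unit cost per missing
scalar stmt is hypothetical.

```python
# Entries transcribed from the cost dump above: (kind, cost, part).
SCALAR = [
    ("scalar_store", 1, "body"),
    ("scalar_store", 1, "body"),
    ("scalar_stmt", 1, "body"),
    ("scalar_stmt", 1, "body"),
    ("scalar_load", 1, "body"),
    ("scalar_load", 1, "body"),
    ("scalar_load", 1, "body"),
]
VECTOR = [
    ("unaligned_load", 1, "body"),
    ("vec_perm", 2, "body"),
    ("unaligned_load", 1, "body"),
    ("vec_perm", 2, "body"),
    ("vec_construct", 2, "prologue"),
    ("vector_stmt", 1, "body"),  # the .COMPLEX_FMA entry
    ("vec_construct", 2, "prologue"),
    ("unaligned_store", 1, "body"),
    ("vec_to_scalar", 2, "epilogue"),
    ("vec_to_scalar", 2, "epilogue"),
]


def total(entries):
    """Sum the cost column of a list of dump entries."""
    return sum(cost for _kind, cost, _part in entries)


def by_part(entries):
    """Group costs by prologue/body/epilogue."""
    parts = {}
    for _kind, cost, part in entries:
        parts[part] = parts.get(part, 0) + cost
    return parts


def profitable(vector_cost, scalar_cost):
    # Assumption: vectorization is kept only when strictly cheaper.
    return vector_cost < scalar_cost


def uncosted_stmts_to_flip(vector_cost, scalar_cost, stmt_cost=1):
    # Hypothetical helper: smallest number of additional scalar stmts,
    # each costing stmt_cost, that would make the comparison succeed.
    gap = vector_cost - scalar_cost
    return max(0, gap // stmt_cost + 1)


if __name__ == "__main__":
    vec, sca = total(VECTOR), total(SCALAR)
    print("Vector cost:", vec)
    print("Scalar cost:", sca)
    print("Vector cost by part:", by_part(VECTOR))
    print("profitable:", profitable(vec, sca))
    print("unit-cost scalar stmts needed to flip:",
          uncosted_stmts_to_flip(vec, sca))
```

The totals match the "Vector cost: 16" and "Scalar cost: 7" lines in the
dump.  The last figure only shows how far the scalar side is
under-counted under the stated assumptions; it says nothing about how
many stmts the .COMPLEX_FMA pattern really covers.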

That said, I'm going to fix this simple-to-deal-with ICE now; all the rest
needs more thought (and I'd like to defer any solution to GCC 12 for the
moment).

