[Bug tree-optimization/68707] [6 Regression] testcase gcc.dg/vect/O3-pr36098.c vectorized using VEC_PERM_EXPR rather than VEC_LOAD_LANES

alalaw01 at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Thu Dec 17 15:18:00 GMT 2015


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707

--- Comment #20 from alalaw01 at gcc dot gnu.org ---
> Would be nice to have a reduced testcase for this one.

Working on it. Sadly it's Fortran :(

The SLP tree that gets cancelled is quite big (and quite untreelike, if we
could see that - a large portion, 7 nodes, is repeated but with the 2 stmts in
each SLP node reversed). "Decided to SLP 2 instances" indeed becomes "Decided
to SLP 1 instances", with Unrolling factor 2 both times. In the case where the
SLP gets cancelled, several more stmts that would have featured in that tree
are marked hybrid. The 'vector inside of loop cost' increases from 180 (with
SLP) to 308 (if cancelled), but the minimum iters for profitability stays at 3.
However, the SLP-cancelled case outputs a whole extra section:

note: === scheduling SLP instances ===
...
note: ------>vectorizing SLP node starting from: (one of the loads in the
cancelled tree) * 4
...
note: vectorizing stmts using SLP.

(Though I suspect that's a red herring.)

Later on, by contrast, the non-cancelled case clearly has an extra 'note: add
new stmt: MEM[...] = STORE_LANES', which sounds as if perhaps the SLP finds it
can use ST2 opportunistically (??).
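
For anyone unfamiliar with the lane-store form mentioned above, here is a
minimal C sketch. It is not taken from the testcase or from the dumps, and the
names store_lanes2 and pair_store are hypothetical. The first function is a
scalar model of what a two-lane interleaving store (such as AArch64 ST2) writes
to memory. The second is the kind of loop, with two adjacent stores per
iteration, that I would expect the vectorizer to consider for either SLP or a
STORE_LANES-style interleaved store; whether it actually picks one or the other
depends on the target and options, and is an assumption here.

```c
#include <stddef.h>

/* Scalar model of a two-lane interleaving store: lane i of `even` and
   lane i of `odd` end up in adjacent memory slots.  Hypothetical helper,
   only meant to show the memory layout that such a store produces. */
void store_lanes2(int *dst, const int *even, const int *odd, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        dst[2 * i]     = even[i];
        dst[2 * i + 1] = odd[i];
    }
}

/* Illustrative loop with a group of two adjacent stores per iteration.
   Assumption: a loop of this shape is a candidate for SLP or for an
   interleaved (store-lanes) scheme; this is not the PR's testcase. */
void pair_store(int *restrict out, const int *restrict a,
                const int *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = a[i] + 1;
        out[2 * i + 1] = b[i] - 1;
    }
}
```

Calling store_lanes2 with even = {0, 2, 4, 6} and odd = {1, 3, 5, 7} fills dst
with 0..7 in order, which is the interleaving that a single lane store does in
one instruction.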
