Bug 97064 - BB vectorization behaves sub-optimal
Summary: BB vectorization behaves sub-optimal
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 11.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2020-09-16 06:29 UTC by Richard Biener
Modified: 2022-01-11 12:15 UTC

See Also:
Host:
Target: x86_64-*-* i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-09-16 00:00:00


Description Richard Biener 2020-09-16 06:29:39 UTC
The testcase g++.dg/vect/slp-pr87105.cc ends in

  _64 = MIN_EXPR <_32, _87>;
  bBox_6(D)->x0 = _64;
  _67 = MIN_EXPR <_33, _86>;
  bBox_6(D)->y0 = _67;
  _70 = MAX_EXPR <_36, _87>;
  bBox_6(D)->x1 = _70;
  _73 = MAX_EXPR <_39, _86>;
  bBox_6(D)->y1 = _73;

thus feeding a 4-element store with a non-uniform SLP opportunity
starting with { MIN, MIN, MAX, MAX }.  With 2-element vector types
this eventually gets vectorized by splitting the group, which is
prioritized over just building the { MIN..., MAX } vector from
scalars.  With 4-element vector types, however, no splitting is
considered, and we end up successfully vectorizing just the store
without ever considering the smaller vector size.

So at the moment the testcase PASSes with SSE but fails with AVX.
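
For illustration, source of roughly the following shape leads to a store
group like the one above.  This is a hypothetical sketch, not the actual
testcase: the struct name BBox, the function name extend and the element
type double are assumptions (double is inferred only from the 2-element
SSE vs. 4-element AVX vector types); only the field names x0, y0, x1, y1
come from the dump.

```cpp
#include <algorithm>

// Hypothetical stand-in for the box type; only the field names
// x0, y0, x1, y1 are taken from the GIMPLE dump.
struct BBox {
    double x0, y0, x1, y1;
};

// Grow the box so that it contains the point (x, y).  The four stores
// are adjacent in memory, but the operations feeding them are
// { MIN, MIN, MAX, MAX }, so the group is non-uniform: one MIN on the
// low pair and one MAX on the high pair only works after a split.
void extend(BBox *bBox, double x, double y) {
    bBox->x0 = std::min(bBox->x0, x);
    bBox->y0 = std::min(bBox->y0, y);
    bBox->x1 = std::max(bBox->x1, x);
    bBox->y1 = std::max(bBox->y1, y);
}
```

Split into two 2-element groups, { x0, y0 } would use a single vector MIN
and { x1, y1 } a single vector MAX, both against the same { x, y } operand.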
Comment 1 Richard Biener 2022-01-11 12:15:10 UTC
This is also partly because we are not evaluating the costs of multiple vector sizes on x86.