Bug 91435 - Better induction variable for vectorization
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 10.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2019-08-13 09:39 UTC by Marc Glisse
Modified: 2019-08-13 12:59 UTC (History)

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-08-13 00:00:00


Description Marc Glisse 2019-08-13 09:39:58 UTC
(from https://stackoverflow.com/q/57465290/1918193)
long RegularTest(int n) {
  long sum = 0;
  for (int i = 0; i < n; ++i)
    if (i % 2 != 0)
      sum += i + 1;
  return sum;
}

Compiling with -O3 -march=skylake, this gets vectorized, but the vectorized loop contains:

  # vect_vec_iv_.14_60 = PHI <{ 0, 1, 2, 3, 4, 5, 6, 7 }(5), vect_vec_iv_.14_61(6)>
  vect_vec_iv_.14_61 = vect_vec_iv_.14_60 + { 8, 8, 8, 8, 8, 8, 8, 8 };
  vect__3.17_66 = vect_vec_iv_.14_60 + { 2, 2, 2, 2, 2, 2, 2, 2 };

(Those are the only uses of vect_vec_iv_.14_60 and vect_vec_iv_.14_61.)

If we are only ever going to use x+2, why not use that instead, initialize with {2,3,4,...}, and skip the +2 at every iteration?

(There are other things to discuss about optimizing this testcase; for instance, clang is clever enough to unroll by a factor of 2 and remove the condition. But let's stick to the induction variable for this PR.)
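
To make the suggestion concrete, here is a minimal sketch written with the GCC/Clang vector extension. The function names and the lane-summing "use" of the vector are made up for illustration (the real loop does something else with vect__3.17_66); the only point is that folding the +2 into the initial value gives the same lanes with one fewer vector add per iteration.

```c
/* GCC/Clang vector extension: eight 32-bit lanes, as in the dump above. */
typedef int v8si __attribute__((vector_size(32)));

/* Shape of the current code: IV starts at {0..7}, +2 is added every iteration.
   (Hypothetical helper; summing the lanes is just an observable use.) */
long sum_lanes_current(int iterations) {
  v8si iv = {0, 1, 2, 3, 4, 5, 6, 7};
  const v8si step = {8, 8, 8, 8, 8, 8, 8, 8};
  const v8si two = {2, 2, 2, 2, 2, 2, 2, 2};
  long total = 0;
  for (int k = 0; k < iterations; ++k) {
    v8si used = iv + two; /* extra vector add inside the loop */
    for (int lane = 0; lane < 8; ++lane)
      total += used[lane];
    iv += step;
  }
  return total;
}

/* Suggested shape: start the IV at {2..9} and use it directly. */
long sum_lanes_folded(int iterations) {
  v8si iv = {2, 3, 4, 5, 6, 7, 8, 9};
  const v8si step = {8, 8, 8, 8, 8, 8, 8, 8};
  long total = 0;
  for (int k = 0; k < iterations; ++k) {
    for (int lane = 0; lane < 8; ++lane)
      total += iv[lane];
    iv += step; /* the only vector add left in the loop */
  }
  return total;
}
```

Both functions should return the same value for any small iteration count, which is all the rewrite relies on.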
Comment 1 Marc Glisse 2019-08-13 09:44:56 UTC
(In reply to Marc Glisse from comment #0)
> If we are only ever going to use x+2, why not use that instead, initialize
> with {2,3,4,...}, and skip the +2 at every iteration?

Or, since we already have another induction variable

  # vect_vec_iv_.13_57 = PHI <{ 1, 2, 3, 4, 5, 6, 7, 8 }(5), vect_vec_iv_.13_58(6)>
  vect_vec_iv_.13_58 = vect_vec_iv_.13_57 + { 8, 8, 8, 8, 8, 8, 8, 8 };

use that one, plus 1, and maintain one fewer variable.
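
As a rough check of that alternative, the sketch below (again using the GCC/Clang vector extension, with a made-up function name) keeps both induction variables only to compare them: the value currently computed as vect_vec_iv_.14 + 2 is lane-for-lane equal to vect_vec_iv_.13 + 1, so in principle the second variable would not need to be maintained.

```c
/* GCC/Clang vector extension: eight 32-bit lanes. */
typedef int v8si __attribute__((vector_size(32)));

/* Hypothetical checker: returns 1 if (iv13 + 1) equals (iv14 + 2) in every
   lane on every iteration, 0 otherwise. */
int single_iv_matches(int iterations) {
  v8si iv13 = {1, 2, 3, 4, 5, 6, 7, 8}; /* models vect_vec_iv_.13 */
  v8si iv14 = {0, 1, 2, 3, 4, 5, 6, 7}; /* models vect_vec_iv_.14 */
  const v8si step = {8, 8, 8, 8, 8, 8, 8, 8};
  const v8si one = {1, 1, 1, 1, 1, 1, 1, 1};
  const v8si two = {2, 2, 2, 2, 2, 2, 2, 2};
  for (int k = 0; k < iterations; ++k) {
    v8si from13 = iv13 + one; /* derived from the IV we keep anyway */
    v8si from14 = iv14 + two; /* what the dump computes today */
    for (int lane = 0; lane < 8; ++lane)
      if (from13[lane] != from14[lane])
        return 0;
    iv13 += step;
    iv14 += step;
  }
  return 1;
}
```

This still leaves one add per iteration (the +1), so it trades a register for an instruction rather than removing both.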
Comment 2 Richard Biener 2019-08-13 12:59:08 UTC
Confirmed.  Note that IVOPTs does not even look at vector IVs and unfortunately the vectorizer itself doesn't try to be clever in any way here.

Indeed, the vector code we generate for this testcase leaves a lot to be
desired...

Even when not vectorized, unrolling the loop to eliminate the conditional
would be profitable, I guess (there is no convenient pass to do this, though).
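
For reference, a hand-written sketch of what unrolling by 2 and dropping the conditional could look like at the source level. RegularTestUnrolled is a made-up name, and this is not the output of any existing pass; the original function is repeated so the two can be compared.

```c
/* Original testcase from comment #0. */
long RegularTest(int n) {
  long sum = 0;
  for (int i = 0; i < n; ++i)
    if (i % 2 != 0)
      sum += i + 1;
  return sum;
}

/* Hand-unrolled variant (hypothetical): only odd i contribute, so start
   at 1 and step by 2, which removes the i % 2 test from the loop body. */
long RegularTestUnrolled(int n) {
  long sum = 0;
  for (int i = 1; i < n; i += 2)
    sum += i + 1;
  return sum;
}
```

For example, with n = 10 the odd values of i are 1, 3, 5, 7, 9, so both versions add 2 + 4 + 6 + 8 + 10.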