Bug 115363 - Missing loop vectorization due to loop bound load not being pulled out
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 15.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2024-06-05 17:55 UTC by Andrew Pinski
Modified: 2025-01-23 18:13 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-06-06 00:00:00


Description Andrew Pinski 2024-06-05 17:55:04 UTC
Take:
```
struct out {
  unsigned *array;
};

struct m{
  void f(out *output);
  void f1(out *output);
  int size;
};

void  m::f(out *output)
{
      for (int k = 0; k < size; k++) {
        output->array[k] += 1;
      }
}

void  m::f1(out *output)
{
  int tmp = size;
  for (int k = 0; k < tmp; k++) {
    output->array[k] += 1;
  }
}
```

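We should be able to vectorize `m::f`, but GCC currently does not, because the store to array[k] might alias this->size and so the load of the loop bound cannot be hoisted.
But we could version the loop on that alias check, pull the load of this->size out of the loop in the non-aliasing version, and vectorize that version.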
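A hand-written sketch of what such versioning could look like at the source level. This only illustrates the idea and is not what GCC would emit; the names `f_versioned`, `n` and `overlap` are hypothetical, and the overlap test assumes a flat address space where comparing addresses as integers is meaningful.

```cpp
#include <cassert>
#include <cstdint>

struct out {
  unsigned *array;
};

struct m {
  void f_versioned(out *output);  // hypothetical name, not in the report
  int size;
};

void m::f_versioned(out *output)
{
  assert(output != nullptr);
  const int n = size;  // bound read once, before any store
  if (n <= 0)
    return;

  // Runtime check: does this->size overlap array[0 .. n)?
  const auto lo = reinterpret_cast<std::uintptr_t>(output->array);
  const auto hi = lo + static_cast<std::uintptr_t>(n) * sizeof(unsigned);
  const auto sz = reinterpret_cast<std::uintptr_t>(&size);
  const bool overlap = sz < hi && sz + sizeof(int) > lo;

  if (!overlap) {
    // No store in this loop can change size, so the hoisted bound is valid.
    for (int k = 0; k < n; k++)
      output->array[k] += 1;
  } else {
    // Fallback: the original loop, reloading size on every iteration.
    for (int k = 0; k < size; k++)
      output->array[k] += 1;
  }
}
```

The first loop has an invariant trip count, which is the form the vectorizer needs; the second keeps the original semantics for the aliasing case.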
Comment 1 Richard Biener 2024-06-06 06:48:23 UTC
Invariant motion doesn't do versioning for aliasing.  But in fact once the
loop iterates array[k] can no longer alias this->size but this is difficult
to exploit (peeling the loop once would help).

I'm not sure we should start to version all those loops where the exit
condition depends on a not hoistable but invariant expression?

But maybe we can diagnose this so people can rewrite their code.
Comment 2 Andrew Pinski 2024-06-07 14:19:34 UTC
(In reply to Richard Biener from comment #1)
> Invariant motion doesn't do versioning for aliasing.  But in fact once the
> loop iterates array[k] can no longer alias this->size but this is difficult
> to exploit (peeling the loop once would help).

I should mention that this was noticed in some internal code which is optimized by LLVM's polly (graphite) infrastructure, and yes, it is an important loop in that code.
Comment 3 Andrew Pinski 2025-01-23 18:13:55 UTC
So this can easily show up in C++ code in lambdas, where it is not even obvious that a reference is involved:
```
void foo(int *A, int *B, int *C, int *D, int len) {
  auto func = [&] () {
    for (int i =0; i < len; i++)
      A[i] = B[i] * C[i] + D[i];
  };
  func();
}
```
Compile with `-O3 -fno-inline`. Note that `-fno-inline` is important here.
This shows up in Geekbench 6.1, in src/geekbench/ml/backend/cpu/depthwise_convolution_2d.cpp.
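A possible source-level workaround, sketched by analogy with `m::f1` above: read the bound into a local once inside the lambda body. The names `foo_local_bound` and `n` are hypothetical. The rewrite assumes that no store through `A` modifies `len`; if one did, the behaviour would differ from the original loop. Whether the loop is then vectorized still depends on the compiler version and flags.

```cpp
#include <cassert>

// Hypothetical variant of foo() with the loop bound copied to a local.
void foo_local_bound(int *A, int *B, int *C, int *D, int len) {
  // Sketch-only precondition: non-null pointers whenever the loop runs.
  assert(len <= 0 || (A && B && C && D));
  auto func = [&] () {
    const int n = len;  // single load through the captured reference
    for (int i = 0; i < n; i++)
      A[i] = B[i] * C[i] + D[i];
  };
  func();
}
```

Capturing `len` by value alone may not be enough: the copy would then be a member of the closure object, which is reached through a pointer, much like `this->size` in `m::f`.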