Bug 100846 - Different vector handling for strided IVs and modulo conditions
Summary: Different vector handling for strided IVs and modulo conditions
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 12.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Reported: 2021-06-01 07:22 UTC by Richard Sandiford
Modified: 2021-07-03 05:16 UTC

Description Richard Sandiford 2021-06-01 07:22:02 UTC
The following loops are equivalent, but we only vectorise the second:

void f1 (int *x)
{
  for (int i = 0; i < 100; ++i)
    if ((i & 1) == 0)
      x[i] += 1;
}

void f2 (int *x)
{
  for (int i = 0; i < 100; i += 2)
    x[i] += 1;
}

I guess this isn't vectorisation-specific.  Perhaps this is something that
ivcanon could handle?
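
As a quick sanity check of the equivalence claim, the following minimal sketch runs both loops on identical inputs and compares the results. The helper check_equivalent and the macro N are hypothetical names added for illustration; they are not part of the report.

```c
#include <assert.h>

#define N 100  /* trip count used in the report */

/* f1 and f2 exactly as given in the description.  */
void f1 (int *x)
{
  for (int i = 0; i < N; ++i)
    if ((i & 1) == 0)
      x[i] += 1;
}

void f2 (int *x)
{
  for (int i = 0; i < N; i += 2)
    x[i] += 1;
}

/* Hypothetical helper: asserts that f1 and f2 leave identical arrays
   behind and returns how many elements were modified.  */
int check_equivalent (void)
{
  int a[N], b[N], changed = 0;

  for (int i = 0; i < N; ++i)
    a[i] = b[i] = 3 * i;

  f1 (a);
  f2 (b);

  for (int i = 0; i < N; ++i)
    {
      assert (a[i] == b[i]);
      changed += (a[i] != 3 * i);
    }
  return changed;
}
```

Both functions touch only the even indices 0, 2, ..., 98, so the helper should report 50 modified elements.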
Comment 1 Richard Biener 2021-06-01 08:12:55 UTC
I think that this is iteration space splitting, turning the loop into

  for (int i = 0; i < 100; i += 2)
    if (1)
      x[i] += 1;
  for (int i = 1; i < 100; i += 2)
    if (0)
      x[i] += 1;

Alternatively, it is unrolling driven by jump threading.  That said,
we'd likely want to handle

  for (int i = 0; i < 100; ++i)
    if ((i & 1) == 0)
      x[i] += 1;
    else
      y[i] += 1;

as well.  Loop distribution would eventually create two of the original
loops out of that.
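
To make the two-armed case concrete, here is a hand-written sketch of what splitting the iteration space by parity would be expected to produce once the constant conditions are folded away. The names g_orig, g_split and check_split are hypothetical and only for illustration; this is not output from GCC. The reordering is harmless here because every statement is an independent increment; in general the compiler would have to prove that no dependence is violated.

```c
#include <assert.h>

#define N 100  /* trip count used in the report */

/* The loop with both an "if" and an "else" arm.  */
void g_orig (int *x, int *y)
{
  for (int i = 0; i < N; ++i)
    if ((i & 1) == 0)
      x[i] += 1;
    else
      y[i] += 1;
}

/* Hand-split version: one strided loop per parity class, with the
   (now constant) conditions already folded away.  */
void g_split (int *x, int *y)
{
  for (int i = 0; i < N; i += 2)
    x[i] += 1;
  for (int i = 1; i < N; i += 2)
    y[i] += 1;
}

/* Hypothetical helper: asserts that both versions agree and returns
   the total number of modified elements across x and y.  */
int check_split (void)
{
  int x1[N], y1[N], x2[N], y2[N], changed = 0;

  for (int i = 0; i < N; ++i)
    {
      x1[i] = x2[i] = i;
      y1[i] = y2[i] = -i;
    }

  g_orig (x1, y1);
  g_split (x2, y2);

  for (int i = 0; i < N; ++i)
    {
      assert (x1[i] == x2[i] && y1[i] == y2[i]);
      changed += (x2[i] != i) + (y2[i] != -i);
    }
  return changed;
}
```

Each of the two resulting loops has the same strided shape as f2 in the description, which is the form reported to vectorise.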

I think the most promising way is to do unrolling and have that keyed on
the ability to optimize away the branches.  So yes, it's ivcanon: not the
pass itself, but its implementation file.
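
The unrolling idea can be sketched by hand as follows: unroll the first loop by two, observe that the condition in each copy has a known value, and fold it. The names f1_unroll2, f1_folded and check_unroll are hypothetical, and the sketch assumes an even trip count so that no epilogue iteration is needed; it is an illustration, not a description of what any GCC pass currently does.

```c
#include <assert.h>

#define N 100  /* trip count used in the report; assumed even here */

/* Original loop from the description.  */
void f1 (int *x)
{
  for (int i = 0; i < N; ++i)
    if ((i & 1) == 0)
      x[i] += 1;
}

/* Unrolled by two, conditions still present.  i stays even, so the
   first test is always true and the second is always false.  */
void f1_unroll2 (int *x)
{
  for (int i = 0; i < N; i += 2)
    {
      if ((i & 1) == 0)
        x[i] += 1;
      if (((i + 1) & 1) == 0)
        x[i + 1] += 1;
    }
}

/* After folding both conditions: the second copy disappears and the
   loop has the same shape as f2 in the description.  */
void f1_folded (int *x)
{
  for (int i = 0; i < N; i += 2)
    x[i] += 1;
}

/* Hypothetical helper: asserts that all three versions agree and
   returns the number of modified elements.  */
int check_unroll (void)
{
  int a[N], b[N], c[N], changed = 0;

  for (int i = 0; i < N; ++i)
    a[i] = b[i] = c[i] = 7 * i;

  f1 (a);
  f1_unroll2 (b);
  f1_folded (c);

  for (int i = 0; i < N; ++i)
    {
      assert (a[i] == b[i] && b[i] == c[i]);
      changed += (c[i] != 7 * i);
    }
  return changed;
}
```

The unroll factor of two is chosen because the condition has period two in i; keying the unroll decision on whether the branches fold away is the heuristic suggested in the comment above.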