[Bug tree-optimization/56624] Vectorizer gives up on a group-access if it contains stores to the same location
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Sep 14 12:44:46 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56624
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Michael Zolotukhin from comment #4)
> Sorry, it looks like a reproducer with an if can be made, and here it is:
> void foo (long *a)
> {
>   int i;
>   for (i = 0; i < 100; i+=2)
>     {
>       if (a[i] == 0)
>         {
>           a[i+1] = 2;
>           a[i] = 3;
>         }
>       else
>         {
>           a[i+1] = 3;
>           a[i] = 4;
>         }
>     }
> }
> In this example we have:
> group_access2.c:4: note: === vect_analyze_data_ref_accesses ===
> group_access2.c:4: note: READ_WRITE dependence in interleaving.
> group_access2.c:4: note: not vectorized: complicated access pattern.
> group_access2.c:4: note: bad data access.
> group_access2.c:1: note: vectorized 0 loops in function.
>
> The diagnostic is a bit different, but the root cause is the same, I guess.
>
> The test is attached (reproducer 2).

We now vectorize this loop (not with plain SSE2, but with SSE4.2, for example):
.L2:
        movq    (%rdi), %xmm0
        movdqa  %xmm2, %xmm4
        addq    $16, %rdi
        punpcklqdq      %xmm0, %xmm0
        pcmpeqq %xmm1, %xmm0
        pblendvb        %xmm0, %xmm3, %xmm4
        movups  %xmm4, -16(%rdi)
        cmpq    %rdi, %rax
        jne     .L2
probably because we now sink the common stores out of the if arms.  Modifying
the testcase as follows (the two stores in the else arm are swapped)
reproduces the original issue again:
void foo (long *a)
{
  int i;
  for (i = 0; i < 100; i+=2)
    {
      if (a[i] == 0)
        {
          a[i+1] = 2;
          a[i] = 3;
        }
      else
        {
          a[i] = 4;
          a[i+1] = 3;
        }
    }
}
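
For illustration only, here is a hand-written, source-level sketch of what
sinking the common stores out of the if arms might amount to, together with a
check that the modified testcase still computes the same result as reproducer
2.  The names foo_same_order, foo_swapped_order, foo_sunk, variants_agree and
check_variants are made up for this sketch, and foo_sunk is only an assumption
about the shape of the transformed loop, not what GCC actually emits.

```c
#include <assert.h>
#include <string.h>

#define N 100

/* Reproducer 2 from comment #4: both arms store a[i+1] first, then a[i].  */
static void foo_same_order (long *a)
{
  for (int i = 0; i < N; i += 2)
    {
      if (a[i] == 0)
        {
          a[i+1] = 2;
          a[i] = 3;
        }
      else
        {
          a[i+1] = 3;
          a[i] = 4;
        }
    }
}

/* Modified testcase: the else arm stores a[i] first, then a[i+1].  */
static void foo_swapped_order (long *a)
{
  for (int i = 0; i < N; i += 2)
    {
      if (a[i] == 0)
        {
          a[i+1] = 2;
          a[i] = 3;
        }
      else
        {
          a[i] = 4;
          a[i+1] = 3;
        }
    }
}

/* Assumed shape after sinking: select the values, then store unconditionally.  */
static void foo_sunk (long *a)
{
  for (int i = 0; i < N; i += 2)
    {
      int is_zero = (a[i] == 0);   /* read the condition before any store */
      long v1 = is_zero ? 2 : 3;   /* value for a[i+1] */
      long v0 = is_zero ? 3 : 4;   /* value for a[i] */
      a[i+1] = v1;
      a[i] = v0;
    }
}

/* Returns 1 if all three variants leave identical array contents.  */
int variants_agree (void)
{
  long x[N], y[N], z[N];
  /* Zero at every fourth index so that both arms are exercised.  */
  for (int i = 0; i < N; i++)
    x[i] = y[i] = z[i] = (i % 4 == 0) ? 0 : i;
  foo_same_order (x);
  foo_swapped_order (y);
  foo_sunk (z);
  return memcmp (x, y, sizeof x) == 0 && memcmp (x, z, sizeof x) == 0;
}

/* Convenience wrapper: aborts if the variants disagree.  */
void check_variants (void)
{
  assert (variants_agree ());
}
```

Since the two stores in each arm go to different locations, their order does
not affect the result, so the swapped variant differs from reproducer 2 only
in the order in which the stores appear in the else arm.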