[Bug tree-optimization/56624] Vectorizer gives up on a group-access if it contains stores to the same location

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Sep 14 12:44:46 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56624

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Michael Zolotukhin from comment #4)
> Sorry, it looks like the reproducer with if could be made, and here it is:
> void foo (long *a)
> {
>   int i;
>   for (i = 0; i < 100; i+=2)
>     {
>       if (a[i] == 0)
>         {
>           a[i+1] = 2;
>           a[i] = 3;
>         }
>       else
>         {
>           a[i+1] = 3;
>           a[i] = 4;
>         }
>     }
> }
> In this example we have:
> group_access2.c:4: note: === vect_analyze_data_ref_accesses ===
> group_access2.c:4: note: READ_WRITE dependence in interleaving.
> group_access2.c:4: note: not vectorized: complicated access pattern.
> group_access2.c:4: note: bad data access.
> group_access2.c:1: note: vectorized 0 loops in function.
> 
> The diagnostic is a bit different, but the root cause is the same, I guess.
> 
> The test is attached (reproducer 2).

We now vectorize this loop (not with plain SSE2 but with SSE4.2 for example):

.L2:
        movq    (%rdi), %xmm0
        movdqa  %xmm2, %xmm4
        addq    $16, %rdi
        punpcklqdq      %xmm0, %xmm0
        pcmpeqq %xmm1, %xmm0
        pblendvb        %xmm0, %xmm3, %xmm4
        movups  %xmm4, -16(%rdi)
        cmpq    %rdi, %rax
        jne     .L2

probably because we now sink the common stores out of the if arms.  Modifying
the testcase as follows, with the two stores in the else arm swapped,
reproduces the original issue again:

void foo (long *a)
{
  int i;
  for (i = 0; i < 100; i+=2)
    {
      if (a[i] == 0)
        {
          a[i+1] = 2;
          a[i] = 3;
        }
      else
        {
          a[i] = 4;
          a[i+1] = 3;
        }
    }
}
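
For illustration, here is a rough source-level sketch of what sinking the
common stores out of the if arms amounts to for reproducer 2.  It is
hand-written, not actual compiler output; the names foo_branchy, foo_sunk
and check_equivalent are made up for this example.  Once the stores are sunk,
each iteration only selects between the pairs {3, 2} and {4, 3} and stores the
result unconditionally, which appears to match the compare, blend and 16-byte
store in the assembly above.

```c
#include <assert.h>
#include <string.h>

#define N 100

/* Reproducer 2 as posted: both arms store a[i+1] first, then a[i]. */
void foo_branchy (long *a)
{
  int i;
  for (i = 0; i < N; i += 2)
    {
      if (a[i] == 0)
        {
          a[i+1] = 2;
          a[i] = 3;
        }
      else
        {
          a[i+1] = 3;
          a[i] = 4;
        }
    }
}

/* Hand-written equivalent with the stores sunk below the branch.
   Only the stored values still depend on the condition.  */
void foo_sunk (long *a)
{
  int i;
  for (i = 0; i < N; i += 2)
    {
      int is_zero = (a[i] == 0);
      long odd = is_zero ? 2 : 3;   /* value for a[i+1] */
      long even = is_zero ? 3 : 4;  /* value for a[i] */
      a[i+1] = odd;
      a[i] = even;
    }
}

/* Run both versions on the same input and check that they agree.  */
void check_equivalent (void)
{
  long x[N], y[N];
  int i;
  for (i = 0; i < N; i++)
    x[i] = y[i] = (i % 4 == 0) ? 0 : i;
  foo_branchy (x);
  foo_sunk (y);
  assert (memcmp (x, y, sizeof x) == 0);
}
```

In the modified testcase the two arms store in different orders, so there is
no common trailing store to sink in the same way, which would explain why the
original failure shows up again.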