This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug c/67167] New: cilkplus vectorization problems


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67167

            Bug ID: 67167
           Summary: cilkplus vectorization problems
           Product: gcc
           Version: 5.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marcin.krotkiewski at gmail dot com
  Target Milestone: ---

I think there is a problem with the vectorization of arithmetic operations in
the Cilk Plus implementation in gcc. I inspected the assembly generated for the
following two implementations of vector addition (a = a + b). The code was
compiled with
'gcc -O3 -mavx -ftree-vectorize -fopt-info-vec -fcilkplus test.c'.


// ICC compatibility - alignment hint
#ifdef __GNUC__

#define __assume_aligned(lvalueptr, align) lvalueptr = __builtin_assume_aligned(lvalueptr, align)

#endif
#define RESTRICT __restrict__

typedef double Double;

void test(Double * RESTRICT a, Double * RESTRICT b, int size)
{
  int i;

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  for(i=0; i<size; i++)
    a[i] = a[i] + b[i];

}


void test_cilkplus1(Double * RESTRICT a, Double * RESTRICT b, int size)
{

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  a[0:size] = a[0:size] + b[0:size];

}


The first function (test) is vectorized as expected. Here is the generated
assembly:

.L4:
        vmovapd (%rdi,%r8), %ymm0
        addl    $1, %r9d
        vaddpd  (%rsi,%r8), %ymm0, %ymm0
        vmovapd %ymm0, (%rdi,%r8)
        addq    $32, %r8
        cmpl    %r9d, %ecx
        ja      .L4


In contrast, the second function (test_cilkplus1) is not vectorized:

.L21:
        vmovsd  (%rdi,%rax), %xmm0
        movl    %ecx, %r8d
        addl    $1, %ecx
        vaddsd  (%rsi,%rax), %xmm0, %xmm0
        vmovsd  %xmm0, (%rdi,%rax)
        addq    $8, %rax
        cmpl    %r8d, %edx
        jg      .L21


I have made sure that the compiler knows that there is no aliasing (restrict)
and that the vectors are aligned in memory. Clearly this is enough for the
plain loop implementation to be vectorized, but not for the Cilk Plus array
notation.
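
For anyone trying to reproduce this, the alignment hint is a promise that the
caller has to keep; the report does not show the calling code. The following is
only a minimal sketch of a caller, assuming C11 aligned_alloc is available. The
names add_aligned, alloc_aligned64 and run_check are hypothetical and not part
of the original test case. It uses the plain-loop variant only, so it builds
without -fcilkplus.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Plain-loop variant from the report, with the hint macro written out. */
void add_aligned(double *restrict a, const double *restrict b, int size)
{
    /* Promise to the compiler: both pointers are 64-byte aligned. */
    a = __builtin_assume_aligned(a, 64);
    b = __builtin_assume_aligned(b, 64);

    for (int i = 0; i < size; i++)
        a[i] = a[i] + b[i];
}

/* Hypothetical helper: allocate room for n doubles on a 64-byte boundary. */
double *alloc_aligned64(size_t n)
{
    size_t bytes = n * sizeof(double);

    /* Round the size up to a multiple of the alignment (at least 64),
       since aligned_alloc may otherwise be rejected. */
    bytes = (bytes + 63) / 64 * 64;
    if (bytes == 0)
        bytes = 64;
    return aligned_alloc(64, bytes);
}

/* Hypothetical driver: returns the number of wrong elements,
   or -1 if allocation failed. */
int run_check(int n)
{
    double *a = alloc_aligned64((size_t)n);
    double *b = alloc_aligned64((size_t)n);
    int bad = 0;

    if (!a || !b) {
        free(a);
        free(b);
        return -1;
    }

    /* Verify the alignment promise before relying on it. */
    assert((uintptr_t)a % 64 == 0);
    assert((uintptr_t)b % 64 == 0);

    for (int i = 0; i < n; i++) {
        a[i] = (double)i;
        b[i] = 2.0 * i;
    }

    add_aligned(a, b, n);

    /* i + 2i = 3i is exact in double for these small values. */
    for (int i = 0; i < n; i++)
        if (a[i] != 3.0 * i)
            bad++;

    free(a);
    free(b);
    return bad;
}
```

Calling run_check(1000) from a small main() should return 0 if the addition is
correct. Passing buffers that are not actually 64-byte aligned would make the
aligned loads (vmovapd) in the vectorized loop invalid.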

