This is the mail archive of the gcc-help@gcc.gnu.org mailing list for the GCC project.



cilkplus vectorization problems


Hello, everyone

I have recently started to use the CilkPlus capabilities of GCC 5, but I cannot really grasp the vectorization part. Either I am doing something wrong, or there is a bug in GCC. I have inspected the generated assembly for the following two implementations of vector addition (a = a + b). The code is compiled with 'gcc -O3 -mavx -ftree-vectorize -fopt-info-vec -fcilkplus test.c'.

// ICC compatibility - alignment hint
#ifdef __GNUC__
#define __assume_aligned(lvalueptr, align) lvalueptr = __builtin_assume_aligned (lvalueptr, align)
#endif
#define RESTRICT __restrict__

---------------
usual C implementation
---------------

void test(Double * RESTRICT a, Double * RESTRICT b, int size)
{
  int i;

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  for(i=0; i<size; i++)
    a[i] = a[i] + b[i];

}
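
For anyone who wants to reproduce the plain C case, here is a minimal self-contained sketch. It makes a few assumptions that are not in the snippet above: Double is taken to be a plain typedef for double (its definition is not shown), the alignment hint is written out directly instead of going through the macro, and check_test is a hypothetical helper that allocates 64-byte-aligned buffers with C11 aligned_alloc and verifies the result. It does not use -fcilkplus, so it should build with any reasonably recent gcc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: the definition of Double is not shown in the post;
   a plain typedef for double is used here. */
typedef double Double;

#define RESTRICT __restrict__

void test(Double * RESTRICT a, Double * RESTRICT b, int size)
{
  int i;

  /* Same alignment hint as the __assume_aligned macro, written out. */
  a = __builtin_assume_aligned(a, 64);
  b = __builtin_assume_aligned(b, 64);

  for (i = 0; i < size; i++)
    a[i] = a[i] + b[i];
}

/* Hypothetical helper, not part of the original code.
   Returns 0 if test() yields a[i] + b[i] for every element,
   1 on a mismatch, and -1 if allocation fails. */
int check_test(int size)
{
  size_t bytes;
  Double *a, *b;
  int i, result = 0;

  assert(size >= 0);

  /* aligned_alloc expects a byte count that is a multiple of the
     alignment, so round up to the next multiple of 64. */
  bytes = ((size_t)size * sizeof(Double) + 63) / 64 * 64;
  if (bytes == 0)
    bytes = 64;

  a = aligned_alloc(64, bytes);
  b = aligned_alloc(64, bytes);
  if (a == NULL || b == NULL) {
    free(a);
    free(b);
    return -1;
  }

  /* Small integers are exactly representable, so the comparison
     below can be exact. */
  for (i = 0; i < size; i++) {
    a[i] = (Double)i;
    b[i] = (Double)(2 * i);
  }

  test(a, b, size);

  for (i = 0; i < size; i++)
    if (a[i] != (Double)(3 * i))
      result = 1;

  free(a);
  free(b);
  return result;
}
```

Calling check_test(1000) from a small main and checking for a return value of 0 is enough to confirm the buffers really satisfy the 64-byte alignment that the hint promises.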

---------------
CilkPlus array notation
---------------

void test_cilkplus1(Double * RESTRICT a, Double * RESTRICT b, int size)
{

  __assume_aligned(a, 64);
  __assume_aligned(b, 64);

  a[0:size] = a[0:size] + b[0:size];

}


The first function (test) is vectorized as expected. Here is the generated assembly for its main loop:

.L4:
        vmovapd (%rdi,%r8), %ymm0
        addl    $1, %r9d
        vaddpd  (%rsi,%r8), %ymm0, %ymm0
        vmovapd %ymm0, (%rdi,%r8)
        addq    $32, %r8
        cmpl    %r9d, %ecx
        ja      .L4


In contrast, the second function (test_cilkplus1) is not vectorized:

.L21:
        vmovsd  (%rdi,%rax), %xmm0
        movl    %ecx, %r8d
        addl    $1, %ecx
        vaddsd  (%rsi,%rax), %xmm0, %xmm0
        vmovsd  %xmm0, (%rdi,%rax)
        addq    $8, %rax
        cmpl    %r8d, %edx
        jg      .L21


I have made sure that the compiler knows there is no aliasing (restrict) and that the arrays are aligned in memory. Clearly this is enough for the standard loop to be vectorized, but not for the CilkPlus array notation.

Is this a bug, or am I missing something?

Thanks a lot!

Marcin

