This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/84234] New: #pragma omp declare simd is ignored


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84234

            Bug ID: 84234
           Summary: #pragma omp declare simd is ignored
           Product: gcc
           Version: 7.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gcc.account at lemaitre dot re
  Target Milestone: ---

Created attachment 43344
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43344&action=edit
Simple example showing the bug: gcc -O3 -fopenmp-simd

When I try to use #pragma omp declare simd on a forward declaration, it seems
to be ignored during vectorization at the call site.

For example:
#pragma omp declare simd
float add2(float a, float b);
void ADD2() {
  for (int i = 0; i < 1024; i++) {
    A[i] = add2(A[i], B[i]);
  }
}

is compiled into:
ADD2:
.LFB2:
  .cfi_startproc
  pushq %rbx
  .cfi_def_cfa_offset 16
  .cfi_offset 3, -16
  xorl  %ebx, %ebx
  .p2align 4,,10
  .p2align 3
.L19:
  movss B(%rbx), %xmm1
  addq  $4, %rbx
  movss A-4(%rbx), %xmm0
  call  add2
  movss %xmm0, A-4(%rbx)
  cmpq  $4096, %rbx
  jne .L19
  popq  %rbx
  .cfi_def_cfa_offset 8
  ret
  .cfi_endproc


whereas
#pragma omp declare simd
float __attribute((noinline)) add1(float a, float b) {
  return a+b;
}
void ADD1() {
  for (int i = 0; i < 1024; i++) {
    A[i] = add1(A[i], B[i]);
  }
}

is compiled into:
ADD1:
.LFB1:
  .cfi_startproc
  pushq %rbx
  .cfi_def_cfa_offset 16
  .cfi_offset 3, -16
  xorl  %ebx, %ebx
  .p2align 4,,10
  .p2align 3
.L15:
  movaps  A(%rbx), %xmm0
  addq  $16, %rbx
  movaps  B-16(%rbx), %xmm1
  call  _ZGVbN4vv_add1
  movaps  %xmm0, A-16(%rbx)
  cmpq  $4096, %rbx
  jne .L15
  popq  %rbx
  .cfi_def_cfa_offset 8
  ret
  .cfi_endproc

When the function has no definition in the translation unit, the compiler
doesn't call the vectorized variant of the function.
The same happens if the definition is given but the symbol is defined as weak.

This is really annoying, as we have to put the definition of such a function
in the same translation unit as every call site that should be vectorized,
with all the problems that this can cause.
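
A minimal self-contained sketch of that workaround, with the definition
visible at the call site. The declarations of A and B are an assumption: they
are not shown above, and the 4096-byte loop bound in the assembly only suggests
global arrays of 1024 floats. N and fill_inputs() are hypothetical names added
so that the result can be checked.

```c
#define N 1024

/* Assumed declarations: global float arrays, sized from the loop bound. */
float A[N], B[N];

/* The definition is in the same translation unit as the call site below. */
#pragma omp declare simd
float __attribute__((noinline)) add1(float a, float b) {
  return a + b;
}

/* Call site from the working case above. */
void ADD1(void) {
  for (int i = 0; i < N; i++) {
    A[i] = add1(A[i], B[i]);
  }
}

/* Hypothetical helper: fills the inputs with exactly representable values. */
void fill_inputs(void) {
  for (int i = 0; i < N; i++) {
    A[i] = (float)i;
    B[i] = 2.0f * (float)i;
  }
}
```

Built with gcc -O3 -fopenmp-simd as above, this is the layout for which the
call to the vector variant was generated on the x86 versions I tested. Without
-fopenmp-simd the pragma is ignored and the loop simply calls add1.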

This bug is present on all gcc versions I tested, namely: GCC 4.9 x86, GCC 5.5
x86, GCC 6.4 x86, GCC 7.3 x86 and GCC trunk x86 (from godbolt.org).
On other architectures, the pragma seems to be always ignored, even when a
definition is available (GCC 7.2 ARM, GCC 6.3 AARCH64, GCC 6.3 PPC64).

For information, this works as expected on ICC.
