[Bug ipa/81450] New: Typedef with assume aligned builtin yields segmentation fault in nested loop

philipp.kopp at tum dot de gcc-bugzilla@gcc.gnu.org
Fri Jul 14 19:26:00 GMT 2017


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81450

            Bug ID: 81450
           Summary: Typedef with assume aligned builtin yields
                    segmentation fault in nested loop
           Product: gcc
           Version: 6.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: philipp.kopp at tum dot de
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Created attachment 41762
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41762&action=edit
Small example with the working and the failing version

Hi,

I found that the assume aligned built-in causes a segmentation fault in nested
loops when it is used with a typedef (or the C++11 using keyword). I attached a
small example. I am using Ubuntu 17.04 with the standard gcc 6.3.0 from the
Ubuntu archives and compile with gcc -O3 -g, so SSE vectorization is enabled;
without it the code runs fine.

The core part where execution fails is the following:

>  typedef double __attribute__((aligned (32))) AlignedDouble;

>  const size_t size = 17;
>  double alpha = 1.0 / 3.0;

>  // uses posix_memalign (see full testcase)
>  AlignedDouble* A = aligned_doubles( size * size );

>  for( size_t i = 0; i < size; ++i )
>  {
>    printf( "i = %zu\n", i );
>    for( size_t j = 0; j < size; ++j )
>    {
>      A[j + i * size] += alpha;
>    }
>  }
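
The helper aligned_doubles is only described above as using posix_memalign; its
actual code is in the attachment and is not reproduced here. A minimal sketch of
what such a helper might look like, assuming a 32-byte alignment to match the
typedef (the names aligned_doubles_sketch and is_aligned and the plain double*
return type are hypothetical, not taken from the test case):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>   // posix_memalign comes from <stdlib.h> on POSIX systems

// Hypothetical stand-in for the report's aligned_doubles(): returns storage
// for `count` doubles whose address is a multiple of 32, or nullptr on failure.
double* aligned_doubles_sketch(std::size_t count)
{
    void* p = nullptr;
    if (posix_memalign(&p, 32, count * sizeof(double)) != 0)
        return nullptr;
    return static_cast<double*>(p);
}

// True if the address p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

For a buffer A obtained this way and size = 17, is_aligned(A, 32) holds, but
is_aligned(A + 17, 16) does not on a platform with 8-byte doubles, because the
second row starts 136 bytes after the base. The buffer is released with free().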

I checked the disassembly and found that the loop over j runs to completion
once and fails the second time: a temporary pointer to A is incremented by i *
size, which yields an unaligned pointer if size % 2 != 0 (or size % 4 != 0 with
AVX2). However, if the alignment is used directly in the definition of A,
without a typedef, the result is different. According to the output of
-fopt-info-loop, the loop is peeled for alignment in that case.

With typedef:
>  gcc -O3 -g main.cpp -o main -fopt-info-loop
>  main.cpp:40:26: note: loop vectorized
>  main.cpp:40:26: note: loop turned into non-loop; it never loops
>  main.cpp:15:5: note: loop turned into non-loop; it never loops.
>  main.cpp:15:5: note: loop with 8 iterations completely unrolled

Without typedef:
>  gcc -O3 -g main.cpp -o main -fopt-info-loop
>  main.cpp:30:26: note: loop vectorized
>  main.cpp:30:26: note: loop peeled for vectorization to enhance alignment
>  main.cpp:30:26: note: loop turned into non-loop; it never loops
>  main.cpp:15:5: note: loop turned into non-loop; it never loops.
>  main.cpp:15:5: note: loop with 9 iterations completely unrolled
>  main.cpp:15:5: note: loop turned into non-loop; it never loops

In the attached test case you will find, for both scenarios, the code, the
binary, an objdump, and the outputs of -fopt-info-loop and -fopt-info-all. In
the objdump I marked the increment of A as well as the instruction where the
segmentation fault happens.

Please let me know if more info is needed. Thanks a lot for your time!

Best wishes,
Philipp Kopp

