[Bug target/102856] New: [nvptx] Misaligned accesses with cheap vectorization enabled

jules at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Oct 20 12:21:48 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102856

            Bug ID: 102856
           Summary: [nvptx] Misaligned accesses with cheap vectorization
                    enabled
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jules at gcc dot gnu.org
  Target Milestone: ---

Since revision 2b8453c401b699ed93c085d0413ab4b5030bcdb8 I am seeing several
OpenMP tests fail with misaligned access errors:

PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-11.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-12.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-16.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-3.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-5.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-6.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-9.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-11.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-12.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-16.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-3.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-5.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-6.c execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-9.c execution test

Each failure looks like this, e.g.:

$ ./for-11.exe 

libgomp: cuCtxSynchronize error: misaligned address

libgomp: cuMemFree_v2 error: misaligned address

libgomp: device finalization failed

I suspect the reason is that an access that is now being vectorized (e.g.
"st.v2.u64 [%frame], %r28;") requires higher alignment than the original scalar
accesses it replaces: a v2.u64 store must be 16-byte aligned, whereas each
scalar u64 store only needs 8-byte alignment.
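
To illustrate the suspected failure mode, here is a minimal host-side sketch in
C. It is not taken from the nvptx backend or from the failing tests; the helper
names (is_aligned, required_alignment, merge_breaks_alignment) are made up for
illustration. It only encodes the PTX rule that a vector access must be aligned
to its total size, and checks whether an address that is fine for two scalar
stores would be misaligned for the merged vector store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if ADDR is a multiple of ALIGN; ALIGN must be a power of two. */
static bool is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Alignment an access needs under the PTX rule: element size times
   vector width (e.g. st.u64 -> 8 bytes, st.v2.u64 -> 16 bytes). */
static size_t required_alignment(size_t elem_size, size_t vector_width)
{
    return elem_size * vector_width;
}

/* True if two scalar stores at ADDR and ADDR + ELEM_SIZE are both
   correctly aligned, but a single merged 2-element vector store at
   ADDR would be misaligned. */
static bool merge_breaks_alignment(uintptr_t addr, size_t elem_size)
{
    size_t scalar_align = required_alignment(elem_size, 1);
    size_t vector_align = required_alignment(elem_size, 2);

    bool scalars_ok = is_aligned(addr, scalar_align)
                      && is_aligned(addr + elem_size, scalar_align);
    bool vector_ok = is_aligned(addr, vector_align);

    return scalars_ok && !vector_ok;
}
```

For example, with a hypothetical frame address of 0x1008 and 8-byte elements,
merge_breaks_alignment returns true: both scalar stores are legal, but the
merged 16-byte store is not. With 0x1000 it returns false.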

I haven't spotted an obvious culprit for the problem in the nvptx backend. This
is OpenMP, so it could be the soft stack handling -- or it could be something
else.

