[Bug target/102856] New: [nvptx] Misaligned accesses with cheap vectorization enabled
jules at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Oct 20 12:21:48 GMT 2021
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102856
Bug ID: 102856
Summary: [nvptx] Misaligned accesses with cheap vectorization
enabled
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jules at gcc dot gnu.org
Target Milestone: ---
Since revision 2b8453c401b699ed93c085d0413ab4b5030bcdb8 I am seeing several
OpenMP tests fail with misaligned access errors:
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-11.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-12.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-16.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-3.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-5.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-6.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c++/../libgomp.c-c++-common/for-9.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-11.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-12.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-16.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-3.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-5.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-6.c
execution test
PASS -> FAIL: nvidia-1/libgomp.sum:libgomp.c/../libgomp.c-c++-common/for-9.c
execution test
These look like, e.g.:
$ ./for-11.exe
libgomp: cuCtxSynchronize error: misaligned address
libgomp: cuMemFree_v2 error: misaligned address
libgomp: device finalization failed
I suspect the reason is that an operation that is now being vectorized (e.g.
"st.v2.u64 [%frame], %r28;") requires higher alignment than the original scalar
accesses it replaces.
I haven't spotted an obvious culprit for the problem in the nvptx backend. This
is OpenMP, so it could be the soft stack handling -- or it could be something
else.
More information about the Gcc-bugs
mailing list