[Bug libgomp/83589] [nvptx] mode-transitions.c and private-variables.{c,f90} execution FAILs at GOMP_NVPTX_JIT=-O0
vries at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Sat Jan 20 07:33:00 GMT 2018
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83589
--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #2)
> This looks like another nvidia driver problem (with driver version 384.111).
Confirmed.
The empty branch
> @ %r36 bra $L5;
> $L5:
is translated into:
...
/*0128*/ @P0 BRA `(.L_1);
.L_1:
...
so there is no SYNC after the branch, and no SSY before it.
Consequently, when executing the shfl.idx a bit later:
...
/*0158*/ SHFL.IDX PT, R0, R0, RZ, 0x1f;
/*0168*/ SHFL.IDX PT, R2, R2, RZ, 0x1f;
...
we are still in divergent mode, so a lane may read from an inactive lane, and we get undefined results.
Inserting some sort of nop in the branched-around part:
...
@ %r36 bra $L5;
{
.reg .u32 %nop_src;
.reg .u32 %nop_dst;
mov.u32 %nop_dst, %nop_src;
}
$L5:
...
makes the test pass, because then we generate:
...
/*0128*/ SSY `(.L_1);
/*0130*/ @P0 SYNC (*"TARGET= .L_1 "*);
/*0138*/ SYNC (*"TARGET= .L_1 "*);
.L_1:
...
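The workaround above can be sketched as a text-level pass over PTX. This is only an illustration, not how GCC's nvptx backend does it or would do it: the function name pad_empty_branches and the two regular expressions are hypothetical, and the pass only recognizes a branch whose target label is on the very next line.

```python
import re

# Hypothetical patterns, written for this sketch only.
# Branch, optionally predicated: "@ %r36 bra $L5;" or "bra.uni $L5;".
BRANCH_RE = re.compile(r'^\s*(?:@\s*!?%\w+\s+)?bra(?:\.uni)?\s+(\$?\w+)\s*;\s*$')
# Label definition: "$L5:".
LABEL_RE = re.compile(r'^\s*(\$?\w+)\s*:\s*$')

# The nop from the comment above, copied verbatim.
NOP_BLOCK = [
    "{",
    ".reg .u32 %nop_src;",
    ".reg .u32 %nop_dst;",
    "mov.u32 %nop_dst, %nop_src;",
    "}",
]


def pad_empty_branches(ptx_lines):
    """Return a copy of ptx_lines with the nop block inserted after every
    branch that jumps to the label on the immediately following line."""
    out = []
    for i, line in enumerate(ptx_lines):
        out.append(line)
        branch = BRANCH_RE.match(line)
        if branch is None or i + 1 >= len(ptx_lines):
            continue
        label = LABEL_RE.match(ptx_lines[i + 1])
        # Pad only when the branched-around region is empty.
        if label is not None and label.group(1) == branch.group(1):
            out.extend(NOP_BLOCK)
    return out


if __name__ == "__main__":
    for ptx_line in pad_empty_branches(["@ %r36 bra $L5;", "$L5:"]):
        print(ptx_line)
```

Run on the two-line empty branch quoted above, it reproduces the padded sequence shown in this comment. Branches with a non-empty body are left untouched.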