This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, 2/2][nvptx, PR83589] Workaround for branch-around-nothing JIT bug


On Wed, 24 Jan 2018, Tom de Vries wrote:

> Hi,
> 
> this patch adds a workaround for the nvptx target JIT bug PR83589 - "[nvptx]
> mode-transitions.c and private-variables.{c,f90} execution FAILs at
> GOMP_NVPTX_JIT=-O0".
> 
> 
> When compiling a branch-around-nothing (where the branch is warp neutering, so
> it's a divergent branch):
> ...
>   .reg .pred %r36;
>   {
>     .reg .u32 %x;
>     mov.u32 %x,%tid.x;
>     setp.ne.u32 %r36,%x,0;
>   }
> 
>   @ %r36 bra $L5;
>   $L5:
> ...
> 
> The JIT fails to generate a convergence point here:
> ...
>          /*0128*/               @P0 BRA `(.L_1);
> .L_1:
> ...
> 
> Consequently, we execute subsequent code in divergent mode, and when executing
> a shfl.idx a bit later we run into the undefined behaviour that shfl.idx has
> when executing in divergent mode.
> 
> The workaround detects branch-around-nothing, and inserts a ptx operation that
> does nothing (I'm calling it a fake nop, I haven't been able to come up with a
> better term yet):
> ...
>   @ %r36 bra $L5;
>     {
>       .reg .u32 %nop_src;
>       .reg .u32 %nop_dst;
>       mov.u32 %nop_dst, %nop_src;
>     }
>   $L5:
> ...
> which makes the test pass, because then we generate a convergence point here
> at .L_1:
> ...
>         /*0128*/                   SSY `(.L_1);
>         /*0130*/               @P0 SYNC (*"TARGET= .L_1 "*);
>         /*0138*/                   SYNC (*"TARGET= .L_1 "*);
> .L_1:
> ...
> 
> The workaround is not minimal given that it inserts the fake nop in all
> branch-around-nothings it detects, not just the warp neutering ones, but I
> think this is more robust than trying to identify the warp neutering branches.
> Furthermore, I'm not going for optimality here anyway. The optimal way to fix
> this is making sure we don't generate branch-around-nothing, but that's for
> stage1.
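
As a rough illustration of the detection step described above (a conditional
branch whose target label follows it immediately), here is a minimal sketch
that works on PTX text. It is not the patch: the actual workaround lives in
the GCC nvptx backend and operates on insns rather than text. The names
insert_fake_nops, BRANCH_RE, LABEL_RE and FAKE_NOP are hypothetical, and the
regular expressions only assume the branch and label shapes shown in the
snippets above.

```python
import re

# Hypothetical sketch: matches predicated branches such as "@ %r36 bra $L5;".
BRANCH_RE = re.compile(r'^\s*@\s*!?%\w+\s+bra\s+(\$\w+);\s*$')
# Matches a label on its own line, such as "$L5:".
LABEL_RE = re.compile(r'^\s*(\$\w+):\s*$')

# The "fake nop" sequence, copied from the example in the message above.
FAKE_NOP = [
    "    {",
    "      .reg .u32 %nop_src;",
    "      .reg .u32 %nop_dst;",
    "      mov.u32 %nop_dst, %nop_src;",
    "    }",
]


def insert_fake_nops(ptx_lines):
    """Return a copy of ptx_lines with a fake nop after every
    branch-around-nothing, i.e. a predicated branch whose target label is
    the next non-blank line."""
    out = []
    n = len(ptx_lines)
    for i, line in enumerate(ptx_lines):
        out.append(line)
        branch = BRANCH_RE.match(line)
        if not branch:
            continue
        # Find the next non-blank line after the branch.
        j = i + 1
        while j < n and not ptx_lines[j].strip():
            j += 1
        if j < n:
            label = LABEL_RE.match(ptx_lines[j])
            # Only treat it as branch-around-nothing if the label is the
            # branch target itself.
            if label and label.group(1) == branch.group(1):
                out.extend(FAKE_NOP)
    return out


if __name__ == "__main__":
    example = ["  @ %r36 bra $L5;", "  $L5:"]
    print("\n".join(insert_fake_nops(example)))
```

Like the workaround itself, this sketch treats every branch-around-nothing
the same way and makes no attempt to single out the warp neutering ones.
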
> 
> Build and reg-tested on x86_64 with nvptx accelerator.
> 
> I'd like to commit in stage4, but I'd appreciate a review of the code. Does
> the patch look OK?

Ok for stage4, but this isn't a review ;)

Richard.

> Thanks,
> - Tom
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

