[og7, nvptx, openacc, PR85381, committed] Don't emit barriers for empty loops

Tom de Vries Tom_deVries@mentor.com
Sat Apr 21 10:37:00 GMT 2018


Hi,

when compiling this testcase with the og7 branch:
...
int
main (void)
{
   long long v1;
#pragma acc parallel num_gangs (640) num_workers(1) vector_length (128)
#pragma acc loop
   for (v1 = 0; v1 < 20; v1 += 2)
     ;

   return 0;
}
...

this ptx is generated:
...
{
   // fork 4; 

   bar.sync 0;
   // forked 4; 

   // joining 4; 

   bar.sync 0;
   // join 4; 

   ret;
}
...

This triggers some bug on my Quadro M1200 (I'm assuming in the ptxas/JIT
compiler) that hangs the testcase. I can work around this by adding a
membar.cta before the bar.sync, or two membar.ctas in between, but I'm not
really sure what a minimal workaround should look like (I reported the
bug to NVIDIA, and I'm hoping they will answer that question).

This patch works around the bug by doing an optimization: we detect that 
this is an empty loop (a forked immediately followed by a joining), and 
don't emit the barriers.
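
As a rough illustration of the check described above (this is not the code
from the attached patch), the following self-contained C sketch models the
marker sequence as an array and only counts barriers for regions that have
a body. The names marker, empty_region_p and count_barriers are
hypothetical and do not correspond to identifiers in the nvptx backend; the
marker kinds are simply named after the comments in the ptx above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker kinds, named after the comments in the ptx output.
   OTHER stands for any real instruction inside the region.  */
enum marker { FORK, FORKED, JOINING, JOIN, OTHER };

/* Return true if the FORKED marker at index I is immediately followed by
   a JOINING marker, i.e. the partitioned region has no body.  */
static bool
empty_region_p (const enum marker *insns, size_t n, size_t i)
{
  assert (i < n);
  return insns[i] == FORKED && i + 1 < n && insns[i + 1] == JOINING;
}

/* Count the barriers a simplified emitter would produce: two per region
   (one at the fork, one at the join), except that regions detected as
   empty get none.  */
static int
count_barriers (const enum marker *insns, size_t n)
{
  int barriers = 0;

  for (size_t i = 0; i < n; i++)
    if (insns[i] == FORKED && !empty_region_p (insns, n, i))
      barriers += 2;

  return barriers;
}
```

With the sequence from the testcase (fork, forked, joining, join) this
sketch counts no barriers, while a region with at least one instruction
between forked and joining still gets both.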

Built x86_64 with nvptx accelerator and tested libgomp.

Committed to og7 branch.

Thanks,
- Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-nvptx-openacc-Don-t-emit-barriers-for-empty-loops.patch
Type: text/x-patch
Size: 6458 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20180421/f6072d08/attachment.bin>

