This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/83920] New: [nvptx] bad predicate reset


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83920

            Bug ID: 83920
           Summary: [nvptx] bad predicate reset
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cesar at gcc dot gnu.org
  Target Milestone: ---

Created attachment 43164
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43164&action=edit
gemm test case

Here <https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00516.html> Tom posted a
patch to work around a PTX JIT bug. However, the workaround may clobber a live
predicate register. Consider the offloaded code for the attached gemm.f90
(built using trunk with -fopenacc -O3; the underlying problem is also present
in og7 and impacts da-1.c).

The nvptx compiler generates:

$L35:
                selp.u32        %r348, 1, 0, %r191;
                shfl.idx.b32    %r348, %r348, 0, 31;
                setp.ne.u32     %r191, %r348, 0;
        @%r191  bra.uni $L2;
        @%r341  bra     $L34;
                mov.u32 %r155, %r61;
                shl.b64 %r158, %r43, 5;
                cvt.s64.s32     %r192, %r155;
                add.u64 %r161, %r192, 1;
                add.u32 %r193, %r53, -1;
                cvt.u64.u32     %r166, %r193;
                add.u64 %r194, %r41, 2;
                add.u64 %r195, %r194, %r166;
                mad.lo.u64      %r156, %r43, %r161, %r195;
                shl.b64 %r169, %r39, 5;
                mad.lo.u64      %r167, %r161, %r39, %r37;
                setp.eq.f32     %r266, %r49, 0f00000000;
                setp.le.s32     %r267, %r53, 0;
                add.u32 %r270, %r55, -1;
                mov.f32 %r271, 0f00000000;
                setp.eq.f32     %r272, %r49, 0f3f800000;
$L34:
$L11:
                setp.eq.u32     %r266, 1, 0;
        @%r341  bra     $L33;
$L33:
                selp.u32        %r347, 1, 0, %r266;
                shfl.idx.b32    %r347, %r347, 0, 31;
                setp.ne.u32     %r266, %r347, 0;
        @!%r266 bra.uni $L22;
                bra     $L3;
$L12:

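Note how %r266 is defined in block $L35 (setp.eq.f32 %r266, %r49, 0f00000000),
but then gets clobbered by the reset "setp.eq.u32 %r266, 1, 0" that follows
$L34/$L11, right before the broadcast in block $L33 reads it.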

This corresponds to the case where beta == 0 in gemm.f90. I think there
might be other PTX JIT bugs lurking here, because the test program still works
as intended.
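
To make the clobber easier to see, here is a rough Python model of a single
32-lane warp stepping through the listing above. It is only a sketch under
several assumptions: that %r341 is true for every lane except lane 0, that
%r49 holds beta, and that resetting the predicate before the single-lane block
would be a safe placement. The function names and the reset_after_definition
flag are made up for illustration and do not come from the nvptx backend.

```python
# Toy model of one 32-lane warp. Register names follow the PTX listing;
# the helper names and the "safe" ordering are hypothetical.
WARP_SIZE = 32


def broadcast_from_lane0(pred):
    """Model selp.u32 / shfl.idx.b32 ..., 0, 31 / setp.ne.u32:
    every lane ends up with lane 0's predicate value."""
    return [pred[0]] * len(pred)


def run(beta, reset_after_definition):
    # Assumption: %r341 is true in all lanes except lane 0, so only
    # lane 0 falls through into the block that defines %r266.
    r341 = [lane != 0 for lane in range(WARP_SIZE)]
    r266 = [None] * WARP_SIZE  # None means "never written"

    if not reset_after_definition:
        # Assumed safe placement: initialize before the definition.
        r266 = [False] * WARP_SIZE

    for lane in range(WARP_SIZE):
        if not r341[lane]:
            # setp.eq.f32 %r266, %r49, 0f00000000 (assuming %r49 is beta)
            r266[lane] = (beta == 0.0)

    if reset_after_definition:
        # Placement seen in the listing: setp.eq.u32 %r266, 1, 0 sits
        # after $L34/$L11, so every lane runs it, including lane 0.
        r266 = [1 == 0] * WARP_SIZE

    # $L33: broadcast lane 0's value to the whole warp.
    return broadcast_from_lane0(r266)


if __name__ == "__main__":
    print(run(0.0, reset_after_definition=True)[0])   # ordering from the listing
    print(run(0.0, reset_after_definition=False)[0])  # assumed safe ordering
```

In this model, with beta == 0 the ordering from the listing broadcasts false
to every lane, whereas resetting before the definition preserves lane 0's
result. The model says nothing about what the PTX JIT actually does with the
real code.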
