This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug c/71805] New: incorrect code for test pr45752.c with -mcpu=power9


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71805

            Bug ID: 71805
           Summary: incorrect code for test pr45752.c with -mcpu=power9
           Product: gcc
           Version: 6.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acsawdey at gcc dot gnu.org
                CC: bergner at gcc dot gnu.org, meissner at gcc dot gnu.org,
                    wschmidt at gcc dot gnu.org
  Target Milestone: ---
            Target: powerpc64le-linux

Created attachment 38859
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38859&action=edit
objdump of the generated binary, plus my annotations, which are summarized in
the note below

Compiling testsuite/gcc.dg/vect/pr45752.c with -mcpu=power9 produces code in
which a register value that is still needed appears to be overwritten.

Compile flags:

/home/sawdey/src/gcc/gcc-6-branch/build/gcc/xgcc
-B/home/sawdey/src/gcc/gcc-6-branch/build/gcc/
/home/sawdey/src/gcc/gcc-6-branch/gcc/gcc/testsuite/gcc.dg/vect/pr45752.c 
-mcpu=power9 -Wl,-rpath=/tmp/lib64  -fno-diagnostics-show-caret
-fdiagnostics-color=never  -flto -ffat-lto-objects -maltivec -mpower9-vector
-ftree-vectorize -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details
--param tree-reassoc-width=1  -lm  -o ./pr45752.exe

The compiler is gcc-6-branch 238072 plus bergner's p9 VMX ICE patch and
kelvin's vpermr fix.

The 4th group of 4 results (elements 12 through 15, counting from 0) in output
does not match check_results:
(gdb) p check_results
$24 = {3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254, 12399,
22848, 8174, 307964, 146829, 22009, 32668, 11594, 447564, 202404, 31619}
(gdb) p output
$25 = {3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254, 12399,
22848, 8174, 310424, 178137, 26529, 31036, 11594, 447564, 202404, 31619}
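
A quick way to confirm which elements differ is to compare the two arrays index by index. The sketch below copies the values from the gdb output above; the helper name `mismatches` is mine and is not part of the test case.

```python
# Values copied from the gdb session above.
check_results = [3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254,
                 12399, 22848, 8174, 307964, 146829, 22009, 32668, 11594,
                 447564, 202404, 31619]
output = [3208, 1334, 28764, 35679, 2789, 13028, 4754, 168364, 91254,
          12399, 22848, 8174, 310424, 178137, 26529, 31036, 11594,
          447564, 202404, 31619]


def mismatches(expected, actual):
    """Return the indices at which the two sequences disagree."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


bad = mismatches(check_results, output)
# Group the bad indices into blocks of 4 elements (one 128-bit vector of words).
groups = sorted({i // 4 for i in bad})
print(bad, groups)  # prints: [12, 13, 14, 15] [3]
```

All four mismatches fall in group 3 (counting from 0), which is consistent with a single bad 4-element vector.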

This is my extraction of the dataflow for the incorrect vector:

    10000788:   09 00 e9 f5     lxv     vs47,0(r9)
        << set vs47/v15 from load
    100007b8:   09 00 87 f6     lxv     vs52,0(r7)
        << set vs52/v20 from load
    10000898:   09 01 81 f4     lxv     vs36,256(r1)
        << set vs36/v4 from load
    100008f8:   99 01 61 f7     lxv     vs59,400(r1)
        << set vs59 from load
    10000900:   89 01 01 f4     lxv     vs32,384(r1)
        << set vs32 from load
    10000918:   01 00 e7 f7     lxv     vs31,0(r7)
        << set vs31 from load
    1000094c:   01 00 49 f4     lxv     vs2,0(r9)
        << set vs2 from load
    10000950:   01 00 a7 f5     lxv     vs13,0(r7)
        << set vs13 from load
    10000958:   eb 03 fb 11     vperm   v15,v27,v0,v15
        << set v15/vs47 from v27, v0, v15
    10000988:   8c 22 81 11     vspltw  v12,v4,1
        << set v12/vs44 from v4/vs36
    10000994:   01 00 29 f4     lxv     vs1,0(r9)
        << set vs1 from load
    100009a0:   96 64 ac f2     xxlor   vs21,vs44,vs44
        << set vs21 from vs44/v12
    100009a4:   8c 22 83 11     vspltw  v12,v4,3
        << set v12/vs44 from v4/vs36
    100009c0:   96 64 8c f0     xxlor   vs4,vs44,vs44
        << set vs4 from vs44/v12
    100009cc:   91 ac b5 f1     xxlor   vs45,vs21,vs21
        << set vs45/v13 from vs21
    100009d0:   91 fc df f1     xxlor   vs46,vs31,vs31
        << set vs46/v14 from vs31
    100009f0:   96 7c af f0     xxlor   vs5,vs47,vs47
        << set vs5 from vs47/v15
    100009fc:   89 70 ed 10     vmuluwm v7,v13,v14
        << set v7/vs39 from v13, v14
    10000a08:   91 14 a2 f1     xxlor   vs45,vs2,vs2
        << set vs45/v13 from vs2
    10000a28:   91 24 84 f1     xxlor   vs44,vs4,vs4
        << set v12/vs44 from vs4
    10000a2c:   89 68 8c 11     vmuluwm v12,v12,v13
        << set v12/vs44 from v12, v13
    10000a3c:   96 64 8c f0     xxlor   vs4,vs44,vs44
        << set vs4 from vs44/v12
    10000a40:   f9 00 81 f5     lxv     vs44,240(r1)
        << set vs44/v12 from load
    10000a44:   8c 62 c0 11     vspltw  v14,v12,0
        << set v14/vs46 from v12/vs44
    10000aa0:   d4 68 5a f1     xxperm  vs10,vs58,vs13
        << set vs10 from vs58, vs13
    10000aa4:   8c 22 40 13     vspltw  v26,v4,0
        << set v26/vs58 from v4/vs36
    10000acc:   01 00 c7 f7     lxv     vs30,0(r7)
        << set vs30 from load
    10000b08:   91 0c 81 f1     xxlor   vs44,vs1,vs1
        << set vs44/v12 from vs1
    10000b0c:   89 60 ce 11     vmuluwm v14,v14,v12
        << set v14/vs46 from v14, v12
    10000b20:   8c 22 a2 11     vspltw  v13,v4,2
        << set v13/vs45 from v4/vs36
    10000b24:   96 6c 8d f3     xxlor   vs28,vs45,vs45
        << set vs28 from vs45/v13
    10000b40:   91 2c 85 f1     xxlor   vs44,vs5,vs5
        << set vs44/v12 from vs5
    10000b44:   89 a0 8c 12     vmuluwm v20,v12,v20
        << set v20/vs52 from v12 and v20
    10000b5c:   01 00 a9 f5     lxv     vs13,0(r9)
        << set vs13 from load
    10000b94:   89 68 ac 11     vmuluwm v13,v12,v13
        << v13/vs45 set here to be written over?
    10000b98:   91 e4 9c f1     xxlor   vs44,vs28,vs28
        << set vs44/v12 from vs28
    10000ba0:   91 fc bf f1     xxlor   vs45,vs31,vs31
        << set vs45/v13 from vs31
    10000ba4:   89 68 8c 12     vmuluwm v20,v12,v13
        << set v20 from v12 and v13
    10000bcc:   91 24 24 f3     xxlor   vs57,vs4,vs4
        << set vs57/v25 from vs4
    10000be0:   80 c8 e7 10     vadduwm v7,v7,v25
        << set v7 from v7 and v25
    10000c00:   80 70 e7 10     vadduwm v7,v7,v14
        << set v7 from v7 and v14
    10000c10:   91 6c ed f3     xxlor   vs63,vs13,vs13
        << set vs63 from vs13
    10000c28:   89 f8 5a 13     vmuluwm v26,v26,v31
        << set v26 from v26 and v31
    10000c44:   80 a0 07 11     vadduwm v8,v7,v20
        << set v8/vs40 from v7 and v20
    10000c68:   80 d0 08 11     vadduwm v8,v8,v26
        << set v8/vs40 from v8 and v26

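The punchline is at 10000b94 and 10000ba0: both set v13/vs45, and in the
dataflow above nothing reads that register in between, so the product computed
at 10000b94 appears to be lost. I don't think that is what was intended.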
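
The same check can be done mechanically. The following is only a rough sketch, not a tool used for this report: it takes five instructions copied from the listing above and flags any register that is written twice with no read in between. It assumes that the first operand of each instruction is its only destination and that all other register operands are sources, which holds for lxv, xxlor and vmuluwm but not for every instruction (xxperm, for example, also reads its target). It relies on the VSX numbering in which vN is the same register as vs(N+32). The function names are mine.

```python
import re

# Five instructions copied from the dataflow listing (address, mnemonic, operands).
EXCERPT = [
    "10000b5c lxv vs13,0(r9)",
    "10000b94 vmuluwm v13,v12,v13",
    "10000b98 xxlor vs44,vs28,vs28",
    "10000ba0 xxlor vs45,vs31,vs31",
    "10000ba4 vmuluwm v20,v12,v13",
]


def vsr_number(name):
    """Map vsN to N and vN to N + 32; return None for anything else."""
    m = re.fullmatch(r"(vs|v)(\d+)", name)
    if m is None:
        return None  # immediates and memory operands such as 0(r9)
    number = int(m.group(2))
    return number + 32 if m.group(1) == "v" else number


def clobbered_writes(listing):
    """Return (first_write, second_write, vsr) for writes overwritten unread."""
    pending = {}  # VSR number -> address of a write that has not been read yet
    found = []
    for line in listing:
        addr, _mnemonic, operands = line.split()
        regs = [vsr_number(op) for op in operands.split(",")]
        dest, sources = regs[0], regs[1:]
        # Sources are read before the destination is written.
        for src in sources:
            pending.pop(src, None)
        if dest in pending:
            found.append((pending[dest], addr, dest))
        pending[dest] = addr
    return found


print(clobbered_writes(EXCERPT))  # prints: [('10000b94', '10000ba0', 45)]
```

Because the listing is an extraction rather than the full instruction stream, a hit only means that no read shows up in the lines given. Note also that under this numbering v13 is vs45, which is a different register from vs13 (the one loaded at 10000b5c).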
