Hello.

The following code snippet fails with:

./xgcc -B. -O3 -fstack-protector ice2.ii
ice2.ii: In function ‘void fn1()’:
ice2.ii:51:1: internal compiler error: in gen_add2_insn, at optabs.c:4442
 }
 ^
0xc00edb gen_add2_insn(rtx_def*, rtx_def*)
        ../../gcc/optabs.c:4442
0xc768b4 gen_reload
        ../../gcc/reload1.c:8710
0xc7a35e emit_input_reload_insns
        ../../gcc/reload1.c:7664
0xc7b3bc do_input_reload
        ../../gcc/reload1.c:7950
0xc7b3bc emit_reload_insns
        ../../gcc/reload1.c:8142
0xc7dd69 reload_as_needed
        ../../gcc/reload1.c:4661
0xc81a7b reload(rtx_insn*, int)
        ../../gcc/reload1.c:1062
0xb39da2 do_reload
        ../../gcc/ira.c:5393
0xb39da2 execute
        ../../gcc/ira.c:5565

$ cat ice2.ii
class A;
template <typename _Tp, int m, int n> class B {
public:
  _Tp val[m * n];
};
class C {
public:
  C(A);
};
struct D {
  D();
  unsigned long &operator[](int);
  unsigned long *p;
};
class A {
public:
  template <typename _Tp, int m, int n> A(const B<_Tp, m, n> &, bool);
  int rows, cols;
  unsigned char *data;
  unsigned char *datastart;
  unsigned char *dataend;
  unsigned char *datalimit;
  D step;
};
template <typename _Tp, int m, int n>
A::A(const B<_Tp, m, n> &p1, bool)
    : rows(m), cols(n) {
  step[0] = cols * sizeof(_Tp);
  datastart = data = (unsigned char *)p1.val;
  datalimit = dataend = datastart + rows * step[0];
}
class F {
public:
  static void compute(C);
  template <typename _Tp, int m, int n, int nm>
  static void compute(const B<_Tp, m, n> &, B<_Tp, nm, 1> &, B<_Tp, m, nm> &,
                      B<_Tp, n, nm> &);
};
D::D() {}
unsigned long &D::operator[](int p1) { return p[p1]; }
template <typename _Tp, int m, int n, int nm>
void F::compute(const B<_Tp, m, n> &, B<_Tp, nm, 1> &, B<_Tp, m, nm> &,
                B<_Tp, n, nm> &p4) {
  A a(p4, false);
  compute(a);
}
void fn1() {
  B<double, 4, 4> b, c, e;
  B<double, 4, 1> d;
  F::compute(b, d, c, e);
}

gcc version 5.2.1 works fine.

Martin
Started with r230091, which enabled a vectorization, thus the bug looks latent:

Author: rguenth <rguenth@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Nov 10 09:43:54 2015 +0000

    2015-11-10 Richard Biener <rguenther@suse.de>

        PR tree-optimization/56118
        * tree-vect-slp.c (vect_bb_vectorization_profitable_p): Make equal
        cost favor vectorized version.

        * gcc.target/i386/pr56118.c: New testcase.
tree-dump of the problematic function:

;; Function void fn1() (_Z3fn1v, funcdef_no=6, decl_uid=2937, cgraph_uid=4, symbol_order=4)

void fn1() ()
{
  struct A a;
  struct C D.3115;
  struct C D.3114;
  struct B e;
  long unsigned int _4;
  vector(2) long unsigned int vect_cst__5;
  long unsigned int _6;
  vector(2) long unsigned int vect_cst__20;

  <bb 2>:
  _6 = (long unsigned int) &e.val;
  vect_cst__20 = {_6, _6};
  _4 = (long unsigned int) &MEM[(void *)&e + 128B];
  vect_cst__5 = {_4, _4};
  MEM[(struct &)&a] ={v} {CLOBBER};
  a.rows = 4;
  a.cols = 4;
  MEM[(struct &)&a + 40] ={v} {CLOBBER};
  MEM[(unsigned char * *)&a + 8B] = vect_cst__20;
  MEM[(unsigned char * *)&a + 24B] = vect_cst__5;
  C::C (&D.3114, a);
  F::compute (D.3115);
  D.3114 ={v} {CLOBBER};
  a ={v} {CLOBBER};
  e ={v} {CLOBBER};
  return;

}

236r.ira:

(insn 13 29 33 2 (set (mem/c:V2DI (plus:DI (reg/f:DI 113 sfp)
                (reg:DI 166)) [5 MEM[(unsigned char * *)&a + 8B]+0 S16 A64])
        (vec_select:V2DI (reg:V2DI 163)
            (parallel:V2DI [
                    (const_int 1 [0x1])
                    (const_int 0 [0])
                ]))) /tmp/ice2.ii:29 841 {*vsx_stxvd2x2_le_v2di}
     (expr_list:REG_DEAD (reg:DI 166)
        (expr_list:REG_DEAD (reg:V2DI 163)
            (nil))))

237r.reload:

Reloads for insn # 13
Reload 0: reload_in (DI) = (plus:DI (reg/f:DI 1 1) (const_int 240 [0xf0]))
        BASE_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
        reload_in_reg: (plus:DI (reg/f:DI 1 1) (const_int 240 [0xf0]))
        reload_reg_rtx: (reg:DI 10 10)
Reload 1: BASE_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0), optional, can't combine, secondary_reload_p
Reload 2: GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0), optional, can't combine, secondary_reload_p
        secondary_out_reload = 1
        secondary_out_icode = reload_v2di_di_store
Reload 3: reload_out (V2DI) = (mem/c:V2DI (plus:DI (plus:DI (reg/f:DI 1 1) (const_int 240 [0xf0])) (reg:DI 9 9 [166])) [5 MEM[(unsigned char * *)&a + 8B]+0 S16 A64])
        NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (mem/c:V2DI (plus:DI (plus:DI (reg/f:DI 1 1) (const_int 240 [0xf0])) (reg:DI 9 9 [166])) [5 MEM[(unsigned char * *)&a + 8B] (crashes here)
The issue can be debugged with a cross-compiler:

../configure --enable-languages=c,c++ --disable-bootstrap --target=ppc64le-suse-linux
I tested this with trunk 236751 and I see the ICE. It does not fail if you add -mlra. It also does not fail with -mcpu=power7.
It also fails on BE. It needs -O3 -fstack-protector -mcpu=power8 to fail.
It is failing because GCC wants to vectorize an address that is based on the soft frame pointer. If you don't do automatic vectorization, or if you don't ask for stack protection, it works. It also works with LRA.
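For readers following along, here is a minimal sketch of the source-level pattern involved. It is a reduction for illustration only, not code from GCC or the attached patch; `Ptrs`, `fill` and `same_address_in_both` are hypothetical names. It mirrors the `datastart = data = (unsigned char *)p1.val` statement from the test case: two adjacent pointer fields receive the same stack-derived address, which the tree dump shows as a single store of `vect_cst__20 = {_6, _6}`.

```cpp
#include <cassert>

// Hypothetical reduction of class A from the test case: two adjacent
// pointer fields (offsets 8 and 16 in the original class).
struct Ptrs {
    unsigned char *data;
    unsigned char *datastart;
};

// Both fields receive the same address, as in
// "datastart = data = (unsigned char *)p1.val". Assuming 64-bit pointers,
// the two scalar stores can be replaced by one 16-byte store of
// {base, base}, which is what the SLP vectorizer did in the dump.
inline void fill(Ptrs &p, unsigned char *base) {
    p.datastart = p.data = base;
}

// True when both fields hold the same address, i.e. the pair of stores
// is equivalent to storing a two-element splat of that address.
inline bool same_address_in_both(const Ptrs &p) {
    return p.data == p.datastart;
}
```

When `base` points into the current stack frame, as `e.val` does in `fn1`, the value being splatted is an address formed from a frame register plus an offset, and that is the operand reload has trouble with here.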
Created attachment 38580
Proposed patch to fix the problem

This patch fixes the problem by copying frame related registers into a temporary if they are used as part of a SPLAT operation.
Just recording that this patch was rejected on the list at https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02136.html, so we still need a fix for this.
Fixed on trunk (r239105), as rs6000 switched to LRA.
GCC 6.2 is being released, adjusting target milestone.
GCC 6.3 is being released, adjusting target milestone.
FWIW, it does not fail for -mcpu=power7 or -mcpu=power9. If you use -mcpu=power7, there is no direct move. If you use -mcpu=power9, the MTVSRDD instruction is generated which bypasses the part that is failing under reload.
I can't reproduce this, but that's probably because of

cc1plus: warning: will not generate power8 instructions because assembler lacks power8 support

Binutils was just configured for ppc64le-linux - do I need any special options on the configure command line there?
You need to build GCC with a new enough binutils, 2.24 I believe.
You need power8 support for the bug to show itself. In order to have power8 (ISA 2.07) support, you need a binutils that supports at least the power8 instructions.
Created attachment 40980
Proposed patch to fix the problem
Author: meissner
Date: Thu Mar 16 20:09:21 2017
New Revision: 246209

URL: https://gcc.gnu.org/viewcvs?rev=246209&root=gcc&view=rev
Log:
[gcc]
2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>

        PR target/71294
        * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Allow a
        SPLAT operation on ISA 2.07 64-bit systems that have direct move,
        but no MTVSRDD support, by doing MTVSRD and XXPERMDI.

[gcc/testsuite]
2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>

        PR target/71294
        * g++.dg/pr71294.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/pr71294.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/vsx.md
    trunk/gcc/testsuite/ChangeLog
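As a rough illustration of the MTVSRD plus XXPERMDI sequence named in the log, here is a small software model. This is only a sketch of the instruction semantics as I understand them from the Power ISA, not the vsx.md pattern or anything GCC emits; `Vsr`, `model_mtvsrd`, `model_xxpermdi` and `model_splat` are hypothetical names, and the marker constant stands in for a value the ISA leaves undefined.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Software model of a 128-bit VSX register as two 64-bit doublewords.
// Index 0 is doubleword 0 in the ISA's numbering.
using Vsr = std::array<std::uint64_t, 2>;

// Model of MTVSRD: the GPR value lands in doubleword 0. Doubleword 1 is
// undefined in the ISA; the model fills it with an arbitrary marker so
// that accidental use of it is visible.
inline Vsr model_mtvsrd(std::uint64_t gpr) {
    return Vsr{gpr, 0xDEADBEEFDEADBEEFULL};
}

// Model of XXPERMDI with a 2-bit selector dm: the high bit picks the
// doubleword taken from a, the low bit picks the doubleword taken from b.
inline Vsr model_xxpermdi(const Vsr &a, const Vsr &b, unsigned dm) {
    return Vsr{a[(dm >> 1) & 1], b[dm & 1]};
}

// Splat of a 64-bit scalar: move it into doubleword 0, then copy
// doubleword 0 into both halves of the result (selector 0).
inline Vsr model_splat(std::uint64_t gpr) {
    const Vsr t = model_mtvsrd(gpr);
    return model_xxpermdi(t, t, 0);
}
```

In this model the undefined half of the intermediate register never reaches the result, because selector 0 reads only doubleword 0 from both inputs.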
Author: meissner
Date: Wed Mar 29 23:15:51 2017
New Revision: 246577

URL: https://gcc.gnu.org/viewcvs?rev=246577&root=gcc&view=rev
Log:
[gcc]
2017-03-29 Michael Meissner <meissner@linux.vnet.ibm.com>

        Back port from trunk
        2017-03-21 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>

        PR target/80123
        * doc/md.texi (Constraints): Document wA constraint.
        * config/rs6000/constraints.md (wA): New.
        * config/rs6000/rs6000.c (rs6000_debug_reg_global): Add wA reg_class.
        (rs6000_init_hard_regno_mode_ok): Init wA constraint.
        * config/rs6000/rs6000.h (RS6000_CONSTRAINT_wA): New.
        * config/rs6000/vsx.md (vsx_splat_<mode>): Use wA constraint.

        2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>

        PR target/71294
        * config/rs6000/vsx.md (vsx_splat_<mode>, VSX_D iterator): Allow a
        SPLAT operation on ISA 2.07 64-bit systems that have direct move,
        but no MTVSRDD support, by doing MTVSRD and XXPERMDI.

[gcc/testsuite]
2017-03-29 Michael Meissner <meissner@linux.vnet.ibm.com>

        Back port from trunk
        2017-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>

        PR target/71294
        * g++.dg/pr71294.C: New test.

Added:
    branches/gcc-6-branch/gcc/testsuite/g++.dg/pr71294.C
      - copied unchanged from r246332, trunk/gcc/testsuite/g++.dg/pr71294.C
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/rs6000/constraints.md
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-6-branch/gcc/config/rs6000/rs6000.h
    branches/gcc-6-branch/gcc/config/rs6000/vsx.md
    branches/gcc-6-branch/gcc/doc/md.texi
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
Trunk fixed on March 16th, gcc 6 branch on March 29th.