Created attachment 45985 [details] unreduced testcase The attached testcase ICEs on the GCC 8 branch and trunk (the preprocessed source doesn't work on the 7 branch). > /abuild/rguenther/obj8-aarch64-g/gcc/cc1plus -quiet matmul_op.ii -std=c++11 -mlittle-endian -mabi=lp64 -fno-exceptions - w -Wfatal-errors -I /abuild/rguenther/obj8-aarch64-g/gcc/include during RTL pass: reload In file included from external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/Core:296, from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:14, from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1, from ./tensorflow/core/kernels/matmul_op.h:19, from tensorflow/core/kernels/matmul_op.cc:20: external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h: In member function ‘void Eigen::internal::gebp_kernel<LhsScalar, RhsScalar, Index, DataMapper, mr, nr, ConjugateLhs, ConjugateRhs>::operator()(const DataMapper&, const LhsScalar*, const RhsScalar*, Index, Index, Index, Eigen::internal::gebp_kernel<LhsScalar, RhsScalar, Index, DataMapper, mr, nr, ConjugateLhs, ConjugateRhs>::ResScalar, Index, Index, Index, Index) [with LhsScalar = Eigen::half; RhsScalar = Eigen::half; Index = long int; DataMapper = Eigen::internal::blas_data_mapper<Eigen::half, long int, 0, 0>; int mr = 2; int nr = 4; bool ConjugateLhs = false; bool ConjugateRhs = false]’: external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1879:3: internal compiler error: in emit_move_insn, at expr.c:3723 0xc429cb emit_move_insn(rtx_def*, rtx_def*) /space/rguenther/src/svn/gcc-8-branch/gcc/expr.c:3722 0xe404a6 lra_emit_move(rtx_def*, rtx_def*) /space/rguenther/src/svn/gcc-8-branch/gcc/lra.c:497 0xe50fcc match_reload /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:1055 0xe5a255 curr_insn_transform /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:4356 0xe5bc52 lra_constraints(bool) /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:4898 0xe4507c lra(_IO_FILE*) /space/rguenther/src/svn/gcc-8-branch/gcc/lra.c:2422 0xdf17bc do_reload /space/rguenther/src/svn/gcc-8-branch/gcc/ira.c:5465 0xdf1c08 execute /space/rguenther/src/svn/gcc-8-branch/gcc/ira.c:5649 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. tested with a cross from x86_64-linux to aarch64-linux.
Note the regression state isn't exactly clear. I'm trying brute-force reduction (have no easy way to check testcase validity lacking a compiler that builds this w/o an ICE). The original issue was reported with -O2 -std=c++11 -mlittle-endian -mabi=lp64 -fstack-protector -fno-omit-frame-pointer -ffunction-sections -fdata-sections -fPIC -fno-exceptions -ftemplate-depth=900 -fno-canonical-system-headers.
Confirmed. It ICEs in Eigen::internal::gebp_kernel<Eigen::half, Eigen::half, long int, Eigen::internal::blas_data_mapper<Eigen::half, long int, 0, 0>, 2, 4, false, false>::operator() It seems to choke on this asm during reload: 531: {[r3842:DI]=asm_operands;[r3844:DI]=asm_operands;} and somehow emit a move between these operands: (reg:BLK 3849) (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64])
Full instruction: (insn 531 530 532 19 (parallel [ (set (mem/c:BLK (reg:DI 3842) [29 A0+0 S2 A64]) (asm_operands:BLK ("") ("=rm") 0 [ (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64]) (mem/c:BLK (reg:DI 3848) [29 A1+0 S2 A128]) ] [ (asm_input:BLK ("0") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418) (asm_input:BLK ("1") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418) ] [] external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)) (set (mem/c:BLK (reg:DI 3844) [29 A1+0 S2 A128]) (asm_operands:BLK ("") ("=rm") 1 [ (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64]) (mem/c:BLK (reg:DI 3848) [29 A1+0 S2 A128]) ] [ (asm_input:BLK ("0") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418) (asm_input:BLK ("1") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418) ] [] external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)) ]) "external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h":1418:511 -1 (nil)) I guess this comes from the source like this with some of the arguments being immediates due to template substitution: __asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1)); It's not obvious what this is supposed to achieve, let alone whether it is valid...
Small example which generates the same ICE on every GCC version: typedef struct { int x, y, z; } X; void f(void) { X A0, A1; __asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1)); } So it's completely invalid inline asm...
On Mon, 18 Mar 2019, wilco at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752 > > --- Comment #4 from Wilco <wilco at gcc dot gnu.org> --- > Small example which generates the same ICE on every GCC version: > > typedef struct { int x, y, z; } X; > > void f(void) > { > X A0, A1; > __asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1)); > } > > So it's completely invalid inline asm... Looks like an attempt to provide some barrier? We still shouldn't ICE.
It is on: __asm__("" : "a0" "=rm" A0, "a1" "=rm" A1 : "0" A0, "1" A1); where A0 and A1 are variables with LhsPacket type, which is 2 byte TYPE_ADDRESSABLE aggregate type. The r in the constraints looks completely bogus to me for addressable vars that must live in memory, but guess we shouldn't ICE even on this invalid stuff.
I've reduced it to: struct A { A (); ~A (); short c; }; void foo () { A a0, a1; __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1)); } which better matches what is going on (the type is TYPE_ADDRESSABLE and that is why it has BLKmode).
On Mon, 18 Mar 2019, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752 > > Jakub Jelinek <jakub at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |jakub at gcc dot gnu.org > > --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > I've reduced it to: > struct A { A (); ~A (); short c; }; > > void > foo () > { > A a0, a1; > __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1)); > } > which better matches what is going on (the type is TYPE_ADDRESSABLE and that is > why it has BLKmode). And it works "fine" in x86_64 ...
Bisecting now, r210000 still works, r215000 ICEs.
It seems that rewriting "+rm" into "=rm" and "0" is not equivalent. Eg. __asm__ ("" : [a0] "=m" (A0) : "0" (A0)); gives a million warnings "matching constraint does not allow a register", so "0" appears to imply a register operand. Reload seems to deal with "=rm" and "rm" or "=rm" and "0m", but a single "0" prefers a register and that's why it tries to insert an incorrect BLK move.
Actually can't bisect, as gcc 8 I have installed is no longer able to build r215000 or revisions around it (some error on wide-int.h: ../../gcc/wide-int.h:372:10: error: too many template-parameter-lists struct binary_traits <T1, T2, FLEXIBLE_PRECISION, FLEXIBLE_PRECISION> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ etc.). That said, if we want to fix the: struct A { A (); ~A (); short c; }; void foo () { A a0, a1; __asm volatile ("" : "+rm" (a0), "+rm" (a1)); } in the provided testcase, we could use: --- gcc/gimplify.c.jj 2019-03-07 20:45:39.168938360 +0100 +++ gcc/gimplify.c 2019-03-18 16:18:16.515466234 +0100 @@ -6155,6 +6155,19 @@ gimplify_asm_expr (tree *expr_p, gimple_ is_inout = false; } + /* If we can't make copies, we can only accept memory. */ + if (TREE_ADDRESSABLE (TREE_TYPE (TREE_VALUE (link)))) + { + if (allows_mem) + allows_reg = 0; + else + { + error ("impossible constraint in %<asm%>"); + error ("non-memory output %d must stay in memory", i); + return GS_ERROR; + } + } + if (!allows_reg && allows_mem) mark_addressable (TREE_VALUE (link)); which is something we already do for the input args, just for some strange reason not for output args as well (if we wanted to backport it, I'd leave the error part out). That said, because users can write even that: struct A { A (); ~A (); short c; }; void foo () { A a0, a1; __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1)); } we need something in LRA too, that will simply treat BLKmode matching constraints when both registers and memory are allowed as if only memory was allowed.
Author: jakub Date: Tue Mar 19 08:11:25 2019 New Revision: 269793 URL: https://gcc.gnu.org/viewcvs?rev=269793&root=gcc&view=rev Log: PR target/89752 * gimplify.c (gimplify_asm_expr): For output argument with TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise diagnose error. * g++.dg/ext/asm15.C: Check for particular diagnostic wording. * g++.dg/ext/asm16.C: Likewise. * g++.dg/ext/asm17.C: New test. Added: trunk/gcc/testsuite/g++.dg/ext/asm17.C Modified: trunk/gcc/ChangeLog trunk/gcc/gimplify.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/g++.dg/ext/asm15.C trunk/gcc/testsuite/g++.dg/ext/asm16.C
Fixed on the originally provided testcase, not on the #c7 testcase, that needs to be fixed in LRA not to try to reload BLKmode MEMs into REGs.
The following patch fixes the remaining ICE for me: --- gcc/lra-constraints.c.jj 2019-03-16 22:17:21.060937047 +0100 +++ gcc/lra-constraints.c 2019-03-19 11:49:11.982058568 +0100 @@ -2350,6 +2350,8 @@ process_alt_operands (int only_alternati break; reg: + if (mode == BLKmode) + break; this_alternative = reg_class_subunion[this_alternative][cl]; IOR_HARD_REG_SET (this_alternative_set, reg_class_contents[cl]); @@ -2360,8 +2362,6 @@ process_alt_operands (int only_alternati IOR_HARD_REG_SET (this_costly_alternative_set, reg_class_contents[cl]); } - if (mode == BLKmode) - break; winreg = true; if (REG_P (op)) { What in my understanding happens is that even when the r constraint for the BLKmode MEM doesn't win, this_alternative is still updated (to GENERAL_REGS in this case) and as the m constraint matches, we still process it as if GENERAL_REGS is an option. It is not, BLKmode must live in memory.
Author: jakub Date: Wed Mar 20 11:26:42 2019 New Revision: 269819 URL: https://gcc.gnu.org/viewcvs?rev=269819&root=gcc&view=rev Log: PR target/89752 * lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't update this_alternative nor this_alternative_set. * g++.target/aarch64/aarch64.exp: New file. * g++.target/aarch64/pr89752.C: New test. Added: trunk/gcc/testsuite/g++.target/aarch64/aarch64.exp trunk/gcc/testsuite/g++.target/aarch64/pr89752.C Modified: trunk/gcc/ChangeLog trunk/gcc/lra-constraints.c trunk/gcc/testsuite/ChangeLog
Author: jakub Date: Tue Apr 30 20:49:38 2019 New Revision: 270735 URL: https://gcc.gnu.org/viewcvs?rev=270735&root=gcc&view=rev Log: Backported from mainline 2019-03-19 Jakub Jelinek <jakub@redhat.com> PR target/89752 * gimplify.c (gimplify_asm_expr): For output argument with TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise diagnose error. * g++.dg/ext/asm15.C: Check for particular diagnostic wording. * g++.dg/ext/asm16.C: Likewise. * g++.dg/ext/asm17.C: New test. Added: branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm17.C Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/gimplify.c branches/gcc-8-branch/gcc/testsuite/ChangeLog branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm15.C branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm16.C
Author: jakub Date: Tue Apr 30 20:50:48 2019 New Revision: 270737 URL: https://gcc.gnu.org/viewcvs?rev=270737&root=gcc&view=rev Log: Backported from mainline 2019-03-20 Jakub Jelinek <jakub@redhat.com> PR target/89752 * lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't update this_alternative nor this_alternative_set. Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/lra-constraints.c
Fixed for 8.4+ too.
Author: jakub Date: Fri Aug 30 12:27:21 2019 New Revision: 275136 URL: https://gcc.gnu.org/viewcvs?rev=275136&root=gcc&view=rev Log: Backported from mainline 2019-03-19 Jakub Jelinek <jakub@redhat.com> PR target/89752 * gimplify.c (gimplify_asm_expr): For output argument with TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise diagnose error. * g++.dg/ext/asm15.C: Check for particular diagnostic wording. * g++.dg/ext/asm16.C: Likewise. * g++.dg/ext/asm17.C: New test. Added: branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm17.C Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/gimplify.c branches/gcc-7-branch/gcc/testsuite/ChangeLog branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm15.C branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm16.C
Author: jakub Date: Fri Aug 30 12:28:22 2019 New Revision: 275138 URL: https://gcc.gnu.org/viewcvs?rev=275138&root=gcc&view=rev Log: Backported from mainline 2019-03-20 Jakub Jelinek <jakub@redhat.com> PR target/89752 * lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't update this_alternative nor this_alternative_set. Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/lra-constraints.c