Bug 89752

Summary: [8 Regression] ICE in emit_move_insn, at expr.c:3723
Product: gcc Reporter: Richard Biener <rguenth>
Component: targetAssignee: Jakub Jelinek <jakub>
Status: RESOLVED FIXED    
Severity: normal CC: jakub, vmakarov
Priority: P3 Keywords: ice-on-valid-code
Version: 8.3.1   
Target Milestone: 8.4   
Host: Target: aarch64
Build: Known to work: 8.3.1, 9.0
Known to fail: 8.3.0 Last reconfirmed: 2019-03-18 00:00:00
Attachments: unreduced testcase

Description Richard Biener 2019-03-18 08:48:40 UTC
Created attachment 45985 [details]
unreduced testcase

The attached testcase ICEs on the GCC 8 branch and trunk (the preprocessed source doesn't work on the 7 branch).

> /abuild/rguenther/obj8-aarch64-g/gcc/cc1plus -quiet matmul_op.ii -std=c++11 -mlittle-endian -mabi=lp64 -fno-exceptions - w -Wfatal-errors -I /abuild/rguenther/obj8-aarch64-g/gcc/include
during RTL pass: reload
In file included from external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/Core:296,
                 from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:14,
                 from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1,
                 from ./tensorflow/core/kernels/matmul_op.h:19,
                 from tensorflow/core/kernels/matmul_op.cc:20:
external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h: In member function ‘void Eigen::internal::gebp_kernel<LhsScalar, RhsScalar, Index, DataMapper, mr, nr, ConjugateLhs, ConjugateRhs>::operator()(const DataMapper&, const LhsScalar*, const RhsScalar*, Index, Index, Index, Eigen::internal::gebp_kernel<LhsScalar, RhsScalar, Index, DataMapper, mr, nr, ConjugateLhs, ConjugateRhs>::ResScalar, Index, Index, Index, Index) [with LhsScalar = Eigen::half; RhsScalar = Eigen::half; Index = long int; DataMapper = Eigen::internal::blas_data_mapper<Eigen::half, long int, 0, 0>; int mr = 2; int nr = 4; bool ConjugateLhs = false; bool ConjugateRhs = false]’:
external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1879:3: internal compiler error: in emit_move_insn, at expr.c:3723
0xc429cb emit_move_insn(rtx_def*, rtx_def*)
        /space/rguenther/src/svn/gcc-8-branch/gcc/expr.c:3722
0xe404a6 lra_emit_move(rtx_def*, rtx_def*)
        /space/rguenther/src/svn/gcc-8-branch/gcc/lra.c:497
0xe50fcc match_reload
        /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:1055
0xe5a255 curr_insn_transform
        /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:4356
0xe5bc52 lra_constraints(bool)
        /space/rguenther/src/svn/gcc-8-branch/gcc/lra-constraints.c:4898
0xe4507c lra(_IO_FILE*)
        /space/rguenther/src/svn/gcc-8-branch/gcc/lra.c:2422
0xdf17bc do_reload
        /space/rguenther/src/svn/gcc-8-branch/gcc/ira.c:5465
0xdf1c08 execute
        /space/rguenther/src/svn/gcc-8-branch/gcc/ira.c:5649
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

tested with a cross from x86_64-linux to aarch64-linux.
Comment 1 Richard Biener 2019-03-18 08:50:26 UTC
Note the regression state isn't exactly clear.  I'm trying brute-force reduction
(have no easy way to check testcase validity lacking a compiler that builds this w/o an ICE).  The original issue was reported with -O2  -std=c++11 -mlittle-endian -mabi=lp64 -fstack-protector -fno-omit-frame-pointer -ffunction-sections -fdata-sections -fPIC -fno-exceptions -ftemplate-depth=900 -fno-canonical-system-headers.
Comment 2 Wilco 2019-03-18 12:42:31 UTC
Confirmed. It ICEs in Eigen::internal::gebp_kernel<Eigen::half, Eigen::half, long int, Eigen::internal::blas_data_mapper<Eigen::half, long int, 0, 0>, 2, 4, false, false>::operator()

It seems to choke on this asm during reload:

  531: {[r3842:DI]=asm_operands;[r3844:DI]=asm_operands;}

and somehow emit a move between these operands:

(reg:BLK 3849) (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64])
Comment 3 Wilco 2019-03-18 13:14:28 UTC
Full instruction:

(insn 531 530 532 19 (parallel [
            (set (mem/c:BLK (reg:DI 3842) [29 A0+0 S2 A64])
                (asm_operands:BLK ("") ("=rm") 0 [
                        (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64])
                        (mem/c:BLK (reg:DI 3848) [29 A1+0 S2 A128])
                    ]
                     [
                        (asm_input:BLK ("0") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)
                        (asm_input:BLK ("1") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)
                    ]
                     [] external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418))
            (set (mem/c:BLK (reg:DI 3844) [29 A1+0 S2 A128])
                (asm_operands:BLK ("") ("=rm") 1 [
                        (mem/c:BLK (reg:DI 3846) [29 A0+0 S2 A64])
                        (mem/c:BLK (reg:DI 3848) [29 A1+0 S2 A128])
                    ]
                     [
                        (asm_input:BLK ("0") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)
                        (asm_input:BLK ("1") external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418)
                    ]
                     [] external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h:1418))
        ]) "external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/products/GeneralBlockPanelKernel.h":1418:511 -1
     (nil))


I guess this comes from the source like this with some of the arguments being immediates due to template substitution:

__asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1));

It's not obvious what this is supposed to achieve, let alone whether it is valid...
Comment 4 Wilco 2019-03-18 13:32:10 UTC
Small example which generates the same ICE on every GCC version:

typedef struct { int x, y, z; } X;

void f(void)
{
  X A0, A1;
  __asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1));
}

So it's completely invalid inline asm...
Comment 5 rguenther@suse.de 2019-03-18 13:53:27 UTC
On Mon, 18 Mar 2019, wilco at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752
> 
> --- Comment #4 from Wilco <wilco at gcc dot gnu.org> ---
> Small example which generates the same ICE on every GCC version:
> 
> typedef struct { int x, y, z; } X;
> 
> void f(void)
> {
>   X A0, A1;
>   __asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1));
> }
> 
> So it's completely invalid inline asm...

Looks like an attempt to provide some barrier?

We still shouldn't ICE.
Comment 6 Jakub Jelinek 2019-03-18 13:58:19 UTC
It is on:
__asm__("" : "a0" "=rm" A0, "a1" "=rm" A1 : "0" A0, "1" A1);
where A0 and A1 are variables with LhsPacket type, which is 2 byte TYPE_ADDRESSABLE aggregate type.
The r in the constraints looks completely bogus to me for addressable vars that must live in memory, but guess we shouldn't ICE even on this invalid stuff.
Comment 7 Jakub Jelinek 2019-03-18 14:00:20 UTC
I've reduced it to:
struct A { A (); ~A (); short c; };

void
foo ()
{
  A a0, a1;
  __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1));
}
which better matches what is going on (the type is TYPE_ADDRESSABLE and that is why it has BLKmode).
Comment 8 rguenther@suse.de 2019-03-18 14:05:35 UTC
On Mon, 18 Mar 2019, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> I've reduced it to:
> struct A { A (); ~A (); short c; };
> 
> void
> foo ()
> {
>   A a0, a1;
>   __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1));
> }
> which better matches what is going on (the type is TYPE_ADDRESSABLE and that is
> why it has BLKmode).

And it works "fine" in x86_64 ...
Comment 9 Jakub Jelinek 2019-03-18 14:06:41 UTC
Bisecting now, r210000 still works, r215000 ICEs.
Comment 10 Wilco 2019-03-18 14:52:09 UTC
It seems that rewriting "+rm" into "=rm" and "0" is not equivalent. Eg.

__asm__  ("" : [a0] "=m" (A0) : "0" (A0));

gives a million warnings "matching constraint does not allow a register", so "0" appears to imply a register operand. Reload seems to deal with "=rm" and "rm" or "=rm" and "0m", but a single "0" prefers a register and that's why it tries to insert an incorrect BLK move.
Comment 11 Jakub Jelinek 2019-03-18 15:31:38 UTC
Actually can't bisect, as gcc 8 I have installed is no longer able to build r215000 or revisions around it (some error on wide-int.h:
../../gcc/wide-int.h:372:10: error: too many template-parameter-lists
   struct binary_traits <T1, T2, FLEXIBLE_PRECISION, FLEXIBLE_PRECISION>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
etc.).

That said, if we want to fix the:
struct A { A (); ~A (); short c; };

void
foo ()
{
  A a0, a1;
  __asm volatile ("" : "+rm" (a0), "+rm" (a1));
}
in the provided testcase, we could use:
--- gcc/gimplify.c.jj	2019-03-07 20:45:39.168938360 +0100
+++ gcc/gimplify.c	2019-03-18 16:18:16.515466234 +0100
@@ -6155,6 +6155,19 @@ gimplify_asm_expr (tree *expr_p, gimple_
 	  is_inout = false;
 	}
 
+      /* If we can't make copies, we can only accept memory.  */
+      if (TREE_ADDRESSABLE (TREE_TYPE (TREE_VALUE (link))))
+	{
+	  if (allows_mem)
+	    allows_reg = 0;
+	  else
+	    {
+	      error ("impossible constraint in %<asm%>");
+	      error ("non-memory output %d must stay in memory", i);
+	      return GS_ERROR;
+	    }
+	}
+
       if (!allows_reg && allows_mem)
 	mark_addressable (TREE_VALUE (link));
 
which is something we already do for the input args, just for some strange reason not for output args as well (if we wanted to backport it, I'd leave the error part out).

That said, because users can write even that:
struct A { A (); ~A (); short c; };

void
foo ()
{
  A a0, a1;
  __asm volatile ("" : "=rm" (a0), "=rm" (a1) : "0" (a0), "1" (a1));
}
we need something in LRA too, that will simply treat BLKmode matching constraints when both registers and memory are allowed as if only memory was allowed.
Comment 12 Jakub Jelinek 2019-03-19 08:11:57 UTC
Author: jakub
Date: Tue Mar 19 08:11:25 2019
New Revision: 269793

URL: https://gcc.gnu.org/viewcvs?rev=269793&root=gcc&view=rev
Log:
	PR target/89752
	* gimplify.c (gimplify_asm_expr): For output argument with
	TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise
	diagnose error.

	* g++.dg/ext/asm15.C: Check for particular diagnostic wording.
	* g++.dg/ext/asm16.C: Likewise.
	* g++.dg/ext/asm17.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/ext/asm17.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/gimplify.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/ext/asm15.C
    trunk/gcc/testsuite/g++.dg/ext/asm16.C
Comment 13 Jakub Jelinek 2019-03-19 08:18:35 UTC
Fixed on the originally provided testcase, not on the #c7 testcase, that needs to be fixed in LRA not to try to reload BLKmode MEMs into REGs.
Comment 14 Jakub Jelinek 2019-03-19 10:52:36 UTC
The following patch fixes the remaining ICE for me:
--- gcc/lra-constraints.c.jj	2019-03-16 22:17:21.060937047 +0100
+++ gcc/lra-constraints.c	2019-03-19 11:49:11.982058568 +0100
@@ -2350,6 +2350,8 @@ process_alt_operands (int only_alternati
 		  break;
 
 		reg:
+		  if (mode == BLKmode)
+		    break;
 		  this_alternative = reg_class_subunion[this_alternative][cl];
 		  IOR_HARD_REG_SET (this_alternative_set,
 				    reg_class_contents[cl]);
@@ -2360,8 +2362,6 @@ process_alt_operands (int only_alternati
 		      IOR_HARD_REG_SET (this_costly_alternative_set,
 					reg_class_contents[cl]);
 		    }
-		  if (mode == BLKmode)
-		    break;
 		  winreg = true;
 		  if (REG_P (op))
 		    {
What in my understanding happens is that even when the r constraint for the BLKmode MEM doesn't win, this_alternative is still updated (to GENERAL_REGS in this case) and as the m constraint matches, we still process it as if GENERAL_REGS is an option.  It is not, BLKmode must live in memory.
Comment 15 Jakub Jelinek 2019-03-20 11:27:13 UTC
Author: jakub
Date: Wed Mar 20 11:26:42 2019
New Revision: 269819

URL: https://gcc.gnu.org/viewcvs?rev=269819&root=gcc&view=rev
Log:
	PR target/89752
	* lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't
	update this_alternative nor this_alternative_set.

	* g++.target/aarch64/aarch64.exp: New file.
	* g++.target/aarch64/pr89752.C: New test.

Added:
    trunk/gcc/testsuite/g++.target/aarch64/aarch64.exp
    trunk/gcc/testsuite/g++.target/aarch64/pr89752.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/lra-constraints.c
    trunk/gcc/testsuite/ChangeLog
Comment 16 Jakub Jelinek 2019-04-30 20:50:09 UTC
Author: jakub
Date: Tue Apr 30 20:49:38 2019
New Revision: 270735

URL: https://gcc.gnu.org/viewcvs?rev=270735&root=gcc&view=rev
Log:
	Backported from mainline
	2019-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR target/89752
	* gimplify.c (gimplify_asm_expr): For output argument with
	TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise
	diagnose error.

	* g++.dg/ext/asm15.C: Check for particular diagnostic wording.
	* g++.dg/ext/asm16.C: Likewise.
	* g++.dg/ext/asm17.C: New test.

Added:
    branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm17.C
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/gimplify.c
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
    branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm15.C
    branches/gcc-8-branch/gcc/testsuite/g++.dg/ext/asm16.C
Comment 17 Jakub Jelinek 2019-04-30 20:51:20 UTC
Author: jakub
Date: Tue Apr 30 20:50:48 2019
New Revision: 270737

URL: https://gcc.gnu.org/viewcvs?rev=270737&root=gcc&view=rev
Log:
	Backported from mainline
	2019-03-20  Jakub Jelinek  <jakub@redhat.com>

	PR target/89752
	* lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't
	update this_alternative nor this_alternative_set.

Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/lra-constraints.c
Comment 18 Jakub Jelinek 2019-05-01 07:14:50 UTC
Fixed for 8.4+ too.
Comment 19 Jakub Jelinek 2019-08-30 12:27:52 UTC
Author: jakub
Date: Fri Aug 30 12:27:21 2019
New Revision: 275136

URL: https://gcc.gnu.org/viewcvs?rev=275136&root=gcc&view=rev
Log:
	Backported from mainline
	2019-03-19  Jakub Jelinek  <jakub@redhat.com>

	PR target/89752
	* gimplify.c (gimplify_asm_expr): For output argument with
	TREE_ADDRESSABLE type, clear allows_reg if it allows memory, otherwise
	diagnose error.

	* g++.dg/ext/asm15.C: Check for particular diagnostic wording.
	* g++.dg/ext/asm16.C: Likewise.
	* g++.dg/ext/asm17.C: New test.

Added:
    branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm17.C
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/gimplify.c
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
    branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm15.C
    branches/gcc-7-branch/gcc/testsuite/g++.dg/ext/asm16.C
Comment 20 Jakub Jelinek 2019-08-30 12:28:54 UTC
Author: jakub
Date: Fri Aug 30 12:28:22 2019
New Revision: 275138

URL: https://gcc.gnu.org/viewcvs?rev=275138&root=gcc&view=rev
Log:
	Backported from mainline
	2019-03-20  Jakub Jelinek  <jakub@redhat.com>

	PR target/89752
	* lra-constraints.c (process_alt_operands) <reg>: For BLKmode, don't
	update this_alternative nor this_alternative_set.

Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/lra-constraints.c