This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

[PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

From: Tamar Christina <tamar dot christina at arm dot com>
To: gcc-patches at gcc dot gnu dot org
Cc: nd at arm dot com, Ramana dot Radhakrishnan at arm dot com, Richard dot Earnshaw at arm dot com, nickc at redhat dot com, Kyrylo dot Tkachov at arm dot com
Date: Mon, 23 Jul 2018 17:56:09 +0100
Subject: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Hi All,

My previous patch changed arm_can_change_mode_class to allow subregs of
64bit registers on arm big-endian.  However it seems that we can't do this
because a the data in 64 bit VFP registers are stored in little-endian order,
even on big-endian.

Allowing this change had a knock on effect that caused GCC's no-op detection
to think that loading from the first lane on arm big-endian is a no-op.  this
because we can't describe the weird ordering we have on D registers on big-endian.

The original issue comes from the fact that the code does

... foo (... bar)
{
  return bar;
}

The expansion of the return statement causes GCC to try to return the value in
a register.  GCC will try to emit the move then, from MEM to REG (due to the SSA
temporary.).  It checks for a mov optab for this which isn't available and
then tries to do the move in bits using emit_move_multi_word.

emit_move_multi_word will split the move into sub parts, but then needs to get
the sub parts and does this using subregs, but it's told it can't do subregs!

The compiler is now stuck in an infinite loop.

The way this is worked around in the back-end is that we have move patterns in
neon.md that usually just force the register instead of checking with the
back-end. This prevents emit_move_multi_word from being needed.  However the
pattern for V4HF and V8HF were guarded by TARGET_NEON && TARGET_FP16.

I don't believe the TARGET_FP16 guard to be needed, because the pattern doesn't
actually generate code and requires another pattern for that, and a reg to reg move
should always be possible anyway. So allowing the force to register here is safe
and it allows the compiler to generate a correct error instead of ICEing in an
infinite loop.

This patch ensures gcc.target/arm/big-endian-subreg.c is fixed without introducing
any regressions while fixing

gcc.dg/vect/vect-nop-move.c execution test
g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test

Regtested on armeb-none-eabi and no regressions.
Bootstrapped on arm-none-linux-gnueabihf and no issues.


Ok for trunk?

Thanks,
Tamar

gcc/
2018-07-23  Tamar Christina  <tamar.christina@arm.com>

	PR target/84711
	* config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
	* config/arm/neon.md (movv4hf, movv8hf): Refactored to..
	(mov<mov>): ..this and enable unconditionally.

--

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index ec3abbcba9fb1ed040bc9cdcf162f5bb4f8df586..8d5897c8f0f93502fb958cba61b7c0b1570d1570 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -31512,8 +31512,8 @@ arm_can_change_mode_class (machine_mode from, machine_mode to,
 {
   if (TARGET_BIG_END
       && !(GET_MODE_SIZE (from) == 16 && GET_MODE_SIZE (to) == 8)
-      && (GET_MODE_UNIT_SIZE (from) > UNITS_PER_WORD
-	  || GET_MODE_UNIT_SIZE (to) > UNITS_PER_WORD)
+      && (GET_MODE_SIZE (from) > UNITS_PER_WORD
+	  || GET_MODE_SIZE (to) > UNITS_PER_WORD)
       && reg_classes_intersect_p (VFP_REGS, rclass))
     return false;
   return true;
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 1646b2172970acaaf949ba8b77d43ec72b688d73..57df17fe18dfeaa82e890fd339e2104ad27ee13b 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -137,10 +137,10 @@
     }
 })
 
-(define_expand "movv4hf"
-  [(set (match_operand:V4HF 0 "s_register_operand")
-	(match_operand:V4HF 1 "s_register_operand"))]
-  "TARGET_NEON && TARGET_FP16"
+(define_expand "mov<mode>"
+  [(set (match_operand:VH 0 "s_register_operand")
+	(match_operand:VH 1 "s_register_operand"))]
+  "TARGET_NEON"
 {
   /* We need to use force_reg to avoid TARGET_CAN_CHANGE_MODE_CLASS
      causing an ICE on big-endian because it cannot extract subregs in
@@ -148,22 +148,7 @@
   if (can_create_pseudo_p ())
     {
       if (!REG_P (operands[0]))
-	operands[1] = force_reg (V4HFmode, operands[1]);
-    }
-})
-
-(define_expand "movv8hf"
-  [(set (match_operand:V8HF 0 "")
-	(match_operand:V8HF 1 ""))]
-  "TARGET_NEON && TARGET_FP16"
-{ 
-  /* We need to use force_reg to avoid TARGET_CAN_CHANGE_MODE_CLASS
-     causing an ICE on big-endian because it cannot extract subregs in
-     this case.  */
-  if (can_create_pseudo_p ())
-    {
-      if (!REG_P (operands[0]))
-	operands[1] = force_reg (V8HFmode, operands[1]);
+	operands[1] = force_reg (<MODE>mode, operands[1]);
     }
 })

Follow-Ups:
- Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.
  - From: Thomas Preudhomme

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]