[Patch] [Aarch64] PR rtl-optimization/87763 - this patch fixes gcc.target/aarch64/lsl_asr_sbfiz.c

Fri Apr 26 14:19:00 GMT 2019

On 4/16/19 10:29 AM, Steve Ellcey wrote:
> Re-ping.  I know there are discussions about bigger changes to fix the
> various failures listed in PR rtl-optimization/87763 but this patch
> at least fixes one of them (gcc.target/aarch64/lsl_asr_sbfiz.c).
> 
> Steve Ellcey
> sellcey@marvell.com
So here's my take on fixing the lsl_asr_sbfix.c regression.

As I mentioned earlier the problem is the aarch64 is using the old way
of describing bitfield extractions (extv, extzv).  Specifically it
defined a single expander that operated on the natural word mode (DImode).

This forces the input operand to be used in DImode as well.  So we get
those annoying subregs in the expressions that combine generates.  But
there's a better way :-)

The new way to handle this stuff is to define expanders for supported
modes (SI and DI for aarch64).  Interestingly enough the aarch64 port
already does this for bitfield insertions via insv.

So I made the obvious changes to the extv/extzv expander.  That fixes
the lsl_asr_sbfiz regression.  But doesn't bootstrap.  The availability
of the new expander makes extract_bit_field_using_extv want to extract a
field from an HFmode object and shove it into an SImode.  That runs
afoul of checks in validate_subreg (as it should).  We shouldn't be
using subregs to change the size of a floating point object, so that
needs to be filtered out in extract_bit_field_using_extv.

That fixes the bootstrap issue.  But regression testing trips over
ashtidisi.c.  That can be easily fixed by allowing zero/sign extractions
which change size.  ie:

(set (reg:DI) (sign_extract:DI (reg:SI) ...)))

Which seems like a reasonable thing to handle.

So here's what I've got.  I've bootstrapped and regression tested on
aarch64.  It's also bootstrapped on aarch64_be for good measure.

OK (from aarch64 maintainers) for the gcc-9 branch and trunk?  Or we
could just address on the trunk for gcc-10.  I don't have strong
preferences either way.

Jeff

ps.  Time to return insv regressions and address Segher's comments :-)


-------------- next part --------------
	* aarch64.md (extv, extzv expander): Generalize so that it works
	for both SImode and DImode.
	(extv_3264, extzv_3264): New pattern to handle extractions with
	size change between the input and output operand.
	* expmed.c (extract_bitfield_using_extv): Do not allow changing
	sizes of floating point objects using SUBREGs.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a1894063a1..13e2bca05a1 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5390,16 +5390,16 @@
 ;; Bitfields
 ;; -------------------------------------------------------------------
 
-(define_expand "<optab>"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-	(ANY_EXTRACT:DI (match_operand:DI 1 "register_operand")
+(define_expand "<optab><mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand")
 			(match_operand 2
-			  "aarch64_simd_shift_imm_offset_di")
-			(match_operand 3 "aarch64_simd_shift_imm_di")))]
+			  "aarch64_simd_shift_imm_offset_<mode>")
+			(match_operand 3 "aarch64_simd_shift_imm_<mode>")))]
   ""
   {
     if (!IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
-		   1, GET_MODE_BITSIZE (DImode) - 1))
+		   1, GET_MODE_BITSIZE (<MODE>mode) - 1))
      FAIL;
   }
 )
@@ -5418,6 +5418,21 @@
   [(set_attr "type" "bfx")]
 )
 
+;; Similar to the previous pattern, but 32->64 extraction
+(define_insn "*<optab>_3264"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ANY_EXTRACT:DI (match_operand:SI 1 "register_operand" "r")
+			 (match_operand 2
+			   "aarch64_simd_shift_imm_offset_si" "n")
+			 (match_operand 3
+			   "aarch64_simd_shift_imm_si" "n")))]
+  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]),
+	     1, GET_MODE_BITSIZE (DImode) - 1)"
+  "<su>bfx\\t%x0, %x1, %3, %2"
+  [(set_attr "type" "bfx")]
+)
+
+
 ;; When the bit position and width add up to 32 we can use a W-reg LSR
 ;; instruction taking advantage of the implicit zero-extension of the X-reg.
 (define_split
diff --git a/gcc/expmed.c b/gcc/expmed.c
index d7f8e9a5d76..ce8f9595b9a 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -1544,7 +1544,14 @@ extract_bit_field_using_extv (const extraction_insn *extv, rtx op0,
 	 mode.  Instead, create a temporary and use convert_move to set
 	 the target.  */
       if (REG_P (target)
-	  && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+	  && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+	  /* We can't change the size of float mode subregs (see
+	     validate_subreg).  So if either mode is a floating point
+	     mode and the sizes are not equal, then reject doing this
+	     via gen_lowpart.  */
+	  && !((FLOAT_MODE_P (GET_MODE (target)) || FLOAT_MODE_P (ext_mode))
+	       && maybe_ne (GET_MODE_BITSIZE (GET_MODE (target)),
+			    GET_MODE_BITSIZE (ext_mode))))
 	{
 	  target = gen_lowpart (ext_mode, target);
 	  if (partial_subreg_p (GET_MODE (spec_target), ext_mode))