This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH, ARM] Keep constants in register when expanding


On 8 August 2014 23:22, Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:
> On Tue, Aug 5, 2014 at 10:31 AM, Zhenqiang Chen
> <zhenqiang.chen@linaro.org> wrote:
>> Hi,
>>
>> For some large constants, ARM will split them during expanding, which
>> makes impossible to hoist them out the loop or shared by different
>> references (refer the test case in the patch).
>>
>> The patch keeps some constants in registers. If the constant can not
>> be optimized, the cprop and combine passes can optimize them as what
>> we do in current expand pass with
>>
>> define_insn_and_split "*arm_subsi3_insn"
>> define_insn_and_split "*arm_andsi3_insn"
>> define_insn_and_split "*iorsi3_insn"
>> define_insn_and_split "*arm_xorsi3"
>>
>> The patch does not modify addsi3 since the define_insn_and_split
>> "*arm_addsi3" is only valid when (reload_completed ||
>> !arm_eliminable_register (operands[1])). The cprop and combine passes
>> can not optimize the large constant if we put it in register, which
>> will lead to regression.
>>
>> For logic operators, the patch skips changes for constants:
>>
>> INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2])
>>
>> since expand pass always uses "sign-extend" to get the value
>> (trunc_int_for_mode called from immed_wide_int_const) for rtl, and
>> logs show most negative values are UNSIGNED when they are TREE node.
>> And combine pass is smart enough to recover the negative value to
>> positive value.
>
> I am unable to verify any change in code generation for this testcase
> with and without the patch when I had a play with the patch.
>
> what gives ?

Thanks for trying the patch.

Do you add option -fno-gcse which is mentioned in dg-options " -O2
-fno-gcse "? Without it, there is no change for the testcase since
cprop pass will propagate the constant to AND expr (A patch to enhance
cprop pass was discussed at
https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html).

In addition, if the constant can not be hoisted out the loop, later
combine pass can also optimize it. These (cprop and combine) are
reasons why the patch itself has little impact on current tests.

Test case in the patch is updated since it does not work for armv7-m.

Thanks!
-Zhenqiang

> Ramana
>
>
>>
>> Bootstrap and no make check regression on Chrome book.
>> For coremark, dhrystone and eembcv1, no any code size and performance
>> change on Cortex-M4.
>> No any file in CSiBE has code size change for Cortex-A15 and Cortex-M4.
>> No Spec2000 performance regression on Chrome book and dumped assemble
>> codes only show very few difference.
>>
>> OK for trunk?
>>
>> Thanks!
>> -Zhenqiang
>>
>> ChangeLog:
>> 2014-08-05  Zhenqiang Chen  <zhenqiang.chen@linaro.org>
>>
>>         * config/arm/arm.md (subsi3, andsi3, iorsi3, xorsi3): Keep some
>>         large constants in register other than split them.
>>
>> testsuite/ChangeLog:
>> 2014-08-05  Zhenqiang Chen  <zhenqiang.chen@linaro.org>
>>
>>         * gcc.target/arm/maskdata.c: New test.
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index bd8ea8f..c8b3001 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1162,9 +1162,16 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (MINUS, SImode, NULL_RTX,
-	                      INTVAL (operands[1]), operands[0],
-	  		      operands[2], optimize && can_create_pseudo_p ());
+	  if (!const_ok_for_arm (INTVAL (operands[1]))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[1] = force_reg (SImode, operands[1]);
+	      emit_insn (gen_subsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (MINUS, SImode, NULL_RTX,
+				INTVAL (operands[1]), operands[0], operands[2],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */
@@ -2077,6 +2084,17 @@
 	      emit_insn (gen_thumb2_zero_extendqisi2_v6 (operands[0],
 							 operands[1]));
 	    }
+	  else if (!(const_ok_for_arm (INTVAL (operands[2]))
+		     || const_ok_for_arm (~INTVAL (operands[2]))
+			/* zero_extendhi instruction is efficient enough.  */
+		     || INTVAL (operands[2]) == 0xffff
+		     || (INTVAL (operands[2]) < 0
+			 && const_ok_for_arm (-INTVAL (operands[2]))))
+		   && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_andsi3 (operands[0], operands[1], operands[2]));
+	    }
 	  else
 	    arm_split_constant (AND, SImode, NULL_RTX,
 				INTVAL (operands[2]), operands[0],
@@ -2882,9 +2900,20 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (IOR, SImode, NULL_RTX,
-	                      INTVAL (operands[2]), operands[0], operands[1],
-			      optimize && can_create_pseudo_p ());
+	  if (!(const_ok_for_arm (INTVAL (operands[2]))
+		|| (TARGET_THUMB2
+		    && const_ok_for_arm (~INTVAL (operands[2])))
+		|| (INTVAL (operands[2]) < 0
+		    && const_ok_for_arm (-INTVAL (operands[2]))))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_iorsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (IOR, SImode, NULL_RTX,
+				INTVAL (operands[2]), operands[0], operands[1],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */
@@ -3052,9 +3081,18 @@
     {
       if (TARGET_32BIT)
         {
-          arm_split_constant (XOR, SImode, NULL_RTX,
-	                      INTVAL (operands[2]), operands[0], operands[1],
-			      optimize && can_create_pseudo_p ());
+	  if (!(const_ok_for_arm (INTVAL (operands[2]))
+		|| (INTVAL (operands[2]) < 0
+		     && const_ok_for_arm (-INTVAL (operands[2]))))
+	      && can_create_pseudo_p ())
+	    {
+	      operands[2] = force_reg (SImode, operands[2]);
+	      emit_insn (gen_xorsi3 (operands[0], operands[1], operands[2]));
+	    }
+	  else
+	    arm_split_constant (XOR, SImode, NULL_RTX,
+				INTVAL (operands[2]), operands[0], operands[1],
+				optimize && can_create_pseudo_p ());
           DONE;
 	}
       else /* TARGET_THUMB1 */

diff --git a/gcc/testsuite/gcc.target/arm/maskdata.c b/gcc/testsuite/gcc.target/arm/maskdata.c
new file mode 100644
index 0000000..6960c75
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/maskdata.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options " -O2 -fno-gcse " }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+
+#define MASK 0xfe00ff
+void maskdata (int * data, int len)
+{
+   int i = len;
+   for (; i > 0; i -= 2)
+    {
+      data[i] &= MASK;
+      data[i + 1] &= MASK;
+    }
+}
+/* { dg-final { scan-assembler-not "65536" } } */

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]