This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant


On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> Carrot,
>
> Sorry about the delayed response.
>
> On 3 July 2012 12:28, Carrot Wei <carrot@google.com> wrote:
>> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
>> <ramana.radhakrishnan@linaro.org> wrote:
>>> On 28 May 2012 11:08, Carrot Wei <carrot@google.com> wrote:
>>>> Hi
>>>>
>>>> This is the second part of the patches that deals with 64bit and. It directly
>>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>>> constant operands.
>>>>
>>>
>>> Comments about const_di_ok_for_op still apply from my review of your add patch.
>>>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>>>
>>> Finally this should target PR target/53189.
>>>
>>
>> Hi Ramana
>>
>> Thanks for the review. Following is the updated patch according to
>> your comments.
>
> You've missed answering this part of my review :)
>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>
It has been handled by the const_ok_for_dimode_op and the output part
of corresponding SI mode insn. Let's take the IOR case as an example.

In the const_ok_for_dimode_op patch

--- arm.c	(revision 189278)
+++ arm.c	(working copy)
@@ -2524,6 +2524,16 @@
     case PLUS:
       return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);

+    case IOR:
+      if ((const_ok_for_arm (hi_val) || (hi_val == 0xFFFFFFFF))
+	  && (const_ok_for_arm (lo_val) || (lo_val == 0xFFFFFFFF)))
+	return 1;
+      if (TARGET_THUMB2
+	  && (const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val))
+	  && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val)))
+	return 1;
+      return 0;
+
     default:
       return 0;
     }

The 0xFFFFFFFF is not valid arm mode immediate, but ior 0XFFFFFFFF
results in all 1's, so it is also allowed in an iordi3 insn. And the
patch in iorsi3_insn pattern explicitly check the all 0's and all 1's
cases, and output either a simple register mov instruction or
instruction mov all 1's to the destination.

@@ -2902,10 +2915,29 @@
 	(ior:SI (match_operand:SI 1 "s_register_operand" "%r,r,r")
 		(match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))]
   "TARGET_32BIT"
-  "@
-   orr%?\\t%0, %1, %2
-   orn%?\\t%0, %1, #%B2
-   #"
+  "*
+  {
+    if (CONST_INT_P (operands[2]))
+      {
+	HOST_WIDE_INT i = INTVAL (operands[2]) & 0xFFFFFFFF;
+	if (i == 0xFFFFFFFF)
+	  return \"mvn%?\\t%0, #0\";
+	if (i == 0)
+	  {
+	    if (!rtx_equal_p (operands[0], operands[1]))
+	      return \"mov%?\\t%0, %1\";
+	    else
+	      return \"\";
+	  }
+      }
+
+    switch (which_alternative)
+      {
+      case 0: return \"orr%?\\t%0, %1, %2\";
+      case 1: return \"orn%?\\t%0, %1, #%B2\";
+      case 2: return \"#\";
+      }
+  }"
   "TARGET_32BIT
    && GET_CODE (operands[2]) == CONST_INT
    && !(const_ok_for_arm (INTVAL (operands[2]))


> Is there any reason why we don't split such cases earlier into the
> constituent moves and the associated ands earlier than reload in the
> non-Neon case?
>
I referenced pattern arm_adddi3 which is split after reload_completed.
And the pattern arm_subdi3 is even not split. I guess they keep the
original constant longer may benefit some optimizations involving
constants. But it may also lose flexibility in other cases.

>  In addition, it would be good to have some tests for Thumb2 that deal
> with the replicated constants for Thumb2 . Can you have a look at
> creating some tests similar to the thumb2*replicated*.c tests in
> gcc.target/arm but for 64 bit constants ?
>

The new test cases involving thumb2 replicated constants are added as following.

thanks
Carrot



2012-07-18  Wei Guozhi  <carrot@google.com>

	PR target/53189
	* gcc.target/arm/pr53189-10.c: New testcase.
	* gcc.target/arm/pr53189-11.c: New testcase.
	* gcc.target/arm/pr53189-12.c: New testcase.



Index: pr53189-10.c
===================================================================
--- pr53189-10.c	(revision 0)
+++ pr53189-10.c	(revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "and.*#-16843010" } } */
+
+void t0p(long long * p)
+{
+  *p &= 0x9fefefefe;
+}


Index: pr53189-11.c
===================================================================
--- pr53189-11.c	(revision 0)
+++ pr53189-11.c	(revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "eor.*#-1426019584" } } */
+
+void t0p(long long * p)
+{
+  *p ^= 0x7ab00ab00;
+}


Index: pr53189-12.c
===================================================================
--- pr53189-12.c	(revision 0)
+++ pr53189-12.c	(revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" }  */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "orr.*#13435085" } } */
+
+void t0p(long long * p)
+{
+  *p |= 0x500cd00cd;
+}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]