This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant
On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan
<ramana.radhakrishnan@linaro.org> wrote:
> Carrot,
>
> Sorry about the delayed response.
>
> On 3 July 2012 12:28, Carrot Wei <carrot@google.com> wrote:
>> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan
>> <ramana.radhakrishnan@linaro.org> wrote:
>>> On 28 May 2012 11:08, Carrot Wei <carrot@google.com> wrote:
>>>> Hi
>>>>
>>>> This is the second part of the patches that deals with 64bit and. It directly
>>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit
>>>> constant operands.
>>>>
>>>
>>> Comments about const_di_ok_for_op still apply from my review of your add patch.
>>>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>>>
>>> Finally this should target PR target/53189.
>>>
>>
>> Hi Ramana
>>
>> Thanks for the review. Following is the updated patch according to
>> your comments.
>
> You've missed answering this part of my review :)
>
>>> However I don't see why and /ior / xor with constants that have either
>>> the low or high parts set can't be expanded directly into ands of
>>> subregs with moves of zero's or the original value especially if you
>>> aren't looking at doing 64 bit operations in neon .With Neon being
>>> used for 64 bit arithmetic it gets more interesting.
>
It has been handled by the const_ok_for_dimode_op and the output part
of corresponding SI mode insn. Let's take the IOR case as an example.
In the const_ok_for_dimode_op patch
--- arm.c (revision 189278)
+++ arm.c (working copy)
@@ -2524,6 +2524,16 @@
case PLUS:
return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode);
+ case IOR:
+ if ((const_ok_for_arm (hi_val) || (hi_val == 0xFFFFFFFF))
+ && (const_ok_for_arm (lo_val) || (lo_val == 0xFFFFFFFF)))
+ return 1;
+ if (TARGET_THUMB2
+ && (const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val))
+ && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val)))
+ return 1;
+ return 0;
+
default:
return 0;
}
The 0xFFFFFFFF is not valid arm mode immediate, but ior 0XFFFFFFFF
results in all 1's, so it is also allowed in an iordi3 insn. And the
patch in iorsi3_insn pattern explicitly check the all 0's and all 1's
cases, and output either a simple register mov instruction or
instruction mov all 1's to the destination.
@@ -2902,10 +2915,29 @@
(ior:SI (match_operand:SI 1 "s_register_operand" "%r,r,r")
(match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))]
"TARGET_32BIT"
- "@
- orr%?\\t%0, %1, %2
- orn%?\\t%0, %1, #%B2
- #"
+ "*
+ {
+ if (CONST_INT_P (operands[2]))
+ {
+ HOST_WIDE_INT i = INTVAL (operands[2]) & 0xFFFFFFFF;
+ if (i == 0xFFFFFFFF)
+ return \"mvn%?\\t%0, #0\";
+ if (i == 0)
+ {
+ if (!rtx_equal_p (operands[0], operands[1]))
+ return \"mov%?\\t%0, %1\";
+ else
+ return \"\";
+ }
+ }
+
+ switch (which_alternative)
+ {
+ case 0: return \"orr%?\\t%0, %1, %2\";
+ case 1: return \"orn%?\\t%0, %1, #%B2\";
+ case 2: return \"#\";
+ }
+ }"
"TARGET_32BIT
&& GET_CODE (operands[2]) == CONST_INT
&& !(const_ok_for_arm (INTVAL (operands[2]))
> Is there any reason why we don't split such cases earlier into the
> constituent moves and the associated ands earlier than reload in the
> non-Neon case?
>
I referenced pattern arm_adddi3 which is split after reload_completed.
And the pattern arm_subdi3 is even not split. I guess they keep the
original constant longer may benefit some optimizations involving
constants. But it may also lose flexibility in other cases.
> In addition, it would be good to have some tests for Thumb2 that deal
> with the replicated constants for Thumb2 . Can you have a look at
> creating some tests similar to the thumb2*replicated*.c tests in
> gcc.target/arm but for 64 bit constants ?
>
The new test cases involving thumb2 replicated constants are added as following.
thanks
Carrot
2012-07-18 Wei Guozhi <carrot@google.com>
PR target/53189
* gcc.target/arm/pr53189-10.c: New testcase.
* gcc.target/arm/pr53189-11.c: New testcase.
* gcc.target/arm/pr53189-12.c: New testcase.
Index: pr53189-10.c
===================================================================
--- pr53189-10.c (revision 0)
+++ pr53189-10.c (revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "and.*#-16843010" } } */
+
+void t0p(long long * p)
+{
+ *p &= 0x9fefefefe;
+}
Index: pr53189-11.c
===================================================================
--- pr53189-11.c (revision 0)
+++ pr53189-11.c (revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "eor.*#-1426019584" } } */
+
+void t0p(long long * p)
+{
+ *p ^= 0x7ab00ab00;
+}
Index: pr53189-12.c
===================================================================
--- pr53189-12.c (revision 0)
+++ pr53189-12.c (revision 0)
@@ -0,0 +1,9 @@
+/* { dg-options "-mthumb -O2" } */
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-final { scan-assembler-not "mov" } } */
+/* { dg-final { scan-assembler "orr.*#13435085" } } */
+
+void t0p(long long * p)
+{
+ *p |= 0x500cd00cd;
+}