This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][GCC] Simplification of 1U << (31 - x)
- From: Sudi Das <Sudi dot Das at arm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, Jakub Jelinek <jakub at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>
- Date: Tue, 26 Sep 2017 12:44:10 +0000
- Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x)
- Authentication-results: sourceware.org; auth=none
- Authentication-results: spf=none (sender IP is ) smtp.mailfrom=Sudi dot Das at arm dot com;
- Nodisclaimer: True
- References: <AM5PR0802MB2610B3E04DF2484B04208CEC83020@AM5PR0802MB2610.eurprd08.prod.outlook.com> <20170413112151.GD1809@tucnak> <AM5PR0802MB2610B75CC3BDBA5C021B3DA083020@AM5PR0802MB2610.eurprd08.prod.outlook.com> <20170413114125.GE1809@tucnak> <CAFiYyc1Jk2hpuw1xnGD98SNQzzQHTzaoxFhq5C0ZeJVvZ2hODw@mail.gmail.com> <AM5PR0802MB2610D2CFF2BC0E6DF5E9093683020@AM5PR0802MB2610.eurprd08.prod.outlook.com> <VI1PR08MB265576266CC24A0553CCB94898B30@VI1PR08MB2655.eurprd08.prod.outlook.com>,<CAFiYyc0C79=vcjFN7e=uZZVHjORMoOyUpyo3rYd_H9XTPHoG6w@mail.gmail.com>
- Spamdiagnosticmetadata: NSPM
- Spamdiagnosticoutput: 1:99
Still waiting on Jakub's comment on whether there are more things needed at the backend. But I have updated the patch according to Richard's comments.
Thanks
Sudi
From: Richard Biener <richard.guenther@gmail.com>
Sent: Friday, August 4, 2017 11:16 AM
To: Sudi Das
Cc: Wilco Dijkstra; Jakub Jelinek; GCC Patches; nd; Richard Earnshaw; James Greenhalgh
Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x)
On Tue, Aug 1, 2017 at 11:14 AM, Sudi Das <Sudi.Das@arm.com> wrote:
>
>
>
>
> Sorry about the delayed response but looking at the above discussion, should I conclude that this is a valid tree simplification?
Yes, I think so. Jakub requested code to undo this at RTL expansion
based on target costs, not sure if we really should
require that from you given the user could have written the target
sequence himself.
Few comments about the patch:
+/* Fold (1 << (C - x)) where C = precision(type) - 1
+ into ((1 << C) >> x). */
+(simplify
+ (lshift integer_onep@0 (minus INTEGER_CST@1 @2))
I think this warrants a single_use check on the minus (note :s isn't enough
as with the unsigned case we'd happily ignore it by design).
+ (if (INTEGRAL_TYPE_P (type)
+ && TYPE_PRECISION (type) <= HOST_BITS_PER_WIDE_INT
+ && tree_to_uhwi (@1) == (unsigned)(TYPE_PRECISION (type) - 1))
You can relax this with using
&& wi::eq_p (@1, TYPE_PRECISION (type) - 1)
+ (if (TYPE_UNSIGNED(type))
+ (rshift (lshift @0 @1) @2)
+ (with
+ { tree utype = unsigned_type_for (type); }
+ (convert:type (rshift (lshift (convert:utype @0) @1) @2))))))
+
You can write (convert (rshift ...)), without the :type.
I'm leaving it to Jakub whether you need to write that RTL expansion tweak.
Thanks,
Richard.
> I am pasting the diff of the assembly that AArch64 generates with the test case that I added. I see fewer instructions generated with the patch.
>
> --- pr80131-1.s 2017-08-01 10:02:43.243374174 +0100
> +++ pr80131-1.s-patched 2017-08-01 10:00:54.776455630 +0100
> @@ -24,10 +24,8 @@
> str x0, [sp, 8]
> ldr x0, [sp, 8]
> mov w1, w0
> - mov w0, 63
> - sub w0, w0, w1
> - mov x1, 1
> - lsl x0, x1, x0
> + mov x0, -9223372036854775808
> + lsr x0, x0, x1
> add sp, sp, 16
> ret
> .size f2, .-f2
> @@ -39,10 +37,8 @@
> str x0, [sp, 8]
> ldr x0, [sp, 8]
> mov w1, w0
> - mov w0, 63
> - sub w0, w0, w1
> - mov x1, 1
> - lsl x0, x1, x0
> + mov x0, -9223372036854775808
> + lsr x0, x0, x1
> add sp, sp, 16
> ret
> .size f3, .-f3
> @@ -52,11 +48,9 @@
> f4:
> sub sp, sp, #16
> str w0, [sp, 12]
> - mov w1, 31
> ldr w0, [sp, 12]
> - sub w0, w1, w0
> - mov w1, 1
> - lsl w0, w1, w0
> + mov w1, -2147483648
> + lsr w0, w1, w0
> add sp, sp, 16
> ret
> .size f4, .-f4
>
>
> Thanks
>
> Sudi
>
>
>
>
> From: Wilco Dijkstra
> Sent: Thursday, April 13, 2017 1:01 PM
> To: Richard Biener; Jakub Jelinek
> Cc: Sudi Das; GCC Patches; nd; Richard Earnshaw; James Greenhalgh
> Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x)
>
> Richard Biener wrote:
>> It is IMHO a valid GIMPLE optimization / canonicalization.
>>
>> movabsq $-9223372036854775808, %rax
>>
>> so this should then have been generated as 1<<63?
>>
>> At some point variable shifts were quite expensive as well..
>
> Yes I don't see a major difference between movabsq and
>
> movl $1, %eax
> salq $63, %rax
>
> on my Sandy Bridge, but if the above is faster then that is what the x64
> backend should emit - it's 1 byte smaller as well, so probably better in all
> cases.
>
> Wilco
diff --git a/gcc/match.pd b/gcc/match.pd
index e9017e4..160c12d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -600,6 +600,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& tree_nop_conversion_p (type, TREE_TYPE (@1)))
(lshift @0 @2)))
+/* Fold (1 << (C - x)) where C = precision(type) - 1
+ into ((1 << C) >> x). */
+(simplify
+ (lshift integer_onep@0 (minus@1 INTEGER_CST@2 @3))
+ (if (INTEGRAL_TYPE_P (type)
+ && wi::eq_p (@2, TYPE_PRECISION (type) - 1)
+ && single_use (@1))
+ (if (TYPE_UNSIGNED(type))
+ (rshift (lshift @0 @2) @3)
+ (with
+ { tree utype = unsigned_type_for (type); }
+ (convert (rshift (lshift (convert:utype @0) @2) @3))))))
+
/* Fold (C1/X)*C2 into (C1*C2)/X. */
(simplify
(mult (rdiv@3 REAL_CST@0 @1) REAL_CST@2)
diff --git a/gcc/testsuite/gcc.dg/pr80131-1.c b/gcc/testsuite/gcc.dg/pr80131-1.c
new file mode 100644
index 0000000..317ea3e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr80131-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-gimple" } */
+
+/* Checks the simplification of:
+ 1 << (C - x) to (1 << C) >> x, where C = precision (type) - 1
+ f1 is not simplified but f2, f3 and f4 are. */
+
+__INT64_TYPE__ f1 (__INT64_TYPE__ i)
+{
+ return (__INT64_TYPE__)1 << (31 - i);
+}
+
+__INT64_TYPE__ f2 (__INT64_TYPE__ i)
+{
+ return (__INT64_TYPE__)1 << (63 - i);
+}
+
+__UINT64_TYPE__ f3 (__INT64_TYPE__ i)
+{
+ return (__UINT64_TYPE__)1 << (63 - i);
+}
+
+__INT32_TYPE__ f4 (__INT32_TYPE__ i)
+{
+ return (__INT32_TYPE__)1 << (31 - i);
+}
+
+/* { dg-final { scan-tree-dump-times "= 31 -" 1 "gimple" } } */
+/* { dg-final { scan-tree-dump-times "9223372036854775808 >>" 2 "gimple" } } */
+/* { dg-final { scan-tree-dump "2147483648 >>" "gimple" } } */