This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Cc: nd <nd at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>
- Date: Mon, 30 Oct 2017 13:54:12 +0000
- Subject: Re: [PATCH][ARM] Remove DImode expansions for 1-bit shifts
Kyrill Tkachov wrote:
> On 16/10/17 12:30, Wilco Dijkstra wrote:
> > DImode right shifts of 1 are rarely used (6 in total in the GCC binary),
> > so there is little benefit of the arm_ashrdi3_1bit and arm_lshrdi3_1bit
> > patterns.
>
> ... but it's still used, and the patterns were put there for a reason.
> Even if GCC itself doesn't use them much they may be used by other
> applications.
>
> So I'd support removing the left shift 1-bit expansions, but not the
> right shift ones.
The purpose of removing the shift-by-1 cases is not just to clean up the
code but also to improve code generation. These shifts cannot be expanded
early and thus don't benefit from optimizations such as shift merging.
They also suffer from the DImode register allocation issues.
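To illustrate what "expanding" a DImode shift means on a 32-bit target,
here is a rough C sketch of a 64-bit right shift by 1 written as operations
on two 32-bit halves. This is only a conceptual model, not the RTL that GCC
actually emits; the u64_halves type and the function names are made up for
the example. Once the shift is visible in this form, the ordinary 32-bit
optimizers can work on the individual operations.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit value kept in two 32-bit registers. */
typedef struct { uint32_t lo, hi; } u64_halves;

/* Logical shift right by 1: bit 0 of the high half moves into
   bit 31 of the low half, and the high half is zero-filled. */
static u64_halves lshr1_halves (u64_halves v)
{
  u64_halves r;
  r.lo = (v.lo >> 1) | (v.hi << 31);
  r.hi = v.hi >> 1;
  return r;
}

/* Arithmetic shift right by 1: same as above, but the sign bit of
   the high half is preserved.  Unsigned arithmetic is used throughout
   to avoid relying on implementation-defined signed shifts. */
static u64_halves ashr1_halves (u64_halves v)
{
  u64_halves r;
  r.lo = (v.lo >> 1) | (v.hi << 31);
  r.hi = (v.hi >> 1) | (v.hi & 0x80000000u);
  return r;
}
```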
As a simple example this loop runs >20% faster with my patch on both
Cortex-A53 and Cortex-A57 when built with -mfpu=vfp:
long long loop1 (long long x, long long y, int n)
{
  int i;
  for (i = 0; i < n; i++)
    {
      x >>= 1;
      x |= y;
      x >>= 1;
      x |= y;
      x >>= 1;
      x |= y;
      x >>= 1;
      x |= y;
    }
  return x;
}
So given these shifts are bad for performance, why have them at all?
Wilco