This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][GCC] Simplification of 1U << (31 - x)
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: Sudi Das <Sudi dot Das at arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, James Greenhalgh <James dot Greenhalgh at arm dot com>
- Date: Thu, 13 Apr 2017 13:41:25 +0200
- Subject: Re: [PATCH][GCC] Simplification of 1U << (31 - x)
- References: <AM5PR0802MB2610B3E04DF2484B04208CEC83020@AM5PR0802MB2610.eurprd08.prod.outlook.com> <20170413112151.GD1809@tucnak> <AM5PR0802MB2610B75CC3BDBA5C021B3DA083020@AM5PR0802MB2610.eurprd08.prod.outlook.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Thu, Apr 13, 2017 at 11:33:12AM +0000, Wilco Dijkstra wrote:
> Jakub Jelinek wrote:
>
> > No. Some constants sometimes need even 7 instructions (e.g. on sparc64; not
> > talking in particular about the 1ULL << 63 constant), or need one instruction
> > that is more expensive than a normal small constant load. Compare, say, x86_64
> > movl/movq vs. movabsq; I think the latter has 3 times longer latency on many
> > CPUs. So no, I think it isn't an unconditional win.
>
> We're specifically only talking about the constants (1L << 63), (1 << 31) and (1 << 15).
> On all targets these need at most 2 simple instructions. That makes it an unconditional win.
It is not a win on at least Haswell-E:
__attribute__((noinline, noclone)) unsigned long long int
foo (int x)
{
  asm volatile ("" : : : "memory");
  return 1ULL << (63 - x);
}

__attribute__((noinline, noclone)) unsigned long long int
bar (int x)
{
  asm volatile ("" : : : "memory");
  return (1ULL << 63) >> x;
}

int
main (int argc, const char **argv)
{
  int i;
  if (argc == 1)
    for (i = 0; i < 1000000000; i++)
      asm volatile ("" : : "r" (foo (13)));
  else
    for (i = 0; i < 1000000000; i++)
      asm volatile ("" : : "r" (bar (13)));
  return 0;
}
$ time /tmp/test
real 0m1.290s
user 0m1.288s
sys 0m0.002s
$ time /tmp/test 1
real 0m1.542s
user 0m1.540s
sys 0m0.002s
The second run, which uses the (1ULL << 63) >> x form, is the slower one.
As I said, movabsq is expensive.
Jakub