[Bug target/64897] New: Floating-point "and" not optimized on x86-64
schnetter at gmail dot com
gcc-bugzilla@gcc.gnu.org
Sun Feb 1 19:46:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64897
Bug ID: 64897
Summary: Floating-point "and" not optimized on x86-64
Product: gcc
Version: 4.9.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: schnetter at gmail dot com
I notice that gcc does not generate "vandpd" for bitwise "and" operations
on floating-point values. The following example demonstrates this:
{{{
#include <math.h>
#include <string.h>
double fand1(double x)
{
    unsigned long ix;
    memcpy(&ix, &x, 8);
    ix &= 0x7fffffffffffffffUL;
    memcpy(&x, &ix, 8);
    return x;
}

double fand2(double x)
{
    return fabs(x);
}
}}}
When I compile this via:
{{{
gcc-mp-4.9 -O3 -march=native -S fand.c -o fand-gcc-4.9.s
}}}
(OS X, x86-64 CPU, Intel Core i7), this results in:
{{{
_fand1:
        movabsq $9223372036854775807, %rax
        vmovd   %xmm0, %rdx
        andq    %rdx, %rax
        vmovd   %rax, %xmm0
        ret
_fand2:
        vmovsd  LC1(%rip), %xmm1
        vandpd  %xmm1, %xmm0, %xmm0
        ret
}}}
This shows that (a) gcc performs the bitwise "and" in an integer register,
which requires moving the value out of the SSE register and back again and is
probably slower, and (b) the implementors of "fabs" assume that using the
"vandpd" instruction is faster.
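For reference, here is a rough sketch of the operation I would expect gcc to end up with, written with SSE2 intrinsics, together with a check that it agrees bit-for-bit with fabs. This is only an illustration, not a proposed patch: the names fand_sse2, same_bits and fand_matches_fabs are made up for this example, the code assumes IEEE 754 binary64 doubles, and the intrinsic variant is only compiled when the compiler defines __SSE2__.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Same operation as fand1 above, using a fixed-width integer type. */
static double fand1(double x)
{
    uint64_t ix;
    memcpy(&ix, &x, sizeof ix);
    ix &= UINT64_C(0x7fffffffffffffff);   /* clear the sign bit */
    memcpy(&x, &ix, sizeof x);
    return x;
}

#if defined(__SSE2__)
/* Hypothetical helper: apply the same mask without leaving the SSE register. */
static double fand_sse2(double x)
{
    const __m128d mask =
        _mm_castsi128_pd(_mm_set1_epi64x(INT64_C(0x7fffffffffffffff)));
    return _mm_cvtsd_f64(_mm_and_pd(_mm_set_sd(x), mask));
}
#endif

/* Compare two doubles by their bit patterns, so that -0.0 and +0.0 differ. */
static int same_bits(double a, double b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Returns 1 if every variant yields the same bits as fabs(x). */
static int fand_matches_fabs(double x)
{
    int ok = same_bits(fand1(x), fabs(x));
#if defined(__SSE2__)
    ok = ok && same_bits(fand_sse2(x), fabs(x));
#endif
    return ok;
}
```

Calling fand_matches_fabs with values such as -1.5, -0.0 or -INFINITY should return 1, which is why I would expect fand1 and fand2 to be able to compile to the same instruction sequence.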