This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/52034] New: __builtin_copysign optimization suboptimal
- From: "drepper.fsp at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 29 Jan 2012 00:16:27 +0000
- Subject: [Bug tree-optimization/52034] New: __builtin_copysign optimization suboptimal
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52034
Bug #: 52034
Summary: __builtin_copysign optimization suboptimal
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: drepper.fsp@gmail.com
Even the most trivial use of __builtin_copysign is compiled to suboptimal code:
double f(double a, double b)
{
  return __builtin_copysign(a,b);
}
With gcc 4.6.2 this gets compiled to
movapd %xmm1, %xmm2
andpd .LC0(%rip), %xmm0
andpd .LC1(%rip), %xmm2
orpd %xmm2, %xmm0
ret
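There is no reason to copy %xmm1 to %xmm2: %xmm1 only holds the incoming argument b, which is not needed after the function returns, so it can be overwritten in place. This is sufficient: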
andpd .LC0(%rip), %xmm0
andpd .LC1(%rip), %xmm1
orpd %xmm1, %xmm0
ret
The same happens with more complicated code sequences.
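
For reference, the andpd/orpd sequence above can be sketched in portable C as follows. This is only an illustration of the bit manipulation, not what GCC does internally. The contents of .LC0 and .LC1 are not shown in this report, so the sketch assumes .LC0 is the mask that clears the sign bit and .LC1 is the mask that isolates it. The name copysign_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copysign done with integer masks, mirroring the
   and/and/or sequence.  Assumes double is a 64-bit IEEE 754 value. */
static double copysign_bits(double a, double b)
{
  /* Assumed value of .LC1: only the sign bit set.  Its complement is the
     assumed value of .LC0: everything except the sign bit. */
  const uint64_t sign_mask = UINT64_C(0x8000000000000000);
  uint64_t ua, ub;

  /* memcpy is the well-defined way to reinterpret the bits of a double. */
  memcpy(&ua, &a, sizeof ua);
  memcpy(&ub, &b, sizeof ub);

  ua &= ~sign_mask;   /* andpd .LC0: keep the magnitude of a */
  ub &= sign_mask;    /* andpd .LC1: keep the sign of b */
  ua |= ub;           /* orpd: combine the two */

  memcpy(&a, &ua, sizeof a);
  return a;
}
```

Note that b is modified in place here (ub &= sign_mask), just as the shorter sequence modifies %xmm1 directly; no second copy of b is needed.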