This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/48092] New: associative property of sqrt
- From: "vincenzo.innocente at cern dot ch" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 12 Mar 2011 14:57:26 +0000
- Subject: [Bug tree-optimization/48092] New: associative property of sqrt
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48092
Summary: associative property of sqrt
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: vincenzo.innocente@cern.ch
Is there any reason why, with -Ofast or -ffast-math, the associative properties
of sqrt are not exploited in the same way as, for instance, those of division?
Examples:
division (ok)
float div1(float a, float x, float y) {
return a/x/y;
0: f3 0f 59 ca mulss %xmm2,%xmm1
4: f3 0f 5e c1 divss %xmm1,%xmm0
}
sqrt
float sqrt1(float a, float x, float y) {
return a*std::sqrt(x)*std::sqrt(y);
10: f3 0f 51 d2 sqrtss %xmm2,%xmm2
14: f3 0f 51 c9 sqrtss %xmm1,%xmm1
18: f3 0f 59 ca mulss %xmm2,%xmm1
1c: f3 0f 59 c8 mulss %xmm0,%xmm1
}
20: 0f 28 c1 movaps %xmm1,%xmm0
and
float rsqrt1(float a, float x, float y) {
return a/std::sqrt(x)/std::sqrt(y);
30: f3 0f 51 c9 sqrtss %xmm1,%xmm1
34: f3 0f 51 d2 sqrtss %xmm2,%xmm2
38: f3 0f 59 d1 mulss %xmm1,%xmm2
3c: f3 0f 5e c2 divss %xmm2,%xmm0
}
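For reference, the rewrite being asked about for sqrt1 and rsqrt1 can be written by hand as below. This is only a sketch of the algebraic identity sqrt(x)*sqrt(y) = sqrt(x*y), not what GCC does internally; the function names with the _separate and _combined suffixes and the check_sqrt_rewrite helper are made up for illustration. The check also shows why the transformation is only acceptable under -ffast-math: for two negative inputs the separate form yields NaN while the combined form yields a finite value.

```cpp
#include <cassert>
#include <cmath>

// Form used in the report: two square roots, two multiplications.
float sqrt1_separate(float a, float x, float y) {
    return a * std::sqrt(x) * std::sqrt(y);
}

// Hand-combined form: one multiplication inside, one square root.
float sqrt1_combined(float a, float x, float y) {
    return a * std::sqrt(x * y);
}

// Same idea for the reciprocal case (rsqrt1 in the report).
float rsqrt1_separate(float a, float x, float y) {
    return a / std::sqrt(x) / std::sqrt(y);
}

float rsqrt1_combined(float a, float x, float y) {
    return a / std::sqrt(x * y);
}

// Hypothetical self-check, compiled WITHOUT -ffast-math.
void check_sqrt_rewrite() {
    // For these exactly representable inputs both forms agree.
    assert(sqrt1_separate(2.0f, 4.0f, 9.0f) == sqrt1_combined(2.0f, 4.0f, 9.0f));
    assert(rsqrt1_separate(12.0f, 4.0f, 9.0f) == rsqrt1_combined(12.0f, 4.0f, 9.0f));

    // For two negative inputs the forms differ: sqrt(-4) is NaN,
    // but (-4)*(-9) = 36 has a real square root.
    assert(std::isnan(sqrt1_separate(1.0f, -4.0f, -9.0f)));
    assert(sqrt1_combined(1.0f, -4.0f, -9.0f) == 6.0f);
}
```

The combined form can also overflow or underflow in x*y where the separate form would not, which is another reason the rewrite is not value-preserving in general.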
In this second case I would at least have expected the use of
"rsqrtss" to take precedence over the associative property of "div",
emitting the same code as below:
float rsqrt2(float a, float x, float y) {
return a/sqrtf(x*y);
70: f3 0f 59 ca mulss %xmm2,%xmm1
74: f3 0f 52 d9 rsqrtss %xmm1,%xmm3
78: f3 0f 59 cb mulss %xmm3,%xmm1
7c: f3 0f 59 cb mulss %xmm3,%xmm1
80: f3 0f 59 1d 00 00 00 mulss 0(%rip),%xmm3 # 88 <rsqrt2(float, float, float)+0x18>
87: 00
84: R_X86_64_PC32 .LC1+0xfffffffffffffffc
88: f3 0f 58 0d 00 00 00 addss 0(%rip),%xmm1 # 90 <rsqrt2(float, float, float)+0x20>
8f: 00
8c: R_X86_64_PC32 .LC0+0xfffffffffffffffc
90: f3 0f 59 cb mulss %xmm3,%xmm1
94: f3 0f 59 c8 mulss %xmm0,%xmm1
}
98: 0f 28 c1 movaps %xmm1,%xmm0
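The instruction sequence emitted for rsqrt2 has the shape of an rsqrtss estimate followed by one Newton-Raphson refinement step. A minimal sketch of that step in plain C++ follows. The values of the constants .LC0 and .LC1 are not visible in the listing; the sketch assumes -3.0 and -0.5, which is what a standard Newton-Raphson step for 1/sqrt(p) requires. The names refine_rsqrt and check_refine_rsqrt are hypothetical, and the initial estimate is passed in as a parameter rather than obtained from rsqrtss.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/sqrt(p), starting from an estimate r0.
// The operation order mirrors the listing: (p*r0*r0 + c0) * (r0*c1).
float refine_rsqrt(float p, float r0) {
    const float c0 = -3.0f;  // assumed value of .LC0
    const float c1 = -0.5f;  // assumed value of .LC1
    return (p * r0 * r0 + c0) * (r0 * c1);  // equals 0.5*r0*(3 - p*r0*r0)
}

// Hypothetical self-check: one step moves a rough estimate closer to 1/sqrt(4).
void check_refine_rsqrt() {
    const float exact = 0.5f;   // 1/sqrt(4)
    const float rough = 0.49f;  // stand-in for a low-precision hardware estimate
    const float refined = refine_rsqrt(4.0f, rough);
    assert(std::fabs(refined - exact) < std::fabs(rough - exact));
}
```

In the listing, p corresponds to the product x*y held in %xmm1, and the refined reciprocal square root is finally multiplied by a.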