This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/80854] New: hot path is slowed down when the cold return path is merged into it
- From: "nsz at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 22 May 2017 10:11:33 +0000
- Subject: [Bug tree-optimization/80854] New: hot path is slowed down when the cold return path is merged into it
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80854
Bug ID: 80854
Summary: hot path is slowed down when the cold return path is
merged into it
Product: gcc
Version: 7.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: nsz at gcc dot gnu.org
Target Milestone: ---
I see suboptimal code generation for
float foo (float x)
{
  if (__builtin_expect (x > 0, 0))
    if (x > 2) return 0;
  return x*x;
}
because merging the return paths adds extra register moves to the hot path:
https://godbolt.org/g/AZxxrR
x86_64:
foo:
        pxor    %xmm1, %xmm1
        ucomiss %xmm1, %xmm0
        ja      .L8
.L2:
        movaps  %xmm0, %xmm1    // extra reg move
        mulss   %xmm0, %xmm1
.L1:
        movaps  %xmm1, %xmm0    // extra reg move
        ret
.L8:
        ucomiss .LC1(%rip), %xmm0
        jbe     .L2
        jmp     .L1             // need not jmp back
.LC1:
        .long   1073741824
aarch64:
foo:
        fcmpe   s0, #0.0
        bgt     .L8
.L2:
        fmul    s1, s0, s0
.L1:
        fmov    s0, s1          // extra reg move
        ret
        .p2align 3
.L8:
        fmov    s2, 2.0e+0
        movi    v1.2s, #0
        fcmpe   s0, s2
        ble     .L2
        b       .L1             // need not jmp back
I wonder if gcc could do better when it has information about hot/cold paths
(by not merging the hot and cold return paths in some cases).
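To illustrate what "not merging the return paths" could look like at the source level, here is a hand-split variant of the test case. It is only a sketch: `foo_cold` and `foo_split` are made-up names, the `cold` and `noinline` function attributes are used to keep the unlikely branch out of line, and whether this actually removes the extra moves with gcc 7.1 has not been verified here.

```c
/* Hypothetical helper: the unlikely x > 0 branch, kept out of line so
   that its return paths are not shared with the hot path.  */
__attribute__((noinline, cold))
static float foo_cold(float x)
{
    if (x > 2)
        return 0;
    return x * x;
}

/* Same result as foo() in the report for every input; the hot path
   (x <= 0 or NaN) has its own multiply and return.  */
float foo_split(float x)
{
    if (__builtin_expect(x > 0, 0))
        return foo_cold(x);
    return x * x;
}
```

This trades the jump back into the hot path for a call on the cold path, so it is a workaround for comparison rather than the code gcc would be expected to emit itself.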