This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/70465] [4.9/5/6/7 Regression] Poor code for x87 asm
- From: "vmakarov at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 31 Mar 2016 20:52:29 +0000
- Subject: [Bug target/70465] [4.9/5/6/7 Regression] Poor code for x87 asm
- Auto-submitted: auto-generated
- References: <bug-70465-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70465
Vladimir Makarov <vmakarov at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |vmakarov at gcc dot gnu.org
--- Comment #6 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
Neither IRA/LRA nor the old RA is or was aware of how to generate good code for
the fp stack.
GCC-6 generates the following before IRA (more precisely, before coloring in IRA):
(insn 16 4 17 2 (set (reg:DF 90 [ res ])
        (mem/c:DF (plus:SI (reg/f:SI 16 argp)
                (const_int 8 [0x8])) [1 x+0 S8 A32])) b3.c:6 126 {*movdf_internal}
     (nil))
(insn 17 16 8 2 (set (reg/v:DF 88 [ y ])
        (mem/c:DF (reg/f:SI 16 argp) [1 y+0 S8 A32])) b3.c:6 126 {*movdf_internal}
     (expr_list:REG_EQUIV (mem/c:DF (reg/f:SI 16 argp) [1 y+0 S8 A32])
        (nil)))
while gcc-4.3 has before global/reload:
(insn:HI 2 5 3 2 b3.c:6 (set (reg/v:DF 60 [ y ])
        (mem/c/i:DF (reg/f:SI 16 argp) [2 y+0 S8 A32])) 102 {*movdf_nointeger}
     (nil))
(insn:HI 3 2 4 2 b3.c:6 (set (reg/v:DF 61 [ x ])
        (mem/c/i:DF (plus:SI (reg/f:SI 16 argp)
                (const_int 8 [0x8])) [2 x+0 S8 A32])) 102 {*movdf_nointeger}
     (nil))
So gcc-4.3 was lucky to load y first and then x, while gcc-6 is unlucky to load
x first and then y.
There are a lot of PRs, usually with tiny tests, where the old RA (or reload)
generates better code. Unfortunately, it will always be that way, as RA is all
about heuristics. There are no opposite PRs where reload/old RA generates worse
code, because it is not used anymore.
In any case, if we exchange x and y in the argument list, gcc-4.3 will also
generate fxch.
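
To illustrate why the load order matters, here is a minimal toy model of the
x87 register stack. It is only a sketch, not GCC code: the names X87Stack, fld,
fxch and fxch_needed are hypothetical, and it assumes (inferred from the dumps
above, where res is loaded from x) that the asm wants x in st(0) and y in st(1).

```c
#include <assert.h>
#include <string.h>

/* Toy model of the x87 register stack; slot[0] is st(0), slot[1] is st(1).
   All names here are hypothetical and exist only for illustration. */
enum { DEPTH = 8 };

typedef struct {
    char slot[DEPTH];
    int  used;
} X87Stack;

/* fld: push a value; the old st(0) becomes st(1), and so on. */
static void fld(X87Stack *s, char name)
{
    assert(s->used < DEPTH);
    memmove(s->slot + 1, s->slot, (size_t)s->used);
    s->slot[0] = name;
    s->used++;
}

/* fxch: swap st(0) and st(1). */
static void fxch(X87Stack *s)
{
    char t;
    assert(s->used >= 2);
    t = s->slot[0];
    s->slot[0] = s->slot[1];
    s->slot[1] = t;
}

/* Load two values in the given order (one character per load), then
   return how many fxch instructions are needed so that want_st0 ends
   up in st(0).  Assumption: exactly two loads, as in this PR. */
static int fxch_needed(const char *load_order, char want_st0)
{
    X87Stack s = { {0}, 0 };
    int swaps = 0;
    const char *p;

    for (p = load_order; *p; p++)
        fld(&s, *p);
    if (s.slot[0] != want_st0) {
        fxch(&s);
        swaps++;
    }
    assert(s.slot[0] == want_st0);
    return swaps;
}
```

With this model, fxch_needed("yx", 'x') is 0 (the gcc-4.3 order: y, then x),
while fxch_needed("xy", 'x') is 1 (the gcc-6 order, or gcc-4.3 with the
arguments exchanged).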
Still, I think it can be fixed. update_equiv_regs transforms the code
2: r88:DF=[argp:SI]
3: r89:DF=[argp:SI+0x8]
16: r90:DF=r89:DF
into
16: r90:DF=[argp:SI+0x8]
17: r88:DF=[argp:SI]
This is the source of the fxch generation. If we exchange the places of insns
16 and 17, the fxch will be gone. However, I cannot guarantee that there will be
no new PRs, as such a change might result in worse code in some other cases.