This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target
- From: "venkataramanan.kumar at amd dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sun, 01 Apr 2012 07:55:24 +0000
- Subject: [Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target
- Auto-submitted: auto-generated
- References: <bug-44141-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141
--- Comment #15 from Venkataramanan <venkataramanan.kumar at amd dot com> 2012-04-01 07:55:24 UTC ---
Hi Uros,
I had a look at the reload pass.
I have an RTL sequence that looks like this:
(insn 32 31 33 2 (set (subreg:V4SF (reg:V2DF 284) 0) <== pseudo register
(unspec:V4SF [
(mem/c:V4SF (plus:DI (reg/f:DI 20 frame)
(const_int -568 [0xfffffffffffffdc8])) [3
MEM[(real(kind=8)[26] *)&dsroo + 120B]+0 S16 A64])
] UNSPEC_MOVU)) test.f90:16 1106 {*sse_movups}
(nil))
(insn 33 32 371 2 (set (reg:V2DF 285) <= pseudo register
(div:V2DF (reg:V2DF 284) <== pseudo register
(mem/c:V2DF (plus:DI (reg/f:DI 20 frame)
(const_int -576 [0xfffffffffffffdc0])) [3
MEM[(real(kind=8)[26] *)&dsroo + 112B]+0 S16 A128]))) test.f90:16 1146
{sse2_divv2df3}
(nil))
Now reload examines the first RTL:
(insn 32 31 33 2 (set (subreg:V4SF (reg:V2DF 284) 0) <== pseudo register
(unspec:V4SF [
(mem/c:V4SF (plus:DI (reg/f:DI 20 frame)
(const_int -568 [0xfffffffffffffdc8])) [3
MEM[(real(kind=8)[26] *)&dsroo + 120B]+0 S16 A64])
] UNSPEC_MOVU)) test.f90:16 1106 {*sse_movups}
(nil))
Pseudo 284 did not get a hard register, so reload chooses xmm0 for it and
generates an output reload as follows:
(insn 393 0 0 (set (reg:V2DF 284)
(reg:V2DF 21 xmm0)) -1
There is a possibility of reload inheritance here, which would let us avoid
the input reload for the next RTL, insn 33:
(insn 33 32 371 2 (set (reg:V2DF 285)
(div:V2DF (reg:V2DF 284)
(mem/c:V2DF (plus:DI (reg/f:DI 20 frame)
(const_int -576 [0xfffffffffffffdc0])) [3
MEM[(real(kind=8)[26] *)&dsroo + 112B]+0 S16 A128]))) test.f90:16 1146
{sse2_divv2df3}
(nil))
But this is not happening, and an input reload gets generated again before
this RTL as follows:
(insn 395 0 0 (set (reg:V2DF 21 xmm0)
(reg:V2DF 284)
Also, another output reload gets emitted after insn 33 for its output operand:
(insn 394 0 0 (set (reg:V2DF 285)
(reg:V2DF 21 xmm0)) -1
But I am not sure whether this output reload prevented input reload
inheritance in insn 33.
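As a rough illustration of the inheritance expected here, the following is a toy model in Python. It is not GCC's reload code: the class and method names (ReloadTracker, output_reload, input_reload, clobber) are hypothetical, and the model assumes only that reload remembers which pseudo a hard register last held. Under that assumption, the output reload after insn 32 leaves xmm0 holding pseudo 284, so the input reload before insn 33 can be skipped.

```python
# Toy model of reload inheritance. This is NOT GCC's implementation;
# every name below is hypothetical and chosen for illustration only.

class ReloadTracker:
    def __init__(self):
        # hard register name -> pseudo number whose value it currently holds
        self.hard_reg_contents = {}
        # log of the reload moves that were actually emitted
        self.emitted = []

    def output_reload(self, pseudo, hard_reg):
        """The insn wrote its result to hard_reg; copy it into the pseudo."""
        self.emitted.append(f"store {hard_reg} -> pseudo {pseudo}")
        # After the copy, hard_reg still holds the pseudo's value.
        self.hard_reg_contents[hard_reg] = pseudo

    def input_reload(self, pseudo, hard_reg):
        """The insn reads the pseudo via hard_reg; return True if a load was emitted."""
        if self.hard_reg_contents.get(hard_reg) == pseudo:
            return False  # inherited: the value is already in hard_reg
        self.emitted.append(f"load pseudo {pseudo} -> {hard_reg}")
        self.hard_reg_contents[hard_reg] = pseudo
        return True

    def clobber(self, hard_reg):
        """Forget what hard_reg holds, e.g. after an unrelated write to it."""
        self.hard_reg_contents.pop(hard_reg, None)


tracker = ReloadTracker()
tracker.output_reload(284, "xmm0")               # after insn 32
needs_load = tracker.input_reload(284, "xmm0")   # before insn 33
tracker.output_reload(285, "xmm0")               # after insn 33
```

In this model no load is emitted before insn 33, which corresponds to insn 395 above being redundant. If the recorded contents of xmm0 were dropped between the two insns (modelled by clobber), the load would be emitted again, which matches the behaviour observed.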
I suspect emit_reload_insns is not preserving the output reload of insn 32
for later inheritance in insn 33.
Please provide your opinion.