This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/44141] Redundant loads and stores generated for AMD bdver1 target


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141

--- Comment #4 from Venkataramanan <venkataramanan.kumar at amd dot com> 2012-03-22 13:17:34 UTC ---
I don't have permission to confirm this bug.

Here is my analysis of the cause.

#(insn:TI 4886 4885 4888 132 (set (reg:V2DF 25 xmm4 [8797])
#        (mult:V2DF (reg:V2DF 25 xmm4 [8795])
#            (reg:V2DF 22 xmm1 [8758]))) ac.f90:499 1138 {*mulv2df3}
#     (nil))
        vmulpd  %xmm1, %xmm4, %xmm4     # 4886  *mulv2df3/2     [length = 4]

We are forcing a conversion from V2DF to V4SF mode here for unaligned moves
when TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL is set.

(-----Snip ix86_expand_vector_move_misalign-----)
            case V2DFmode:
              if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
                {
                  op0 = gen_lowpart (V4SFmode, op0);
                  op1 = gen_lowpart (V4SFmode, op1);
                  emit_insn (gen_sse_movups (op0, op1));
                  return;
                }
(-----Snip-----) 
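
To make the effect of the two gen_lowpart calls concrete, here is a minimal sketch in C using the GCC vector extensions. It is not code from GCC itself: the helper names are hypothetical, and memcpy merely stands in for the unaligned move instructions. It only shows that viewing a V2DF value as V4SF reinterprets the same 16 bytes, so the stored doubles are unchanged.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extension types; the names mirror the RTL modes.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef float  v4sf __attribute__ ((vector_size (16)));

/* Hypothetical helper: store two doubles after viewing them as V4SF,
   roughly what the gen_lowpart + movups sequence does.  */
static void
store_unaligned_as_v4sf (void *dst, v2df x)
{
  assert (sizeof (v2df) == sizeof (v4sf));
  v4sf bits = (v4sf) x;               /* bit reinterpretation, no value conversion */
  memcpy (dst, &bits, sizeof bits);   /* stands in for movups */
}

/* Hypothetical helper: read the same bytes back as two doubles.  */
static v2df
load_unaligned_as_v2df (const void *src)
{
  v2df x;
  memcpy (&x, src, sizeof x);         /* stands in for movupd */
  return x;
}

/* Returns 1 if both doubles survive the V4SF store unchanged.  */
static int
v4sf_store_preserves_doubles (double a, double b)
{
  unsigned char buf[1 + sizeof (v2df)];
  v2df in = { a, b };

  /* buf + 1 carries no 16-byte alignment guarantee.  */
  store_unaligned_as_v4sf (buf + 1, in);
  v2df out = load_unaligned_as_v2df (buf + 1);
  return out[0] == a && out[1] == b;
}
```

Calling v4sf_store_preserves_doubles (1.5, -2.25) should return 1. The mode change costs nothing in terms of data; the problem described in this comment is only about how the register allocator handles the mode mismatch afterwards.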

This conversion generates RTL as shown below.

#(insn:TI 4888 4886 4890 132 (set (mem/c:V4SF (plus:DI (reg/f:DI 7 sp)
#                (const_int 6136 [0x17f8])) [3 MEM[(real(kind=8)[26] *)&dclroo + 152B]+0 S16 A64])
#        (unspec:V4SF [
#                (reg:V4SF 25 xmm4 [8797])
#            ] UNSPEC_MOVU)) ac.f90:499 1104 {*sse_movups}
#     (expr_list:REG_DEAD (reg:V4SF 25 xmm4 [8797])
#        (nil)))
        vmovups %xmm4, 6136(%rsp)       # 4888  *sse_movups/2   [length = 9]

GCC then does not know how to get back to V2DF mode in a register. As Uros
said, it reloads the value through memory.

#(insn 4930 4929 8259 132 (set (reg:V4SF 23 xmm2)
#        (unspec:V4SF [
#                (mem/c:V4SF (plus:DI (reg/f:DI 7 sp)
#                        (const_int 6136 [0x17f8])) [3 MEM[(real(kind=8)[26] *)&dclroo + 152B]+0 S16 A64])
#            ] UNSPEC_MOVU)) ac.f90:503 1104 {*sse_movups}
#     (nil))
        vmovups 6136(%rsp), %xmm2       # 4930  *sse_movups/1   [length = 9]
#(insn:TI 8259 4930 8261 132 (set (mem/c:V4SF (plus:DI (reg/f:DI 7 sp)
#                (const_int 240 [0xf0])) [12 %sfp+-11184 S16 A128])
#        (reg:V4SF 23 xmm2)) ac.f90:503 1098 {*movv4sf_internal}
#     (expr_list:REG_DEAD (reg:V4SF 23 xmm2)
#        (nil)))
        vmovaps %xmm2, 240(%rsp)        # 8259  *movv4sf_internal/3     [length = 9]
#(insn 8261 8259 4931 132 (set (reg:V2DF 23 xmm2)
#        (mem/c:V2DF (plus:DI (reg/f:DI 7 sp)
#                (const_int 240 [0xf0])) [12 %sfp+-11184 S16 A128])) ac.f90:503 1100 {*movv2df_internal}
#     (nil))
        vmovaps 240(%rsp), %xmm2        # 8261  *movv2df_internal/2     [length = 9]
#(insn:TI 4931 8261 8260 132 (set (reg:V2DF 23 xmm2)
#        (div:V2DF (reg:V2DF 23 xmm2)
#            (mem/c:V2DF (plus:DI (reg/f:DI 7 sp)
#                    (const_int 6128 [0x17f0])) [3 MEM[(real(kind=8)[26] *)&dclroo + 144B]+0 S16 A128]))) ac.f90:503 1144 {sse2_divv2df3}
#     (nil))
        vdivpd  6128(%rsp), %xmm2, %xmm2        # 4931  sse2_divv2df3/2 [length = 9]
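
The data flow of insns 4930, 8259 and 8261 can be modelled in C as follows. This is only an illustrative sketch with hypothetical function names, written with the GCC vector extensions. It does not reproduce the bug, and what an optimizing compiler actually emits for it depends on the version and flags. It shows that the store to the spill slot and the reload in V2DF mode yield exactly the same bytes as a plain in-register reinterpretation, which is why the extra vmovaps pair is redundant.

```c
#include <assert.h>
#include <string.h>

typedef double v2df __attribute__ ((vector_size (16)));
typedef float  v4sf __attribute__ ((vector_size (16)));

/* Models insns 4930, 8259 and 8261: V4SF load, spill, V2DF reload.  */
static v2df
reload_through_memory (const void *unaligned_src)
{
  v4sf tmp;
  v4sf spill_slot;                                 /* models the stack slot at 240(%rsp) */
  v2df result;

  assert (sizeof (v4sf) == sizeof (v2df));
  memcpy (&tmp, unaligned_src, sizeof tmp);        /* vmovups mem -> xmm2 */
  spill_slot = tmp;                                /* vmovaps xmm2 -> stack slot */
  memcpy (&result, &spill_slot, sizeof result);    /* vmovaps stack slot -> xmm2 */
  return result;
}

/* The same mode change done without the extra store and load.  */
static v2df
reinterpret_in_register (const void *unaligned_src)
{
  v4sf tmp;

  memcpy (&tmp, unaligned_src, sizeof tmp);        /* vmovups mem -> xmm2 */
  return (v2df) tmp;                               /* same bits, viewed as V2DF */
}

/* Returns 1 if both paths produce the original two doubles.  */
static int
both_paths_agree (double a, double b)
{
  double src[2] = { a, b };
  v2df x = reload_through_memory (src);
  v2df y = reinterpret_in_register (src);

  return x[0] == a && x[1] == b && y[0] == a && y[1] == b;
}
```

For example, both_paths_agree (3.0, 4.5) should return 1.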

