[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target

jakub at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Mar 28 10:53:00 GMT 2012


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141

--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-28 10:40:45 UTC ---
(In reply to comment #13)
> The expander now converts as shown below for unaligned moves with V2DF mode.
> 
>             if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
>                 {
>                   op0 = gen_lowpart (V4SFmode, op0);
>                   op1 = gen_lowpart (V4SFmode, op1);
>                   emit_insn (gen_sse_movups (op0, op1));
>                   return;
>                 }
> 
> You mean conversion is not needed here?  

No, I meant it should perhaps do:
  rtx tem = gen_reg_rtx (V4SFmode);
  emit_insn (gen_sse_movups (tem, gen_lowpart (V4SFmode, op1)));
  emit_move_insn (op0, gen_lowpart (GET_MODE (op0), tem));
  return;
or similar.  Of course, in this case it can be done by changing the
patterns instead (note that lots of other insns emit {,v}mov{u,a}pd;
which of those should be changed?).  I meant this as a general comment: a
vector-mode-changing subreg on the lhs of an insn is highly undesirable.
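
For illustration only, the same idea can be sketched at the user level with
SSE2 intrinsics rather than RTL.  This is not the expander code; the function
name load_v2df_via_v4sf is hypothetical and an x86 target with SSE2 is
assumed.  The load writes a fresh temporary in the load's own mode (V4SF),
and the change to V2DF happens afterwards on the source side of a plain move:

```c
#include <emmintrin.h>  /* SSE2: __m128, __m128d, _mm_loadu_ps, _mm_castps_pd */

/* Hypothetical helper: fetch two possibly unaligned doubles using the
   single-precision unaligned load (movups). */
static __m128d load_v2df_via_v4sf(const double *p)
{
    /* The load targets a fresh V4SF temporary, so the destination of the
       load is never a mode-changed view of something else. */
    __m128 tem = _mm_loadu_ps((const float *)p);

    /* Reinterpret the temporary as V2DF; this is a bit-for-bit cast and
       performs no value conversion. */
    return _mm_castps_pd(tem);
}
```

The result should hold the same bits as a direct _mm_loadu_pd (p) would;
whether the compiler actually emits movups for it depends on the target
tuning.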


