[Bug rtl-optimization/44141] Redundant loads and stores generated for AMD bdver1 target
jakub at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Wed Mar 28 10:53:00 GMT 2012
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-28 10:40:45 UTC ---
(In reply to comment #13)
> The expander now converts as shown below for unaligned moves with V2DF mode.
>
> if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL)
> {
> op0 = gen_lowpart (V4SFmode, op0);
> op1 = gen_lowpart (V4SFmode, op1);
> emit_insn (gen_sse_movups (op0, op1));
> return;
> }
>
> You mean conversion is not needed here?
No, I meant it should perhaps do:
rtx tem = gen_reg_rtx (V4SFmode);
emit_insn (gen_sse_movups (tem, gen_lowpart (V4SFmode, op1)));
emit_move_insn (op0, gen_lowpart (GET_MODE (op0), tem));
return;
or similar. Of course, in this case it can be done by changing the
patterns (note that lots of other insns emit {,v}mov{u,a}pd; which of
those should be changed?). I meant this as a general comment: a vector
mode changing subreg on the lhs of an insn is highly undesirable.
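
As a loose user-level analogue only (this is not GCC-internal code, and
the helper name load_v2df_via_v4sf is hypothetical), the same shape can be
sketched with SSE2 intrinsics: the unaligned move is done in the V4SF
"mode" into a temporary, and the reinterpretation as V2DF is applied to
that temporary as a source value, never to the destination. This assumes
an x86 target with SSE2 enabled.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; pulls in the SSE (__m128) ones too */

/* Hypothetical helper: load two doubles from a possibly unaligned address
   using the packed-single unaligned move (movups).  */
static __m128d
load_v2df_via_v4sf (const double *p)
{
  /* Do the move in the V4SF "mode" into a fresh temporary.  */
  __m128 tem = _mm_loadu_ps ((const float *) p);

  /* Reinterpret the temporary as V2DF.  The cast is a pure type change
     on a source value and generates no instruction of its own.  */
  return _mm_castps_pd (tem);
}
```

Called as load_v2df_via_v4sf (buf + 1) on a double array, this returns the
two doubles starting at buf[1], whatever the 16-byte alignment of that
address is.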