This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [patch] tuning gcc for AMDFAM10 processor (patch 3)
- From: Richard Henderson <rth at redhat dot com>
- To: "Jagasia, Harsha" <harsha dot jagasia at amd dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 30 Jan 2007 10:00:26 -0800
- Subject: Re: [patch] tuning gcc for AMDFAM10 processor (patch 3)
- References: <D5B24B5251882048AD03DDFA431BB79059CF06@SAUSEXMB3.amd.com>
On Mon, Jan 29, 2007 at 07:12:44PM -0600, Jagasia, Harsha wrote:
> + xorps reg3, reg3
> + movaps reg3, reg2
Surely you're not advocating *moving* a zero. =)
> @@ -9434,6 +9491,13 @@ ix86_expand_vector_move_misalign (enum m
>      }
>    else
>      {
> +      if (TARGET_SSE_UNALIGNED_MOVE_OPTIMAL)
> +        {
> +          op0 = gen_lowpart (V2DFmode, op0);
> +          op1 = gen_lowpart (V2DFmode, op1);
> +          emit_insn (gen_sse2_movupd (op0, op1));
> +          return;
> +        }
>        /* ??? Not sure about the best option for the Intel chips.
>           The following would seem to satisfy; the register is
>           entirely cleared, breaking the dependency chain. We
> @@ -9453,7 +9517,16 @@ ix86_expand_vector_move_misalign (enum m
>    else
>      {
>        if (TARGET_SSE_PARTIAL_REG_DEPENDENCY)
> +        {
> +          if (TARGET_SSE_UNALIGNED_MOVE_OPTIMAL)
> +            {
> +              op0 = gen_lowpart (V4SFmode, op0);
> +              op1 = gen_lowpart (V4SFmode, op1);
> +              emit_insn (gen_sse_movups (op0, op1));
> +              return;
> +            }
>            emit_move_insn (op0, CONST0_RTX (mode));
> +        }
Un-nest both of these blocks from the IF they're inside.
TARGET_SSE_UNALIGNED_MOVE_OPTIMAL really has no bearing on
TARGET_SSE_PARTIAL_REG_DEPENDENCY or TARGET_SSE_SPLIT_REGS,
and should override both of them.
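
To make the difference concrete, here is a rough standalone model of the decision logic, not GCC code. The struct, enum and function names are hypothetical; the three booleans merely stand in for the tuning macros named above. The placement of the V2DF check in the else arm of the split-regs test is an assumption read off the hunk context.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the i386 tuning macros named in the review. */
struct sse_tuning {
    bool unaligned_move_optimal;  /* models TARGET_SSE_UNALIGNED_MOVE_OPTIMAL */
    bool split_regs;              /* models TARGET_SSE_SPLIT_REGS */
    bool partial_reg_dependency;  /* models TARGET_SSE_PARTIAL_REG_DEPENDENCY */
};

/* Simplified outcome: one full-width unaligned move (movupd/movups),
   or the existing sequence that loads the two halves separately. */
enum misalign_load { SINGLE_UNALIGNED_MOVE, SPLIT_HALF_LOADS };

/* V2DF hunk as posted (assumed nesting): the new check is only reached
   when the split-regs test is false. */
enum misalign_load v2df_as_posted(const struct sse_tuning *t)
{
    if (t->split_regs)
        return SPLIT_HALF_LOADS;
    if (t->unaligned_move_optimal)
        return SINGLE_UNALIGNED_MOVE;
    return SPLIT_HALF_LOADS;
}

/* Second hunk as posted: the new check is only reached when the
   partial-register-dependency test is true. */
enum misalign_load other_as_posted(const struct sse_tuning *t)
{
    if (t->partial_reg_dependency && t->unaligned_move_optimal)
        return SINGLE_UNALIGNED_MOVE;
    return SPLIT_HALF_LOADS;
}

/* Un-nested form: the unaligned-move check comes first and overrides
   the other two flags in both paths. */
enum misalign_load as_requested(const struct sse_tuning *t)
{
    if (t->unaligned_move_optimal)
        return SINGLE_UNALIGNED_MOVE;
    return SPLIT_HALF_LOADS;
}
```

With unaligned_move_optimal and split_regs set and partial_reg_dependency clear, both as-posted functions fall through to the split loads, while the un-nested form picks the single unaligned move.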
Otherwise ok.
r~