This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RS6000] reload_vsx_from_gprsf splitter
- From: David Edelsohn <dje dot gcc at gmail dot com>
- To: Alan Modra <amodra at gmail dot com>, Ulrich Weigand <uweigand at de dot ibm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Michael R Meissner <mrmeissn at us dot ibm dot com>
- Date: Thu, 11 Feb 2016 09:34:29 -0800
- Subject: Re: [RS6000] reload_vsx_from_gprsf splitter
- Authentication-results: sourceware.org; auth=none
- References: <20160211140410 dot GC10888 at bubble dot grove dot modra dot org>
On Thu, Feb 11, 2016 at 6:04 AM, Alan Modra <amodra@gmail.com> wrote:
> This is PR68973 part 2, the failure of a boost test, caused by a
> splitter emitting an invalid move in reload_vsx_from_gprsf:
> emit_move_insn (op0_di, op2);
>
> op0 can be any vsx reg, but the mtvsrd destination constraint in
> movdi_internal64 is "wj", which only allows fp regs. I'm not sure why
> we have that constraint so left movdi_internal64 alone and used a
> special p8_mtvsrd instead, similar to p8_mtvsrd_1 and p8_mtvsrd_2 used
> by reload_vsx_from_gpr<mode>. When looking at those, I noticed we're
> restricted to fprs there too so fixed that as well. (We can't use %L
> in asm operands that must accept vsx.)
Michael, please review the "wj" constraint in these patterns.
Alan, the explanation says that it uses a special p8_mtvsrd similar to
p8_mtvsrd_[12], but does not explain why the two similar patterns are
removed. The incorrect use of %L implies those patterns, but the
change is to p8_xxpermdi_<mode> that is not mentioned in the
ChangeLog.
I also would appreciate Uli's comments on this direction because of
his reload expertise.
Thanks, David
>
> Bootstrapped and regression tested powerpc64le-linux. powerpc64-linux
> biarch -mcpu=power7 bootstrap still in progress. OK to apply assuming
> no regressions found?
>
> PR target/68973
> * config/rs6000/vsx.md (vsx_xscvspdpn_directmove): Delete.
> * config/rs6000/rs6000.md (reload_vsx_from_gprsf): Rewrite splitter.
> (p8_mtvsrd): New.
> (p8_mtvsrd_1, p8_mtvsrd_2): Delete.
> (reload_vsx_from_gpr<mode>): Adjust to use p8_mtvsrd.
>
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index cdbf873..745293b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -7543,29 +7543,22 @@
> (set_attr "type" "three")])
>
> ;; Move 128 bit values from GPRs to VSX registers in 64-bit mode
> -(define_insn "p8_mtvsrd_1"
> - [(set (match_operand:TF 0 "register_operand" "=ws")
> - (unspec:TF [(match_operand:DI 1 "register_operand" "r")]
> +(define_insn "p8_mtvsrd"
> + [(set (match_operand:DF 0 "register_operand" "=ws")
> + (unspec:DF [(match_operand:DI 1 "register_operand" "r")]
> UNSPEC_P8V_MTVSRD))]
> "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
> - "mtvsrd %0,%1"
> - [(set_attr "type" "mftgpr")])
> -
> -(define_insn "p8_mtvsrd_2"
> - [(set (match_operand:TF 0 "register_operand" "+ws")
> - (unspec:TF [(match_dup 0)
> - (match_operand:DI 1 "register_operand" "r")]
> - UNSPEC_P8V_MTVSRD))]
> - "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
> - "mtvsrd %L0,%1"
> + "mtvsrd %x0,%1"
> [(set_attr "type" "mftgpr")])
>
> (define_insn "p8_xxpermdi_<mode>"
> [(set (match_operand:FMOVE128_GPR 0 "register_operand" "=wa")
> - (unspec:FMOVE128_GPR [(match_operand:TF 1 "register_operand" "ws")]
> - UNSPEC_P8V_XXPERMDI))]
> + (unspec:FMOVE128_GPR [
> + (match_operand:DF 1 "register_operand" "ws")
> + (match_operand:DF 2 "register_operand" "ws")]
> + UNSPEC_P8V_XXPERMDI))]
> "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
> - "xxpermdi %x0,%1,%L1,0"
> + "xxpermdi %x0,%x1,%x2,0"
> [(set_attr "type" "vecperm")])
>
> (define_insn_and_split "reload_vsx_from_gpr<mode>"
> @@ -7581,13 +7574,18 @@
> {
> rtx dest = operands[0];
> rtx src = operands[1];
> - rtx tmp = operands[2];
> + /* You might think that we could use op0 as one temp and a DF clobber
> + as the other, but you'd be wrong. These secondary_reload patterns
> + don't check that the clobber is different to the destination, which
> + is probably a reload bug. */
> + rtx tmp_hi = gen_rtx_REG (DFmode, REGNO (operands[2]));
> + rtx tmp_lo = gen_rtx_REG (DFmode, REGNO (operands[2]) + 1);
> rtx gpr_hi_reg = gen_highpart (DImode, src);
> rtx gpr_lo_reg = gen_lowpart (DImode, src);
>
> - emit_insn (gen_p8_mtvsrd_1 (tmp, gpr_hi_reg));
> - emit_insn (gen_p8_mtvsrd_2 (tmp, gpr_lo_reg));
> - emit_insn (gen_p8_xxpermdi_<mode> (dest, tmp));
> + emit_insn (gen_p8_mtvsrd (tmp_hi, gpr_hi_reg));
> + emit_insn (gen_p8_mtvsrd (tmp_lo, gpr_lo_reg));
> + emit_insn (gen_p8_xxpermdi_<mode> (dest, tmp_hi, tmp_lo));
> DONE;
> }
> [(set_attr "length" "12")
> @@ -7622,16 +7620,18 @@
> rtx op0 = operands[0];
> rtx op1 = operands[1];
> rtx op2 = operands[2];
> +
> /* Also use the destination register to hold the unconverted DImode value.
> This is conceptually a separate value from OP0, so we use gen_rtx_REG
> rather than simplify_gen_subreg. */
> - rtx op0_di = gen_rtx_REG (DImode, REGNO (op0));
> + rtx op0_df = gen_rtx_REG (DFmode, REGNO (op0));
> + rtx op0_v4sf = gen_rtx_REG (V4SFmode, REGNO (op0));
> rtx op1_di = simplify_gen_subreg (DImode, op1, SFmode, 0);
>
> /* Move SF value to upper 32-bits for xscvspdpn. */
> emit_insn (gen_ashldi3 (op2, op1_di, GEN_INT (32)));
> - emit_move_insn (op0_di, op2);
> - emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0_di));
> + emit_insn (gen_p8_mtvsrd (op0_df, op2));
> + emit_insn (gen_vsx_xscvspdpn (op0_df, op0_v4sf));
> DONE;
> }
> [(set_attr "length" "8")
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 997ff31..2d2f137 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -1518,15 +1518,6 @@
> "xscvdpspn %x0,%x1"
> [(set_attr "type" "fp")])
>
> -;; Used by direct move to move a SFmode value from GPR to VSX register
> -(define_insn "vsx_xscvspdpn_directmove"
> - [(set (match_operand:SF 0 "vsx_register_operand" "=wa")
> - (unspec:SF [(match_operand:DI 1 "vsx_register_operand" "wa")]
> - UNSPEC_VSX_CVSPDPN))]
> - "TARGET_XSCVSPDPN"
> - "xscvspdpn %x0,%x1"
> - [(set_attr "type" "fp")])
> -
> ;; Convert and scale (used by vec_ctf, vec_cts, vec_ctu for double/long long)
>
> (define_expand "vsx_xvcvsxddp_scale"
>
> --
> Alan Modra
> Australia Development Lab, IBM