[RS6000] reload_vsx_from_gprsf splitter
Alan Modra
amodra@gmail.com
Mon Feb 15 12:37:00 GMT 2016
On Fri, Feb 12, 2016 at 02:57:22PM +0100, Ulrich Weigand wrote:
> > On Fri, Feb 12, 2016 at 08:54:19AM +1030, Alan Modra wrote:
> > > Another concern I had about this, besides using %L in asm output (what
> > > forces TFmode to use just fprs?), is what happens when we're using
> > > IEEE 128-bit floats? In that case it looks like we'd get just one reg.
> >
> > Good point that it breaks if the default long double (TFmode) type is IEEE
> > 128-bit floating point. We would need to have two patterns, one that uses
> > TFmode and one that uses IFmode. I wrote the power8 direct move stuff before
> > going down the road of IEEE 128-bit floating point.
>
> Right. It's a bit unfortunate that we can't just use IFmode unconditionally,
> but it seems rs6000_scalar_mode_supported_p (IFmode) may return false, and
> then we probably shouldn't be using it.
Actually, we can use IFmode unconditionally. scalar_mode_supported_p
is relevant only up to and including expand. Nothing prevents the
backend from using IFmode.
> Another option might be to use TDmode to allocate a scratch register pair.
That won't work, at least if we want to extract the two component regs
with simplify_gen_subreg, due to rs6000_cannot_change_mode_class. In
my original patch I just extracted the regs by using gen_rtx_REG but I
changed that, based on your criticism of using gen_rtx_REG in
reload_vsx_from_gprsf, and because rs6000.md avoids gen_rtx_REG using
operand regnos in other places. That particular change is of course
entirely cosmetic. I also changed reload_vsx_from_gprsf to avoid mode
punning regs, instead duplicating insn patterns as done elsewhere in
the vsx support. I don't believe we will see subregs of vsx or fp
regs after reload, but I'm quite willing to concede the point for a
stage4 fix.
Here's the revised patch. To recap, the main bug fixes here are:
- stop reload_vsx_from_gprsf splitter from emitting a move not
handled by movdi_internal64
- don't use TFmode, which cannot now be assumed to be IBM
double-double.
Secondary to that, not using or passing around TFmode means the %L
restriction no longer matters, and constraints on the reload temp reg
can be relaxed.
Bootstrapped and regression tested powerpc64-linux biarch and
powerpc64le-linux. OK David?
PR target/68973
* config/rs6000/rs6000.md (reload_vsx_from_gprsf): Use p8_mtvsrd_sf
rather than attempting to use movdi_internal64. Remove op0_di.
(p8_mtvsrd_df, p8_mtvsrd_sf): New.
(p8_mtvsrd_1, p8_mtvsrd_2): Delete.
(p8_mtvsrwz): New.
(p8_mtvsrwz_1, p8_mtvsrwz_2): Delete.
(p8_xxpermdi_<mode>): Take two DF inputs rather than one TF.
(p8_fmrgow_<mode>): Likewise.
(reload_vsx_from_gpr<mode>): Make clobber IF. Adjust for above
changes.
(reload_fpr_from_gpr<mode>): Similarly. Use "d" for op0 constraint.
* config/rs6000/vsx.md (vsx_xscvspdpn_directmove): Make op1 SFmode.
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index cdbf873..ec356cb 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7488,41 +7488,31 @@
;; value, since it is allocated in reload and not all of the flow information
;; is setup for it. We have two patterns to do the two moves between gprs and
;; fprs. There isn't a dependancy between the two, but we could potentially
-;; schedule other instructions between the two instructions. TFmode is
-;; currently limited to traditional FPR registers. If/when this is changed, we
-;; will need to revist %L to make sure it works with VSX registers, or add an
-;; %x version of %L.
+;; schedule other instructions between the two instructions.
(define_insn "p8_fmrgow_<mode>"
[(set (match_operand:FMOVE64X 0 "register_operand" "=d")
- (unspec:FMOVE64X [(match_operand:TF 1 "register_operand" "d")]
+ (unspec:FMOVE64X [
+ (match_operand:DF 1 "register_operand" "d")
+ (match_operand:DF 2 "register_operand" "d")]
UNSPEC_P8V_FMRGOW))]
"!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
- "fmrgow %0,%1,%L1"
+ "fmrgow %0,%1,%2"
[(set_attr "type" "vecperm")])
-(define_insn "p8_mtvsrwz_1"
- [(set (match_operand:TF 0 "register_operand" "=d")
- (unspec:TF [(match_operand:SI 1 "register_operand" "r")]
+(define_insn "p8_mtvsrwz"
+ [(set (match_operand:DF 0 "register_operand" "=d")
+ (unspec:DF [(match_operand:SI 1 "register_operand" "r")]
UNSPEC_P8V_MTVSRWZ))]
"!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
"mtvsrwz %x0,%1"
[(set_attr "type" "mftgpr")])
-(define_insn "p8_mtvsrwz_2"
- [(set (match_operand:TF 0 "register_operand" "+d")
- (unspec:TF [(match_dup 0)
- (match_operand:SI 1 "register_operand" "r")]
- UNSPEC_P8V_MTVSRWZ))]
- "!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
- "mtvsrwz %L0,%1"
- [(set_attr "type" "mftgpr")])
-
(define_insn_and_split "reload_fpr_from_gpr<mode>"
- [(set (match_operand:FMOVE64X 0 "register_operand" "=ws")
+ [(set (match_operand:FMOVE64X 0 "register_operand" "=d")
(unspec:FMOVE64X [(match_operand:FMOVE64X 1 "register_operand" "r")]
UNSPEC_P8V_RELOAD_FROM_GPR))
- (clobber (match_operand:TF 2 "register_operand" "=d"))]
+ (clobber (match_operand:IF 2 "register_operand" "=d"))]
"!TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
"#"
"&& reload_completed"
@@ -7530,42 +7520,36 @@
{
rtx dest = operands[0];
rtx src = operands[1];
- rtx tmp = operands[2];
+ rtx tmp_hi = simplify_gen_subreg (DFmode, operands[2], IFmode, 0);
+ rtx tmp_lo = simplify_gen_subreg (DFmode, operands[2], IFmode, 8);
rtx gpr_hi_reg = gen_highpart (SImode, src);
rtx gpr_lo_reg = gen_lowpart (SImode, src);
- emit_insn (gen_p8_mtvsrwz_1 (tmp, gpr_hi_reg));
- emit_insn (gen_p8_mtvsrwz_2 (tmp, gpr_lo_reg));
- emit_insn (gen_p8_fmrgow_<mode> (dest, tmp));
+ emit_insn (gen_p8_mtvsrwz (tmp_hi, gpr_hi_reg));
+ emit_insn (gen_p8_mtvsrwz (tmp_lo, gpr_lo_reg));
+ emit_insn (gen_p8_fmrgow_<mode> (dest, tmp_hi, tmp_lo));
DONE;
}
[(set_attr "length" "12")
(set_attr "type" "three")])
;; Move 128 bit values from GPRs to VSX registers in 64-bit mode
-(define_insn "p8_mtvsrd_1"
- [(set (match_operand:TF 0 "register_operand" "=ws")
- (unspec:TF [(match_operand:DI 1 "register_operand" "r")]
- UNSPEC_P8V_MTVSRD))]
- "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
- "mtvsrd %0,%1"
- [(set_attr "type" "mftgpr")])
-
-(define_insn "p8_mtvsrd_2"
- [(set (match_operand:TF 0 "register_operand" "+ws")
- (unspec:TF [(match_dup 0)
- (match_operand:DI 1 "register_operand" "r")]
+(define_insn "p8_mtvsrd_df"
+ [(set (match_operand:DF 0 "register_operand" "=wa")
+ (unspec:DF [(match_operand:DI 1 "register_operand" "r")]
UNSPEC_P8V_MTVSRD))]
"TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
- "mtvsrd %L0,%1"
+ "mtvsrd %x0,%1"
[(set_attr "type" "mftgpr")])
(define_insn "p8_xxpermdi_<mode>"
[(set (match_operand:FMOVE128_GPR 0 "register_operand" "=wa")
- (unspec:FMOVE128_GPR [(match_operand:TF 1 "register_operand" "ws")]
- UNSPEC_P8V_XXPERMDI))]
+ (unspec:FMOVE128_GPR [
+ (match_operand:DF 1 "register_operand" "wa")
+ (match_operand:DF 2 "register_operand" "wa")]
+ UNSPEC_P8V_XXPERMDI))]
"TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
- "xxpermdi %x0,%1,%L1,0"
+ "xxpermdi %x0,%x1,%x2,0"
[(set_attr "type" "vecperm")])
(define_insn_and_split "reload_vsx_from_gpr<mode>"
@@ -7573,7 +7557,7 @@
(unspec:FMOVE128_GPR
[(match_operand:FMOVE128_GPR 1 "register_operand" "r")]
UNSPEC_P8V_RELOAD_FROM_GPR))
- (clobber (match_operand:TF 2 "register_operand" "=ws"))]
+ (clobber (match_operand:IF 2 "register_operand" "=wa"))]
"TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
"#"
"&& reload_completed"
@@ -7581,13 +7565,17 @@
{
rtx dest = operands[0];
rtx src = operands[1];
- rtx tmp = operands[2];
+ /* You might think that we could use op0 as one temp and a DF clobber
+ as op2, but you'd be wrong. Secondary reload move patterns don't
+ check for overlap of the clobber and the destination. */
+ rtx tmp_hi = simplify_gen_subreg (DFmode, operands[2], IFmode, 0);
+ rtx tmp_lo = simplify_gen_subreg (DFmode, operands[2], IFmode, 8);
rtx gpr_hi_reg = gen_highpart (DImode, src);
rtx gpr_lo_reg = gen_lowpart (DImode, src);
- emit_insn (gen_p8_mtvsrd_1 (tmp, gpr_hi_reg));
- emit_insn (gen_p8_mtvsrd_2 (tmp, gpr_lo_reg));
- emit_insn (gen_p8_xxpermdi_<mode> (dest, tmp));
+ emit_insn (gen_p8_mtvsrd_df (tmp_hi, gpr_hi_reg));
+ emit_insn (gen_p8_mtvsrd_df (tmp_lo, gpr_lo_reg));
+ emit_insn (gen_p8_xxpermdi_<mode> (dest, tmp_hi, tmp_lo));
DONE;
}
[(set_attr "length" "12")
@@ -7608,6 +7596,13 @@
;; Move SFmode to a VSX from a GPR register. Because scalar floating point
;; type is stored internally as double precision in the VSX registers, we have
;; to convert it from the vector format.
+(define_insn "p8_mtvsrd_sf"
+ [(set (match_operand:SF 0 "register_operand" "=wa")
+ (unspec:SF [(match_operand:DI 1 "register_operand" "r")]
+ UNSPEC_P8V_MTVSRD))]
+ "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+ "mtvsrd %x0,%1"
+ [(set_attr "type" "mftgpr")])
(define_insn_and_split "reload_vsx_from_gprsf"
[(set (match_operand:SF 0 "register_operand" "=wa")
@@ -7622,16 +7617,12 @@
rtx op0 = operands[0];
rtx op1 = operands[1];
rtx op2 = operands[2];
- /* Also use the destination register to hold the unconverted DImode value.
- This is conceptually a separate value from OP0, so we use gen_rtx_REG
- rather than simplify_gen_subreg. */
- rtx op0_di = gen_rtx_REG (DImode, REGNO (op0));
rtx op1_di = simplify_gen_subreg (DImode, op1, SFmode, 0);
/* Move SF value to upper 32-bits for xscvspdpn. */
emit_insn (gen_ashldi3 (op2, op1_di, GEN_INT (32)));
- emit_move_insn (op0_di, op2);
- emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0_di));
+ emit_insn (gen_p8_mtvsrd_sf (op0, op2));
+ emit_insn (gen_vsx_xscvspdpn_directmove (op0, op0));
DONE;
}
[(set_attr "length" "8")
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 997ff31..45af233 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -1521,7 +1521,7 @@
;; Used by direct move to move a SFmode value from GPR to VSX register
(define_insn "vsx_xscvspdpn_directmove"
[(set (match_operand:SF 0 "vsx_register_operand" "=wa")
- (unspec:SF [(match_operand:DI 1 "vsx_register_operand" "wa")]
+ (unspec:SF [(match_operand:SF 1 "vsx_register_operand" "wa")]
UNSPEC_VSX_CVSPDPN))]
"TARGET_XSCVSPDPN"
"xscvspdpn %x0,%x1"
--
Alan Modra
Australia Development Lab, IBM
More information about the Gcc-patches
mailing list