This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH, V3, #4 of 10], Add general prefixed/pcrel support
- From: Michael Meissner <meissner at linux dot ibm dot com>
- To: Michael Meissner <meissner at linux dot ibm dot com>, gcc-patches at gcc dot gnu dot org, segher at kernel dot crashing dot org, dje dot gcc at gmail dot com
- Date: Mon, 26 Aug 2019 16:43:37 -0400
- Subject: [PATCH, V3, #4 of 10], Add general prefixed/pcrel support
- References: <20190826173320.GA7958@ibm-toto.the-meissners.org>
This patch (V3 patch #4) is a rework of the V1 patches #3 and #4. It
adds support for generating prefixed (and local pc-relative)
instructions for all modes except SDmode. SDmode can't be used with a
prefixed offset instruction, because the default method for loading an
SDmode value is the LFIWZX instruction, which only has an indexed
form.
For the stack_protect_setdi and stack_protect_testdi insns, I reworked
them so that the expander copies a prefixed memory address to a
register and uses the indexed instruction format. I added new
predicates to make sure nothing recombines the insn into a prefixed
insn.
I changed the logic previously using insn_form to now use trad_insn.
I think I misspoke in the previous patch: the logic for pc-relative
vector extract is in this patch, not in the previous one.
I have built a bootstrap compiler on a little endian power8 system, and
there were no regressions when I ran make check. Once the previous
patches are checked in, can I check in this patch?
2019-08-26 Michael Meissner <meissner@linux.ibm.com>
* config/rs6000/predicates.md (add_operand): Add support for the
PADDI instruction.
(non_add_cint_operand): Add support for the PADDI instruction.
(lwa_operand): Add support for the PLWA instruction.
(non_prefixed_mem_operand): New predicate.
* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
declaration.
* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
the PADDI instruction.
(rs6000_adjust_vec_address): Add support for optimizing prefixed
and pc-relative extracts with constant extraction elements. Add a
failure when we use pc-relative addressing and non-constant
extraction elements. Use SIGNED_16BIT_OFFSET_P.
(quad_address_p): Add support for prefixed memory instructions.
(mem_operand_gpr): Add support for prefixed memory instructions.
Use SIGNED_16BIT_OFFSET_EXTRA_P.
(mem_operand_ds_form): Add support for prefixed memory
instructions. Use SIGNED_16BIT_OFFSET_EXTRA_P.
(rs6000_legitimate_offset_address_p): Add support for prefixed
memory instructions.
(rs6000_legitimate_address_p): Add support for prefixed memory
instructions.
(rs6000_mode_dependent_address): Add support for prefixed memory
instructions.
(make_memory_non_prefixed): New function.
(prefixed_paddi_p): Fix thinkos in last patch.
(rs6000_rtx_costs): Add support for the PADDI instruction.
(rs6000_num_insns): Don't treat prefixed instructions as being
slower because they have a larger length.
(rs6000_insn_cost): Call rs6000_num_insns.
* config/rs6000/rs6000.md (add<mode>3): Add support for the PADDI
instruction.
(movsi_low): Add support for the PADDI instruction.
(movsi const int splitter): Add support for the PADDI
instruction.
(mov<mode>_64bit_dm): Add support for prefixed memory
instructions. Split alternatives that had merged loading a
constant with register moves.
(movtd_64bit_nodm): Add support for prefixed memory instructions.
(movdi_internal64): Add support for prefixed memory instructions.
(movdi const int splitter): Add comment.
(mov<mode>_ppc64): Add support for prefixed memory instructions.
(stack_protect_setdi): Do not allow prefixed instructions.
(stack_protect_testdi): Do not allow prefixed instructions.
* config/rs6000/vsx.md (vsx_mov<mode>_64bit): Add support for
prefixed memory instructions.
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md (revision 274870)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
(define_predicate "add_operand"
(if_then_else (match_code "const_int")
(match_test "satisfies_constraint_I (op)
- || satisfies_constraint_L (op)")
+ || satisfies_constraint_L (op)
+ || satisfies_constraint_eI (op)")
(match_operand 0 "gpc_reg_operand")))
;; Return 1 if the operand is either a non-special register, or 0, or -1.
@@ -852,7 +853,8 @@ (define_predicate "adde_operand"
(define_predicate "non_add_cint_operand"
(and (match_code "const_int")
(match_test "!satisfies_constraint_I (op)
- && !satisfies_constraint_L (op)")))
+ && !satisfies_constraint_L (op)
+ && !satisfies_constraint_eI (op)")))
;; Return 1 if the operand is a constant that can be used as the operand
;; of an AND, OR or XOR.
@@ -933,6 +935,13 @@ (define_predicate "lwa_operand"
return false;
addr = XEXP (inner, 0);
+
+ /* The LWA instruction uses the DS-form format where the bottom two bits of
+ the offset must be 0. The prefixed PLWA does not have this
+ restriction. */
+ if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+ return true;
+
if (GET_CODE (addr) == PRE_INC
|| GET_CODE (addr) == PRE_DEC
|| (GET_CODE (addr) == PRE_MODIFY
@@ -1686,6 +1695,17 @@ (define_predicate "pcrel_ext_address"
return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
})
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_mem_operand"
+ (match_code "mem")
+{
+ if (!memory_operand (op, mode))
+ return false;
+
+ return !prefixed_local_addr_p (XEXP (op, 0), GET_MODE (op),
+ TRAD_INSN_DEFAULT);
+})
+
;; Match the first insn (addis) in fusing the combination of addis and loads to
;; GPR registers on power8.
(define_predicate "fusion_gpr_addis"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h (revision 274872)
+++ gcc/config/rs6000/rs6000-protos.h (working copy)
@@ -170,6 +170,7 @@ typedef enum {
} trad_insn_type;
extern bool prefixed_local_addr_p (rtx, machine_mode, trad_insn_type);
+extern rtx make_memory_non_prefixed (rtx);
extern bool prefixed_load_p (rtx_insn *);
extern bool prefixed_store_p (rtx_insn *);
extern bool prefixed_paddi_p (rtx_insn *);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 274872)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -5727,7 +5727,7 @@ static int
num_insns_constant_gpr (HOST_WIDE_INT value)
{
/* signed constant loadable with addi */
- if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+ if (SIGNED_16BIT_OFFSET_P (value))
return 1;
/* constant loadable with addis */
@@ -5735,6 +5735,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
&& (value >> 31 == -1 || value >> 31 == 0))
return 1;
+ /* PADDI can support up to 34 bit signed integers. */
+ else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+ return 1;
+
else if (TARGET_POWERPC64)
{
HOST_WIDE_INT low = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -6905,6 +6909,7 @@ rs6000_adjust_vec_address (rtx scalar_re
rtx element_offset;
rtx new_addr;
bool valid_addr_p;
+ bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);
/* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY. */
gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6942,6 +6947,41 @@ rs6000_adjust_vec_address (rtx scalar_re
else if (REG_P (addr) || SUBREG_P (addr))
new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
+
+ /* Optimize pc-relative addresses. */
+ else if (pcrel_p)
+ {
+ if (CONST_INT_P (element_offset))
+ {
+ rtx addr2 = addr;
+ HOST_WIDE_INT offset = INTVAL (element_offset);
+
+ if (GET_CODE (addr2) == CONST)
+ addr2 = XEXP (addr2, 0);
+
+ if (GET_CODE (addr2) == PLUS)
+ {
+ offset += INTVAL (XEXP (addr2, 1));
+ addr2 = XEXP (addr2, 0);
+ }
+
+ gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+ if (offset)
+ {
+ addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+ new_addr = gen_rtx_CONST (Pmode, addr2);
+ }
+ else
+ new_addr = addr2;
+ }
+
+ /* Right now, the pc-relative support needs to be re-thought if you have
+ a pc-relative address and a variable extract, due to having only one
+ base register tmp to use. Fail until this is rewritten. */
+ else
+ gcc_unreachable ();
+ }
+
/* Optimize D-FORM addresses with constant offset with a constant element, to
include the element offset in the address directly. */
else if (GET_CODE (addr) == PLUS)
@@ -6956,8 +6996,11 @@ rs6000_adjust_vec_address (rtx scalar_re
HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
rtx offset_rtx = GEN_INT (offset);
- if (IN_RANGE (offset, -32768, 32767)
- && (scalar_size < 8 || (offset & 0x3) == 0))
+ if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+ new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+ else if (SIGNED_16BIT_OFFSET_P (offset)
+ && (scalar_size < 8 || (offset & 0x3) == 0))
new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
else
{
@@ -7007,9 +7050,8 @@ rs6000_adjust_vec_address (rtx scalar_re
/* If we have a PLUS, we need to see whether the particular register class
allows for D-FORM or X-FORM addressing. */
- if (GET_CODE (new_addr) == PLUS)
+ if (GET_CODE (new_addr) == PLUS || pcrel_p)
{
- rtx op1 = XEXP (new_addr, 1);
addr_mask_type addr_mask;
unsigned int scalar_regno = reg_or_subregno (scalar_reg);
@@ -7026,7 +7068,10 @@ rs6000_adjust_vec_address (rtx scalar_re
else
gcc_unreachable ();
- if (REG_P (op1) || SUBREG_P (op1))
+ if (pcrel_p)
+ valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+ else if (REG_P (XEXP (new_addr, 1))
+ || SUBREG_P (XEXP (new_addr, 1)))
valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
else
valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
@@ -7454,6 +7499,13 @@ quad_address_p (rtx addr, machine_mode m
if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
return false;
+ /* Is this a valid prefixed address? If the bottom four bits of the offset
+ are non-zero, we could use a prefixed instruction (which does not have the
+ DQ-form constraint that the traditional instruction had) instead of
+ forcing the unaligned offset to a GPR. */
+ if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DQ))
+ return true;
+
if (GET_CODE (addr) != PLUS)
return false;
@@ -7555,6 +7607,13 @@ mem_operand_gpr (rtx op, machine_mode mo
&& legitimate_indirect_address_p (XEXP (addr, 0), false))
return true;
+ /* Allow prefixed instructions if supported. If the bottom two bits of the
+ offset are non-zero, we could use a prefixed instruction (which does not
+ have the DS-form constraint that the traditional instruction had) instead
+ of forcing the unaligned offset to a GPR. */
+ if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+ return true;
+
/* Don't allow non-offsettable addresses. See PRs 83969 and 84279. */
if (!rs6000_offsettable_memref_p (op, mode, false))
return false;
@@ -7576,7 +7635,7 @@ mem_operand_gpr (rtx op, machine_mode mo
causes a wrap, so test only the low 16 bits. */
offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
- return offset + 0x8000 < 0x10000u - extra;
+ return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
}
/* As above, but for DS-FORM VSX insns. Unlike mem_operand_gpr,
@@ -7589,6 +7648,13 @@ mem_operand_ds_form (rtx op, machine_mod
int extra;
rtx addr = XEXP (op, 0);
+ /* Allow prefixed instructions if supported. If the bottom two bits of the
+ offset are non-zero, we could use a prefixed instruction (which does not
+ have the DS-form constraint that the traditional instruction had) instead
+ of forcing the unaligned offset to a GPR. */
+ if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+ return true;
+
if (!offsettable_address_p (false, mode, addr))
return false;
@@ -7609,7 +7675,7 @@ mem_operand_ds_form (rtx op, machine_mod
causes a wrap, so test only the low 16 bits. */
offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
- return offset + 0x8000 < 0x10000u - extra;
+ return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
}
/* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p. */
@@ -7958,8 +8024,10 @@ rs6000_legitimate_offset_address_p (mach
break;
}
- offset += 0x8000;
- return offset < 0x10000 - extra;
+ if (TARGET_PREFIXED_ADDR)
+ return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+ else
+ return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
}
bool
@@ -8856,6 +8924,11 @@ rs6000_legitimate_address_p (machine_mod
&& mode_supports_pre_incdec_p (mode)
&& legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
return 1;
+
+ /* Handle prefixed addresses (pc-relative or 34-bit offset). */
+ if (prefixed_local_addr_p (x, mode, TRAD_INSN_DEFAULT))
+ return 1;
+
/* Handle restricted vector d-form offsets in ISA 3.0. */
if (quad_offset_p)
{
@@ -8914,7 +8987,10 @@ rs6000_legitimate_address_p (machine_mod
|| (!avoiding_indexed_address_p (mode)
&& legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
&& rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
- return 1;
+ {
+ /* There is no prefixed version of the load/store with update. */
+ return !prefixed_local_addr_p (XEXP (x, 1), mode, TRAD_INSN_DEFAULT);
+ }
if (reg_offset_p && !quad_offset_p
&& legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
return 1;
@@ -8976,8 +9052,12 @@ rs6000_mode_dependent_address (const_rtx
&& XEXP (addr, 0) != arg_pointer_rtx
&& CONST_INT_P (XEXP (addr, 1)))
{
- unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
- return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
+ HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
+ HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
+ if (TARGET_PREFIXED_ADDR)
+ return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+ else
+ return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
}
break;
@@ -13950,6 +14030,34 @@ prefixed_local_addr_p (rtx addr,
return false;
}
+
+/* Make a memory address non-prefixed if it is prefixed. */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+ gcc_assert (MEM_P (mem));
+ if (prefixed_local_addr_p (XEXP (mem, 0), GET_MODE (mem), TRAD_INSN_DEFAULT))
+ {
+ rtx old_addr = XEXP (mem, 0);
+ rtx new_addr;
+
+ if (GET_CODE (old_addr) == PLUS
+ && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+ && CONST_INT_P (XEXP (old_addr, 1)))
+ {
+ rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+ new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+ }
+ else
+ new_addr = force_reg (Pmode, old_addr);
+
+ mem = change_address (mem, VOIDmode, new_addr);
+ }
+
+ return mem;
+}
+
/* Whether a load instruction is a prefixed instruction. This is called from
the prefixed attribute processing. */
@@ -21060,7 +21168,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
|| outer_code == PLUS
|| outer_code == MINUS)
&& (satisfies_constraint_I (x)
- || satisfies_constraint_L (x)))
+ || satisfies_constraint_L (x)
+ || satisfies_constraint_eI (x)))
|| (outer_code == AND
&& (satisfies_constraint_K (x)
|| (mode == SImode
@@ -21440,6 +21549,42 @@ rs6000_debug_rtx_costs (rtx x, machine_m
return ret;
}
+/* How many real instructions are generated for this insn? This is slightly
+ different from the length attribute, in that the length attribute counts the
+ number of bytes. With prefixed instructions, we don't want to count a
+ prefixed instruction (length 12 bytes including possible NOP) as taking 3
+ instructions, but just one. */
+
+static int
+rs6000_num_insns (rtx_insn *insn)
+{
+ /* Try to figure it out based on the length and whether there are prefixed
+ instructions. While prefixed instructions are only 8 bytes, we have to
+ use 12 as the size of the first prefixed instruction in case the
+ instruction needs to be aligned. Back to back prefixed instructions would
+ only take 20 bytes, since it is guaranteed that one of the prefixed
+ instructions does not need the alignment. */
+ int length = get_attr_length (insn);
+
+ if (length >= 12 && TARGET_PREFIXED_ADDR
+ && get_attr_prefixed (insn) == PREFIXED_YES)
+ {
+ /* Single prefixed instruction. */
+ if (length == 12)
+ return 1;
+
+ /* A normal instruction and a prefixed instruction (16) or two back
+ to back prefixed instructions (20). */
+ if (length == 16 || length == 20)
+ return 2;
+
+ /* Guess for larger instruction sizes. */
+ return 2 + (length - 20) / 4;
+ }
+
+ return length / 4;
+}
+
static int
rs6000_insn_cost (rtx_insn *insn, bool speed)
{
@@ -21453,7 +21598,7 @@ rs6000_insn_cost (rtx_insn *insn, bool s
if (cost > 0)
return cost;
- int n = get_attr_length (insn) / 4;
+ int n = rs6000_num_insns (insn);
enum attr_type type = get_attr_type (insn);
switch (type)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 274872)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -1761,15 +1761,17 @@ (define_expand "add<mode>3"
})
(define_insn "*add<mode>3"
- [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
- (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
- (match_operand:GPR 2 "add_operand" "r,I,L")))]
+ [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+ (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+ (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
""
"@
add %0,%1,%2
addi %0,%1,%2
- addis %0,%1,%v2"
- [(set_attr "type" "add")])
+ addis %0,%1,%v2
+ addi %0,%1,%2"
+ [(set_attr "type" "add")
+ (set_attr "isa" "*,*,*,fut")])
(define_insn "*addsi3_high"
[(set (match_operand:SI 0 "gpc_reg_operand" "=b")
@@ -6909,22 +6911,22 @@ (define_insn "movsi_low"
;; MR LA LWZ LFIWZX LXSIWZX
;; STW STFIWX STXSIWX LI LIS
-;; # XXLOR XXSPLTIB 0 XXSPLTIB -1 VSPLTISW
-;; XXLXOR 0 XXLORC -1 P9 const MTVSRWZ MFVSRWZ
-;; MF%1 MT%0 NOP
+;; PLI # XXLOR XXSPLTIB 0 XXSPLTIB -1
+;; VSPLTISW XXLXOR 0 XXLORC -1 P9 const MTVSRWZ
+;; MFVSRWZ MF%1 MT%0 NOP
(define_insn "*movsi_internal1"
[(set (match_operand:SI 0 "nonimmediate_operand"
"=r, r, r, d, v,
m, Z, Z, r, r,
- r, wa, wa, wa, v,
- wa, v, v, wa, r,
- r, *h, *h")
+ r, r, wa, wa, wa,
+ v, wa, v, v, wa,
+ r, r, *h, *h")
(match_operand:SI 1 "input_operand"
"r, U, m, Z, Z,
r, d, v, I, L,
- n, wa, O, wM, wB,
- O, wM, wS, r, wa,
- *h, r, 0"))]
+ eI, n, wa, O, wM,
+ wB, O, wM, wS, r,
+ wa, *h, r, 0"))]
"gpc_reg_operand (operands[0], SImode)
|| gpc_reg_operand (operands[1], SImode)"
"@
@@ -6938,6 +6940,7 @@ (define_insn "*movsi_internal1"
stxsiwx %x1,%y0
li %0,%1
lis %0,%v1
+ li %0,%1
#
xxlor %x0,%x1,%x1
xxspltib %x0,0
@@ -6954,21 +6957,21 @@ (define_insn "*movsi_internal1"
[(set_attr "type"
"*, *, load, fpload, fpload,
store, fpstore, fpstore, *, *,
- *, veclogical, vecsimple, vecsimple, vecsimple,
- veclogical, veclogical, vecsimple, mffgpr, mftgpr,
- *, *, *")
+ *, *, veclogical, vecsimple, vecsimple,
+ vecsimple, veclogical, veclogical, vecsimple, mffgpr,
+ mftgpr, *, *, *")
(set_attr "length"
"*, *, *, *, *,
*, *, *, *, *,
- 8, *, *, *, *,
- *, *, 8, *, *,
- *, *, *")
+ *, 8, *, *, *,
+ *, *, *, 8, *,
+ *, *, *, *")
(set_attr "isa"
"*, *, *, p8v, p8v,
*, p8v, p8v, *, *,
- *, p8v, p9v, p9v, p8v,
- p9v, p8v, p9v, p8v, p8v,
- *, *, *")])
+ fut, *, p8v, p9v, p9v,
+ p8v, p9v, p8v, p9v, p8v,
+ p8v, *, *, *")])
;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7113,14 +7116,15 @@ (define_insn "*movsi_from_df"
"xscvdpsp %x0,%x1"
[(set_attr "type" "fp")])
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence. On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
(define_split
[(set (match_operand:SI 0 "gpc_reg_operand")
(match_operand:SI 1 "const_int_operand"))]
"(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
- && (INTVAL (operands[1]) & 0xffff) != 0"
+ && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
[(set (match_dup 0)
(match_dup 2))
(set (match_dup 0)
@@ -7759,9 +7763,18 @@ (define_expand "mov<mode>"
;; not swapped like they are for TImode or TFmode. Subregs therefore are
;; problematical. Don't allow direct move for this case.
+;; FPR load FPR store FPR move FPR zero GPR load
+;; GPR store GPR move GPR zero MFVSRD MTVSRD
+
(define_insn_and_split "*mov<mode>_64bit_dm"
- [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r,r,d")
- (match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,<zero_fp>,r,<zero_fp>Y,r,d,r"))]
+ [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand"
+ "=m, d, d, d, Y,
+ r, r, r, r, d")
+
+ (match_operand:FMOVE128_FPR 1 "input_operand"
+ "d, m, d, <zero_fp>, r,
+ <zero_fp>, Y, r, d, r"))]
+
"TARGET_HARD_FLOAT && TARGET_POWERPC64 && FLOAT128_2REG_P (<MODE>mode)
&& (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
&& (gpc_reg_operand (operands[0], <MODE>mode)
@@ -7769,9 +7782,13 @@ (define_insn_and_split "*mov<mode>_64bit
"#"
"&& reload_completed"
[(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
- [(set_attr "length" "8,8,8,8,12,12,8,8,8")
- (set_attr "isa" "*,*,*,*,*,*,*,p8v,p8v")])
+{
+ rs6000_split_multireg_move (operands[0], operands[1]);
+ DONE;
+}
+ [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+ (set_attr "non_prefixed_length" "8")
+ (set_attr "prefixed_length" "20")])
(define_insn_and_split "*movtd_64bit_nodm"
[(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7782,8 +7799,12 @@ (define_insn_and_split "*movtd_64bit_nod
"#"
"&& reload_completed"
[(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
- [(set_attr "length" "8,8,8,12,12,8")])
+{
+ rs6000_split_multireg_move (operands[0], operands[1]);
+ DONE;
+}
+ [(set_attr "non_prefixed_length" "8")
+ (set_attr "prefixed_length" "20")])
(define_insn_and_split "*mov<mode>_32bit"
[(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8793,24 +8814,24 @@ (define_split
[(pc)]
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
-;; GPR store GPR load GPR move GPR li GPR lis GPR #
-;; FPR store FPR load FPR move AVX store AVX store AVX load
-;; AVX load VSX move P9 0 P9 -1 AVX 0/-1 VSX 0
-;; VSX -1 P9 const AVX const From SPR To SPR SPR<->SPR
-;; VSX->GPR GPR->VSX
+;; GPR store GPR load GPR move GPR li GPR lis GPR pli
+;; GPR # FPR store FPR load FPR move AVX store AVX store
+;; AVX load AVX load VSX move P9 0 P9 -1 AVX 0/-1
+;; VSX 0 VSX -1 P9 const AVX const From SPR To SPR
+;; SPR<->SPR VSX->GPR GPR->VSX
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand"
"=YZ, r, r, r, r, r,
- m, ^d, ^d, wY, Z, $v,
- $v, ^wa, wa, wa, v, wa,
- wa, v, v, r, *h, *h,
- ?r, ?wa")
+ r, m, ^d, ^d, wY, Z,
+ $v, $v, ^wa, wa, wa, v,
+ wa, wa, v, v, r, *h,
+ *h, ?r, ?wa")
(match_operand:DI 1 "input_operand"
- "r, YZ, r, I, L, nF,
- ^d, m, ^d, ^v, $v, wY,
- Z, ^wa, Oj, wM, OjwM, Oj,
- wM, wS, wB, *h, r, 0,
- wa, r"))]
+ "r, YZ, r, I, L, eI,
+ nF, ^d, m, ^d, ^v, $v,
+ wY, Z, ^wa, Oj, wM, OjwM,
+ Oj, wM, wS, wB, *h, r,
+ 0, wa, r"))]
"TARGET_POWERPC64
&& (gpc_reg_operand (operands[0], DImode)
|| gpc_reg_operand (operands[1], DImode))"
@@ -8820,6 +8841,7 @@ (define_insn "*movdi_internal64"
mr %0,%1
li %0,%1
lis %0,%v1
+ li %0,%1
#
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
@@ -8843,26 +8865,28 @@ (define_insn "*movdi_internal64"
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *, *, *, *,
- fpstore, fpload, fpsimple, fpstore, fpstore, fpload,
- fpload, veclogical, vecsimple, vecsimple, vecsimple, veclogical,
- veclogical, vecsimple, vecsimple, mfjmpr, mtjmpr, *,
- mftgpr, mffgpr")
+ *, fpstore, fpload, fpsimple, fpstore, fpstore,
+ fpload, fpload, veclogical,vecsimple, vecsimple, vecsimple,
+ veclogical, veclogical, vecsimple, vecsimple, mfjmpr, mtjmpr,
+ *, mftgpr, mffgpr")
(set_attr "size" "64")
(set_attr "length"
- "*, *, *, *, *, 20,
- *, *, *, *, *, *,
+ "*, *, *, *, *, *,
+ 20, *, *, *, *, *,
*, *, *, *, *, *,
- *, 8, *, *, *, *,
- *, *")
+ *, *, 8, *, *, *,
+ *, *, *")
(set_attr "isa"
- "*, *, *, *, *, *,
- *, *, *, p9v, p7v, p9v,
- p7v, *, p9v, p9v, p7v, *,
- *, p7v, p7v, *, *, *,
- p8v, p8v")])
+ "*, *, *, *, *, fut,
+ *, *, *, *, p9v, p7v,
+ p9v, p7v, *, p9v, p9v, p7v,
+ *, *, p7v, p7v, *, *,
+ *, p8v, p8v")])
; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction. On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so this splitter is not used for constants
+; that can be loaded with PLI.
(define_split
[(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
(match_operand:DI 1 "const_int_operand"))]
@@ -8980,7 +9004,8 @@ (define_insn "*mov<mode>_ppc64"
return rs6000_output_move_128bit (operands);
}
[(set_attr "type" "store,store,load,load,*,*")
- (set_attr "length" "8")])
+ (set_attr "non_prefixed_length" "8,8,8,8,8,40")
+ (set_attr "prefixed_length" "20,20,20,20,8,40")])
(define_split
[(set (match_operand:TI2 0 "int_reg_operand")
@@ -11497,9 +11522,25 @@ (define_insn "stack_protect_setsi"
[(set_attr "type" "three")
(set_attr "length" "12")])
-(define_insn "stack_protect_setdi"
- [(set (match_operand:DI 0 "memory_operand" "=Y")
- (unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+ [(parallel [(set (match_operand:DI 0 "memory_operand")
+ (unspec:DI [(match_operand:DI 1 "memory_operand")]
+ UNSPEC_SP_SET))
+ (set (match_scratch:DI 2)
+ (const_int 0))])]
+ "TARGET_64BIT"
+{
+ if (TARGET_PREFIXED_ADDR)
+ {
+ operands[0] = make_memory_non_prefixed (operands[0]);
+ operands[1] = make_memory_non_prefixed (operands[1]);
+ }
+})
+
+(define_insn "*stack_protect_setdi"
+ [(set (match_operand:DI 0 "non_prefixed_mem_operand" "=YZ")
+ (unspec:DI [(match_operand:DI 1 "non_prefixed_mem_operand" "YZ")]
+ UNSPEC_SP_SET))
(set (match_scratch:DI 2 "=&r") (const_int 0))]
"TARGET_64BIT"
"ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11543,10 +11584,27 @@ (define_insn "stack_protect_testsi"
lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
[(set_attr "length" "16,20")])
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+ [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+ (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+ (match_operand:DI 2 "memory_operand")]
+ UNSPEC_SP_TEST))
+ (set (match_scratch:DI 4)
+ (const_int 0))
+ (clobber (match_scratch:DI 3))])]
+ "TARGET_64BIT"
+{
+ if (TARGET_PREFIXED_ADDR)
+ {
+ operands[0] = make_memory_non_prefixed (operands[0]);
+ operands[1] = make_memory_non_prefixed (operands[1]);
+ }
+})
+
+(define_insn "*stack_protect_testdi"
[(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
- (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
- (match_operand:DI 2 "memory_operand" "Y,Y")]
+ (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_mem_operand" "YZ,YZ")
+ (match_operand:DI 2 "non_prefixed_mem_operand" "YZ,YZ")]
UNSPEC_SP_TEST))
(set (match_scratch:DI 4 "=r,r") (const_int 0))
(clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (revision 274864)
+++ gcc/config/rs6000/vsx.md (working copy)
@@ -1149,10 +1149,30 @@ (define_insn "vsx_mov<mode>_64bit"
"vecstore, vecload, vecsimple, mffgpr, mftgpr, load,
store, load, store, *, vecsimple, vecsimple,
vecsimple, *, *, vecstore, vecload")
- (set_attr "length"
- "*, *, *, 8, *, 8,
- 8, 8, 8, 8, *, *,
- *, 20, 8, *, *")
+ (set (attr "non_prefixed_length")
+ (cond [(and (eq_attr "alternative" "4") ;; MTVSRDD
+ (match_test "TARGET_P9_VECTOR"))
+ (const_string "4")
+
+ (eq_attr "alternative" "3,4") ;; GPR <-> VSX
+ (const_string "8")
+
+ (eq_attr "alternative" "5,6,7,8") ;; GPR load/store
+ (const_string "8")]
+ (const_string "*")))
+
+ (set (attr "prefixed_length")
+ (cond [(and (eq_attr "alternative" "4") ;; MTVSRDD
+ (match_test "TARGET_P9_VECTOR"))
+ (const_string "4")
+
+ (eq_attr "alternative" "3,4") ;; GPR <-> VSX
+ (const_string "8")
+
+ (eq_attr "alternative" "5,6,7,8") ;; GPR load/store
+ (const_string "20")]
+ (const_string "*")))
+
(set_attr "isa"
"<VSisa>, <VSisa>, <VSisa>, *, *, *,
*, *, *, *, p9v, *,
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797