[PATCH] PR target/70155: Use SSE for TImode load/store

Uros Bizjak ubizjak@gmail.com
Wed Apr 27 12:03:00 GMT 2016


On Tue, Apr 26, 2016 at 9:50 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> Here is the updated patch which does that.  Ok for trunk if there
>> is no regressions on x86-64?
>>
>
> CSE works with SSE constants now.  Here is the updated patch.
> OK for trunk if there are no regressions on x86-64?

+static bool
+timode_scalar_to_vector_candidate_p (rtx_insn *insn)
+{
+  rtx def_set = single_set (insn);
+
+  if (!def_set)
+    return false;
+
+  if (has_non_address_hard_reg (insn))
+    return false;
+
+  rtx src = SET_SRC (def_set);
+  rtx dst = SET_DEST (def_set);
+
+  /* Only TImode load and store are allowed.  */
+  if (GET_MODE (dst) != TImode)
+    return false;
+
+  if (MEM_P (dst))
+    {
+      /* Check for store.  Only support store from register or standard
+         SSE constants.  */
+      switch (GET_CODE (src))
+        {
+        default:
+          return false;
+
+        case REG:
+          /* For store from register, memory must be aligned or both
+             unaligned load and store are optimal.  */
+          return (!misaligned_operand (dst, TImode)
+                  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+                      && TARGET_SSE_UNALIGNED_STORE_OPTIMAL));

Why check TARGET_SSE_UNALIGNED_LOAD_OPTIMAL here? We are storing from a
register, so only TARGET_SSE_UNALIGNED_STORE_OPTIMAL is relevant.

+        case CONST_INT:
+          /* For store from standard SSE constant, memory must be
+             aligned or unaligned store is optimal.  */
+          return (standard_sse_constant_p (src, TImode)
+                  && (!misaligned_operand (dst, TImode)
+                      || TARGET_SSE_UNALIGNED_STORE_OPTIMAL));
+        }
+    }
+  else if (MEM_P (src))
+    {
+      /* Check for load.  Memory must be aligned or both unaligned
+         load and store are optimal.  */
+      return (GET_CODE (dst) == REG
+              && (!misaligned_operand (src, TImode)
+                  || (TARGET_SSE_UNALIGNED_LOAD_OPTIMAL
+                      && TARGET_SSE_UNALIGNED_STORE_OPTIMAL)));

Also here. We are loading a register, so there is no point in checking
TARGET_SSE_UNALIGNED_STORE_OPTIMAL.
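
To illustrate both alignment comments, here is a minimal stand-alone
sketch, not GCC code. The struct and the two function names are
hypothetical; the two fields stand in for the
TARGET_SSE_UNALIGNED_LOAD_OPTIMAL and TARGET_SSE_UNALIGNED_STORE_OPTIMAL
tuning flags, and the bool argument stands in for the result of
misaligned_operand. The idea is that each direction consults only the
flag for the memory access it actually performs:

```cpp
// Hypothetical model of the two tuning flags.  In GCC these are macros
// derived from the selected tuning, not fields of a struct.
struct tuning
{
  bool unaligned_load_optimal;
  bool unaligned_store_optimal;
};

// Store from a register or a standard SSE constant: memory is only
// written, so only the store flag matters when DST is misaligned.
constexpr bool
store_ok_p (bool dst_misaligned, const tuning &t)
{
  return !dst_misaligned || t.unaligned_store_optimal;
}

// Load into a register: memory is only read, so only the load flag
// matters when SRC is misaligned.
constexpr bool
load_ok_p (bool src_misaligned, const tuning &t)
{
  return !src_misaligned || t.unaligned_load_optimal;
}
```

With this shape, a target where only unaligned stores are fast still
accepts a misaligned store, which the combined load-and-store check in
the patch would reject.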

+    }
+
+  return false;
+}
+

+/* Convert INSN from TImode to V1TImode.  */
+
+void
+timode_scalar_chain::convert_insn (rtx_insn *insn)
+{
+  rtx def_set = single_set (insn);
+  rtx src = SET_SRC (def_set);
+  rtx tmp;
+  rtx dst = SET_DEST (def_set);

No need for tmp declaration above ...

+  switch (GET_CODE (dst))
+    {
+    case REG:
+      tmp = find_reg_equal_equiv_note (insn);

... if you declare it here ...

+      if (tmp)
+        PUT_MODE (XEXP (tmp, 0), V1TImode);

/* FALLTHRU */

+    case MEM:
+      PUT_MODE (dst, V1TImode);
+      break;

+    case CONST_INT:
+      switch (standard_sse_constant_p (src, TImode))
+        {
+        case 1:
+          src = CONST0_RTX (GET_MODE (dst));
+          tmp = gen_reg_rtx (V1TImode);
+          break;
+        case 2:
+          src = CONSTM1_RTX (GET_MODE (dst));
+          tmp = gen_reg_rtx (V1TImode);
+          break;
+        default:
+          gcc_unreachable ();
+        }
+      if (NONDEBUG_INSN_P (insn))
+        {

... and here. Please generate the temporary register here.

+          /* Since there are no instructions to store standard SSE
+             constant, temporary register usage is required.  */
+          emit_conversion_insns (gen_rtx_SET (dst, tmp), insn);
+          dst = tmp;
+        }


   /* This needs to be done at start up.  It's convenient to do it here.  */
   register_pass (&insert_vzeroupper_info);
-  register_pass (&stv_info);
+  register_pass (TARGET_64BIT ? &stv_info_64 : &stv_info_32);
 }

How about naming these stv_info_timode and stv_info_dimode?

Uros.


