This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR91522


On Fri, 23 Aug 2019, Richard Biener wrote:

> On Fri, 23 Aug 2019, Uros Bizjak wrote:
> 
> > On Thu, Aug 22, 2019 at 3:35 PM Richard Biener <rguenther@suse.de> wrote:
> > >
> > >
> > > This fixes quadraticness in STV and makes
> > >
> > >  machine dep reorg                  :  89.07 ( 95%)   0.02 ( 18%)  89.10 (
> > > 95%)      54 kB (  0%)
> > >
> > > drop to zero.  Anybody remembers why it is the way it is now?
> > >
> > > Bootstrap / regtest running on x86_64-unknown-linux-gnu.
> > >
> > > OK?
> > 
> > Looking at the PR, comment #3 [1], I assume this patch is obsoltete
> > and will be updated?
> > 
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91522#c3
> 
> Yes.  I'm still learning how STV operates (and learing DF and RTL...).
> The following is a rewrite of the non-TImode chain conversion
> according to I think how it should operate als allowing the hunk
> that fixes the compile-time and fixing PR91527 on the way
> (which I ran into during more extensive testing of the patch myself).
> 
> So compared to the state before which I still do not 100% understand
> we now do the following.  Chain detection works as before including
> recording of all defs (both defined by the insns in the chain and
> insns outside) that need copy-in or copy-out operations.
> 
> But then the patch changes things as to guarantee that
> after the conversion all uses/defs of a pseudo are
> of the (subreg:Vmode ..) form or of the original scalar form.
> In particular it avoids the need to change any insns that
> are not part of the chain (besides emitting the extra copy
> instructions).  For this all defs that were marked as needing
> copies (thus they have uses/defs both in the chain and outside)
> the chain will use a new pseudo that we copy to from scalar sources
> and that we copy from for scalar uses.  There's the new defs_map
> which records the mapping of old to new reg.  pseudos that are
> only used in the chain already are not remapped.
> 
> The conversion itself then happens in two stages, first,
> in make_vector_copies, we emit the copy-in insns and
> allocate all pseudos we need.  Then the rest of the
> conversion happens fully inside of convert_insn where
> we generate the copy-outs of the insns def, replace
> defs and uses according to the mapping and replace uses
> and defs with the (subreg:Vmode ..) style.
> 
> For PR91527 IRA doesn't like the REG_EQUIV note in
> 
> (insn 4 24 5 2 (set (subreg:V4SI (reg/v:SI 90 [ c ]) 0)
>         (subreg:V4SI (reg:SI 100) 0)) 
> "/space/rguenther/src/svn/trunk2/gcc/testsuite/g++.dg/tree-ssa/pr21463.C":11:4 
> 1248 {movv4si_internal}
>      (expr_list:REG_DEAD (reg:SI 100)
>         (expr_list:REG_EQUIV (mem/c:SI (plus:DI (reg/f:DI 16 argp)
>                     (const_int 16 [0x10])) [1 c+0 S4 A64])
>             (nil))))
> 
> because the SET_DEST is not a REG_P.  I'm not sure if this
> is invalid RTL, docs say SET_DEST might be a strict_low_part
> or a zero_extract but doesn't mention a subreg.  So I opted
> to simply remove equal/equiv notes on insns we convert
> and since the above has a REG_DEAD note I took the liberty
> to update that according to the mapping (so that would have
> been not needed before this patch) rather than dropping it.
> 
> Bootstrapped with and without --with-march=westmere (to get
> some STV coverage, this included all languages) on 
> x88_64-unknown-linux-gnu, testing in progress.
> 
> OK if testing succeeds?

Testing revealed I made an error in general_scalar_chain::convert_insn
failing to move down SET_SRC extraction below replacing with
the defs map.  This showed in 4 execute FAILs in 32bit fortran
testing (only).  Fixed by moving down the whole def_set/src/dst
extraction.

Re-testing on x86_64-unknown-linux-gnu.

Updated patch below.  I'm feeling adventurous and will run
the "westmere" bootstrap with costing disabled (aka always
convert detected chains...).

Richard.

2019-08-23  Richard Biener  <rguenther@suse.de>

	PR target/91522
	PR target/91527
	* config/i386/i386-features.h (general_scalar_chain::defs_map):
	New member.
	(general_scalar_chain::replace_with_subreg): Remove.
	(general_scalar_chain::replace_with_subreg_in_insn): Likewise.
	(general_scalar_chain::convert_reg): Adjust signature.
	* config/i386/i386-features.c (scalar_chain::add_insn): Do not
	iterate over all defs of a reg.
	(general_scalar_chain::replace_with_subreg): Remove.
	(general_scalar_chain::replace_with_subreg_in_insn): Likewise.
	(general_scalar_chain::make_vector_copies): Populate defs_map,
	place copy only after defs that are used as vectors in the chain.
	(general_scalar_chain::convert_reg): Emit a copy for a specific
	def in a specific instruction.
	(general_scalar_chain::convert_op): All reg uses are converted here.
	(general_scalar_chain::convert_insn): Emit copies for scalar
	uses of defs here.  Replace uses with the copies we created.
	Replace and convert the def.  Adjust REG_DEAD notes, remove
	REG_EQUIV/EQUAL notes.
	(general_scalar_chain::convert_registers): Only handle copies
	into the chain here.


Index: gcc/config/i386/i386-features.c
===================================================================
--- gcc/config/i386/i386-features.c	(revision 274843)
+++ gcc/config/i386/i386-features.c	(working copy)
@@ -416,13 +416,9 @@ scalar_chain::add_insn (bitmap candidate
      iterates over all refs to look for dual-mode regs.  Instead this
      should be done separately for all regs mentioned in the chain once.  */
   df_ref ref;
-  df_ref def;
   for (ref = DF_INSN_UID_DEFS (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!HARD_REGISTER_P (DF_REF_REG (ref)))
-      for (def = DF_REG_DEF_CHAIN (DF_REF_REGNO (ref));
-	   def;
-	   def = DF_REF_NEXT_REG (def))
-	analyze_register_chain (candidates, def);
+      analyze_register_chain (candidates, ref);
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
     if (!DF_REF_REG_MEM_P (ref))
       analyze_register_chain (candidates, ref);
@@ -605,42 +601,6 @@ general_scalar_chain::compute_convert_ga
   return gain;
 }
 
-/* Replace REG in X with a V2DI subreg of NEW_REG.  */
-
-rtx
-general_scalar_chain::replace_with_subreg (rtx x, rtx reg, rtx new_reg)
-{
-  if (x == reg)
-    return gen_rtx_SUBREG (vmode, new_reg, 0);
-
-  /* But not in memory addresses.  */
-  if (MEM_P (x))
-    return x;
-
-  const char *fmt = GET_RTX_FORMAT (GET_CODE (x));
-  int i, j;
-  for (i = GET_RTX_LENGTH (GET_CODE (x)) - 1; i >= 0; i--)
-    {
-      if (fmt[i] == 'e')
-	XEXP (x, i) = replace_with_subreg (XEXP (x, i), reg, new_reg);
-      else if (fmt[i] == 'E')
-	for (j = XVECLEN (x, i) - 1; j >= 0; j--)
-	  XVECEXP (x, i, j) = replace_with_subreg (XVECEXP (x, i, j),
-						   reg, new_reg);
-    }
-
-  return x;
-}
-
-/* Replace REG in INSN with a V2DI subreg of NEW_REG.  */
-
-void
-general_scalar_chain::replace_with_subreg_in_insn (rtx_insn *insn,
-						  rtx reg, rtx new_reg)
-{
-  replace_with_subreg (single_set (insn), reg, new_reg);
-}
-
 /* Insert generated conversion instruction sequence INSNS
    after instruction AFTER.  New BB may be required in case
    instruction has EH region attached.  */
@@ -691,204 +651,147 @@ general_scalar_chain::make_vector_copies
   rtx vreg = gen_reg_rtx (smode);
   df_ref ref;
 
-  for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (!bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-	start_sequence ();
-	if (!TARGET_INTER_UNIT_MOVES_TO_VEC)
-	  {
-	    rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
-	    if (smode == DImode && !TARGET_64BIT)
-	      {
-		emit_move_insn (adjust_address (tmp, SImode, 0),
-				gen_rtx_SUBREG (SImode, reg, 0));
-		emit_move_insn (adjust_address (tmp, SImode, 4),
-				gen_rtx_SUBREG (SImode, reg, 4));
-	      }
-	    else
-	      emit_move_insn (copy_rtx (tmp), reg);
-	    emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
-				    gen_gpr_to_xmm_move_src (vmode, tmp)));
-	  }
-	else if (!TARGET_64BIT && smode == DImode)
-	  {
-	    if (TARGET_SSE4_1)
-	      {
-		emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
-					    CONST0_RTX (V4SImode),
-					    gen_rtx_SUBREG (SImode, reg, 0)));
-		emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
-					      gen_rtx_SUBREG (V4SImode, vreg, 0),
-					      gen_rtx_SUBREG (SImode, reg, 4),
-					      GEN_INT (2)));
-	      }
-	    else
-	      {
-		rtx tmp = gen_reg_rtx (DImode);
-		emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
-					    CONST0_RTX (V4SImode),
-					    gen_rtx_SUBREG (SImode, reg, 0)));
-		emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
-					    CONST0_RTX (V4SImode),
-					    gen_rtx_SUBREG (SImode, reg, 4)));
-		emit_insn (gen_vec_interleave_lowv4si
-			   (gen_rtx_SUBREG (V4SImode, vreg, 0),
-			    gen_rtx_SUBREG (V4SImode, vreg, 0),
-			    gen_rtx_SUBREG (V4SImode, tmp, 0)));
-	      }
-	  }
-	else
-	  emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
-				  gen_gpr_to_xmm_move_src (vmode, reg)));
-	rtx_insn *seq = get_insns ();
-	end_sequence ();
-	rtx_insn *insn = DF_REF_INSN (ref);
-	emit_conversion_insns (seq, insn);
-
-	if (dump_file)
-	  fprintf (dump_file,
-		   "  Copied r%d to a vector register r%d for insn %d\n",
-		   regno, REGNO (vreg), INSN_UID (insn));
-      }
-
-  for (ref = DF_REG_USE_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-	rtx_insn *insn = DF_REF_INSN (ref);
-	replace_with_subreg_in_insn (insn, reg, vreg);
-
-	if (dump_file)
-	  fprintf (dump_file, "  Replaced r%d with r%d in insn %d\n",
-		   regno, REGNO (vreg), INSN_UID (insn));
-      }
-}
-
-/* Convert all definitions of register REGNO
-   and fix its uses.  Scalar copies may be created
-   in case register is used in not convertible insn.  */
-
-void
-general_scalar_chain::convert_reg (unsigned regno)
-{
-  bool scalar_copy = bitmap_bit_p (defs_conv, regno);
-  rtx reg = regno_reg_rtx[regno];
-  rtx scopy = NULL_RTX;
-  df_ref ref;
-  bitmap conv;
-
-  conv = BITMAP_ALLOC (NULL);
-  bitmap_copy (conv, insns);
-
-  if (scalar_copy)
-    scopy = gen_reg_rtx (smode);
+  defs_map.put (reg, vreg);
 
+  /* For each insn defining REGNO, see if it is defined by an insn
+     not part of the chain but with uses in insns part of the chain
+     and insert a copy in that case.  */
   for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
     {
-      rtx_insn *insn = DF_REF_INSN (ref);
-      rtx def_set = single_set (insn);
-      rtx src = SET_SRC (def_set);
-      rtx reg = DF_REF_REG (ref);
+      if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
+	continue;
+      df_link *use;
+      for (use = DF_REF_CHAIN (ref); use; use = use->next)
+	if (!DF_REF_REG_MEM_P (use->ref)
+	    && bitmap_bit_p (insns, DF_REF_INSN_UID (use->ref)))
+	  break;
+      if (!use)
+	continue;
 
-      if (!MEM_P (src))
+      start_sequence ();
+      if (!TARGET_INTER_UNIT_MOVES_TO_VEC)
 	{
-	  replace_with_subreg_in_insn (insn, reg, reg);
-	  bitmap_clear_bit (conv, INSN_UID (insn));
+	  rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
+	  if (smode == DImode && !TARGET_64BIT)
+	    {
+	      emit_move_insn (adjust_address (tmp, SImode, 0),
+			      gen_rtx_SUBREG (SImode, reg, 0));
+	      emit_move_insn (adjust_address (tmp, SImode, 4),
+			      gen_rtx_SUBREG (SImode, reg, 4));
+	    }
+	  else
+	    emit_move_insn (copy_rtx (tmp), reg);
+	  emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
+				  gen_gpr_to_xmm_move_src (vmode, tmp)));
 	}
-
-      if (scalar_copy)
+      else if (!TARGET_64BIT && smode == DImode)
 	{
-	  start_sequence ();
-	  if (!TARGET_INTER_UNIT_MOVES_FROM_VEC)
+	  if (TARGET_SSE4_1)
 	    {
-	      rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
-	      emit_move_insn (tmp, reg);
-	      if (!TARGET_64BIT && smode == DImode)
-		{
-		  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 0),
-				  adjust_address (tmp, SImode, 0));
-		  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 4),
-				  adjust_address (tmp, SImode, 4));
-		}
-	      else
-		emit_move_insn (scopy, copy_rtx (tmp));
+	      emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
+					  CONST0_RTX (V4SImode),
+					  gen_rtx_SUBREG (SImode, reg, 0)));
+	      emit_insn (gen_sse4_1_pinsrd (gen_rtx_SUBREG (V4SImode, vreg, 0),
+					    gen_rtx_SUBREG (V4SImode, vreg, 0),
+					    gen_rtx_SUBREG (SImode, reg, 4),
+					    GEN_INT (2)));
 	    }
-	  else if (!TARGET_64BIT && smode == DImode)
+	  else
 	    {
-	      if (TARGET_SSE4_1)
-		{
-		  rtx tmp = gen_rtx_PARALLEL (VOIDmode,
-					      gen_rtvec (1, const0_rtx));
-		  emit_insn
-		    (gen_rtx_SET
-		       (gen_rtx_SUBREG (SImode, scopy, 0),
-			gen_rtx_VEC_SELECT (SImode,
-					    gen_rtx_SUBREG (V4SImode, reg, 0),
-					    tmp)));
-
-		  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, const1_rtx));
-		  emit_insn
-		    (gen_rtx_SET
-		       (gen_rtx_SUBREG (SImode, scopy, 4),
-			gen_rtx_VEC_SELECT (SImode,
-					    gen_rtx_SUBREG (V4SImode, reg, 0),
-					    tmp)));
-		}
-	      else
-		{
-		  rtx vcopy = gen_reg_rtx (V2DImode);
-		  emit_move_insn (vcopy, gen_rtx_SUBREG (V2DImode, reg, 0));
-		  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 0),
-				  gen_rtx_SUBREG (SImode, vcopy, 0));
-		  emit_move_insn (vcopy,
-				  gen_rtx_LSHIFTRT (V2DImode,
-						    vcopy, GEN_INT (32)));
-		  emit_move_insn (gen_rtx_SUBREG (SImode, scopy, 4),
-				  gen_rtx_SUBREG (SImode, vcopy, 0));
-		}
+	      rtx tmp = gen_reg_rtx (DImode);
+	      emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, vreg, 0),
+					  CONST0_RTX (V4SImode),
+					  gen_rtx_SUBREG (SImode, reg, 0)));
+	      emit_insn (gen_sse2_loadld (gen_rtx_SUBREG (V4SImode, tmp, 0),
+					  CONST0_RTX (V4SImode),
+					  gen_rtx_SUBREG (SImode, reg, 4)));
+	      emit_insn (gen_vec_interleave_lowv4si
+			 (gen_rtx_SUBREG (V4SImode, vreg, 0),
+			  gen_rtx_SUBREG (V4SImode, vreg, 0),
+			  gen_rtx_SUBREG (V4SImode, tmp, 0)));
 	    }
-	  else
-	    emit_move_insn (scopy, reg);
-
-	  rtx_insn *seq = get_insns ();
-	  end_sequence ();
-	  emit_conversion_insns (seq, insn);
-
-	  if (dump_file)
-	    fprintf (dump_file,
-		     "  Copied r%d to a scalar register r%d for insn %d\n",
-		     regno, REGNO (scopy), INSN_UID (insn));
 	}
-    }
-
-  for (ref = DF_REG_USE_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
-    if (bitmap_bit_p (insns, DF_REF_INSN_UID (ref)))
-      {
-	if (bitmap_bit_p (conv, DF_REF_INSN_UID (ref)))
-	  {
-	    rtx_insn *insn = DF_REF_INSN (ref);
+      else
+	emit_insn (gen_rtx_SET (gen_rtx_SUBREG (vmode, vreg, 0),
+				gen_gpr_to_xmm_move_src (vmode, reg)));
+      rtx_insn *seq = get_insns ();
+      end_sequence ();
+      rtx_insn *insn = DF_REF_INSN (ref);
+      emit_conversion_insns (seq, insn);
 
-	    rtx def_set = single_set (insn);
-	    gcc_assert (def_set);
+      if (dump_file)
+	fprintf (dump_file,
+		 "  Copied r%d to a vector register r%d for insn %d\n",
+		 regno, REGNO (vreg), INSN_UID (insn));
+    }
+}
 
-	    rtx src = SET_SRC (def_set);
-	    rtx dst = SET_DEST (def_set);
+/* Copy the definition SRC of INSN inside the chain to DST for
+   scalar uses outside of the chain.  */
 
-	    if (!MEM_P (dst) || !REG_P (src))
-	      replace_with_subreg_in_insn (insn, reg, reg);
+void
+general_scalar_chain::convert_reg (rtx_insn *insn, rtx dst, rtx src)
+{
+  start_sequence ();
+  if (!TARGET_INTER_UNIT_MOVES_FROM_VEC)
+    {
+      rtx tmp = assign_386_stack_local (smode, SLOT_STV_TEMP);
+      emit_move_insn (tmp, src);
+      if (!TARGET_64BIT && smode == DImode)
+	{
+	  emit_move_insn (gen_rtx_SUBREG (SImode, dst, 0),
+			  adjust_address (tmp, SImode, 0));
+	  emit_move_insn (gen_rtx_SUBREG (SImode, dst, 4),
+			  adjust_address (tmp, SImode, 4));
+	}
+      else
+	emit_move_insn (dst, copy_rtx (tmp));
+    }
+  else if (!TARGET_64BIT && smode == DImode)
+    {
+      if (TARGET_SSE4_1)
+	{
+	  rtx tmp = gen_rtx_PARALLEL (VOIDmode,
+				      gen_rtvec (1, const0_rtx));
+	  emit_insn
+	      (gen_rtx_SET
+	       (gen_rtx_SUBREG (SImode, dst, 0),
+		gen_rtx_VEC_SELECT (SImode,
+				    gen_rtx_SUBREG (V4SImode, src, 0),
+				    tmp)));
+
+	  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, const1_rtx));
+	  emit_insn
+	      (gen_rtx_SET
+	       (gen_rtx_SUBREG (SImode, dst, 4),
+		gen_rtx_VEC_SELECT (SImode,
+				    gen_rtx_SUBREG (V4SImode, src, 0),
+				    tmp)));
+	}
+      else
+	{
+	  rtx vcopy = gen_reg_rtx (V2DImode);
+	  emit_move_insn (vcopy, gen_rtx_SUBREG (V2DImode, src, 0));
+	  emit_move_insn (gen_rtx_SUBREG (SImode, dst, 0),
+			  gen_rtx_SUBREG (SImode, vcopy, 0));
+	  emit_move_insn (vcopy,
+			  gen_rtx_LSHIFTRT (V2DImode,
+					    vcopy, GEN_INT (32)));
+	  emit_move_insn (gen_rtx_SUBREG (SImode, dst, 4),
+			  gen_rtx_SUBREG (SImode, vcopy, 0));
+	}
+    }
+  else
+    emit_move_insn (dst, src);
 
-	    bitmap_clear_bit (conv, INSN_UID (insn));
-	  }
-      }
-    /* Skip debug insns and uninitialized uses.  */
-    else if (DF_REF_CHAIN (ref)
-	     && NONDEBUG_INSN_P (DF_REF_INSN (ref)))
-      {
-	gcc_assert (scopy);
-	replace_rtx (DF_REF_INSN (ref), reg, scopy);
-	df_insn_rescan (DF_REF_INSN (ref));
-      }
+  rtx_insn *seq = get_insns ();
+  end_sequence ();
+  emit_conversion_insns (seq, insn);
 
-  BITMAP_FREE (conv);
+  if (dump_file)
+    fprintf (dump_file,
+	     "  Copied r%d to a scalar register r%d for insn %d\n",
+	     REGNO (src), REGNO (dst), INSN_UID (insn));
 }
 
 /* Convert operand OP in INSN.  We should handle
@@ -921,16 +824,6 @@ general_scalar_chain::convert_op (rtx *o
     }
   else if (REG_P (*op))
     {
-      /* We may have not converted register usage in case
-	 this register has no definition.  Otherwise it
-	 should be converted in convert_reg.  */
-      df_ref ref;
-      FOR_EACH_INSN_USE (ref, insn)
-	if (DF_REF_REGNO (ref) == REGNO (*op))
-	  {
-	    gcc_assert (!DF_REF_CHAIN (ref));
-	    break;
-	  }
       *op = gen_rtx_SUBREG (vmode, *op, 0);
     }
   else if (CONST_INT_P (*op))
@@ -975,6 +868,32 @@ general_scalar_chain::convert_op (rtx *o
 void
 general_scalar_chain::convert_insn (rtx_insn *insn)
 {
+  /* Generate copies for out-of-chain uses of defs.  */
+  for (df_ref ref = DF_INSN_DEFS (insn); ref; ref = DF_REF_NEXT_LOC (ref))
+    if (bitmap_bit_p (defs_conv, DF_REF_REGNO (ref)))
+      {
+	df_link *use;
+	for (use = DF_REF_CHAIN (ref); use; use = use->next)
+	  if (DF_REF_REG_MEM_P (use->ref)
+	      || !bitmap_bit_p (insns, DF_REF_INSN_UID (use->ref)))
+	    break;
+	if (use)
+	  convert_reg (insn, DF_REF_REG (ref),
+		       *defs_map.get (regno_reg_rtx [DF_REF_REGNO (ref)]));
+      }
+
+  /* Replace uses in this insn with the defs we use in the chain.  */
+  for (df_ref ref = DF_INSN_USES (insn); ref; ref = DF_REF_NEXT_LOC (ref))
+    if (!DF_REF_REG_MEM_P (ref))
+      if (rtx *vreg = defs_map.get (regno_reg_rtx[DF_REF_REGNO (ref)]))
+	{
+	  /* Also update a corresponding REG_DEAD note.  */
+	  rtx note = find_reg_note (insn, REG_DEAD, DF_REF_REG (ref));
+	  if (note)
+	    XEXP (note, 0) = *vreg;
+	  *DF_REF_REAL_LOC (ref) = *vreg;
+	}
+
   rtx def_set = single_set (insn);
   rtx src = SET_SRC (def_set);
   rtx dst = SET_DEST (def_set);
@@ -988,6 +907,20 @@ general_scalar_chain::convert_insn (rtx_
       emit_conversion_insns (gen_move_insn (dst, tmp), insn);
       dst = gen_rtx_SUBREG (vmode, tmp, 0);
     }
+  else if (REG_P (dst))
+    {
+      /* Replace the definition with a SUBREG to the definition we
+         use inside the chain.  */
+      rtx *vdef = defs_map.get (dst);
+      if (vdef)
+	dst = *vdef;
+      dst = gen_rtx_SUBREG (vmode, dst, 0);
+      /* IRA doesn't like to have REG_EQUAL/EQUIV notes when the SET_DEST
+         is a non-REG_P.  So kill those off.  */
+      rtx note = find_reg_equal_equiv_note (insn);
+      if (note)
+	remove_note (insn, note);
+    }
 
   switch (GET_CODE (src))
     {
@@ -1045,20 +978,15 @@ general_scalar_chain::convert_insn (rtx_
     case COMPARE:
       src = SUBREG_REG (XEXP (XEXP (src, 0), 0));
 
-      gcc_assert ((REG_P (src) && GET_MODE (src) == DImode)
-		  || (SUBREG_P (src) && GET_MODE (src) == V2DImode));
-
-      if (REG_P (src))
-	subreg = gen_rtx_SUBREG (V2DImode, src, 0);
-      else
-	subreg = copy_rtx_if_shared (src);
+      gcc_assert (REG_P (src) && GET_MODE (src) == DImode);
+      subreg = gen_rtx_SUBREG (V2DImode, src, 0);
       emit_insn_before (gen_vec_interleave_lowv2di (copy_rtx_if_shared (subreg),
 						    copy_rtx_if_shared (subreg),
 						    copy_rtx_if_shared (subreg)),
 			insn);
       dst = gen_rtx_REG (CCmode, FLAGS_REG);
-      src = gen_rtx_UNSPEC (CCmode, gen_rtvec (2, copy_rtx_if_shared (src),
-					       copy_rtx_if_shared (src)),
+      src = gen_rtx_UNSPEC (CCmode, gen_rtvec (2, copy_rtx_if_shared (subreg),
+					       copy_rtx_if_shared (subreg)),
 			    UNSPEC_PTEST);
       break;
 
@@ -1217,16 +1145,15 @@ timode_scalar_chain::convert_insn (rtx_i
   df_insn_rescan (insn);
 }
 
+/* Generate copies from defs used by the chain but not defined therein.
+   Also populates defs_map which is used later by convert_insn.  */
+
 void
 general_scalar_chain::convert_registers ()
 {
   bitmap_iterator bi;
   unsigned id;
-
-  EXECUTE_IF_SET_IN_BITMAP (defs, 0, id, bi)
-    convert_reg (id);
-
-  EXECUTE_IF_AND_COMPL_IN_BITMAP (defs_conv, defs, 0, id, bi)
+  EXECUTE_IF_SET_IN_BITMAP (defs_conv, 0, id, bi)
     make_vector_copies (id);
 }
 
Index: gcc/config/i386/i386-features.h
===================================================================
--- gcc/config/i386/i386-features.h	(revision 274843)
+++ gcc/config/i386/i386-features.h	(working copy)
@@ -171,12 +171,11 @@ class general_scalar_chain : public scal
     : scalar_chain (smode_, vmode_) {}
   int compute_convert_gain ();
  private:
+  hash_map<rtx, rtx> defs_map;
   void mark_dual_mode_def (df_ref def);
-  rtx replace_with_subreg (rtx x, rtx reg, rtx subreg);
-  void replace_with_subreg_in_insn (rtx_insn *insn, rtx reg, rtx subreg);
   void convert_insn (rtx_insn *insn);
   void convert_op (rtx *op, rtx_insn *insn);
-  void convert_reg (unsigned regno);
+  void convert_reg (rtx_insn *insn, rtx dst, rtx src);
   void make_vector_copies (unsigned regno);
   void convert_registers ();
   int vector_const_cost (rtx exp);


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]