This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

PATCH to be committed: Fix for 4.1 regression 13931


PR 13931 describes a compile-time regression in distribute_notes in
combine.c.  In the test case in the patch the compiler spends over 50%
of its time in the combine pass.  The regression is due to this patch:
    http://gcc.gnu.org/ml/gcc-patches/2003-05/msg01358.html

2003-05-14  Eric Christopher  <echristo@redhat.com>

	* combine.c: Fix header comments.
	(distribute_notes): Remove usage of elim_i1, elim_i2. Propagate
	to all calls and prototype.

That patch was to fix this bug report on hppa64-hp-hpux11:
    http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00796.html
which is described in more detail here:
    http://gcc.gnu.org/ml/gcc-patches/2003-05/msg00813.html

Before Eric's patch above, combine had an optimization in which if it
saw a REG_DEAD note for a register which was set in an insn which it
was eliminating, it would simply discard the REG_DEAD note.  In other
cases combine would search for an insn to which to attach the REG_DEAD
note.  That optimization was misfiring in the test case on
hppa64-hp-hpux11, and eliminating a REG_DEAD note which was actually
meaningful.  Eric fixed the bug by eliminating the optimization.

However, as can be seen in PR 13931, eliminating the optimization
causes a significant compile time regression in some cases (basically,
programs with large basic blocks in which combine is able to combine
insns which set registers with insns which use them).  So it is
preferable to not eliminate the optimization entirely, but to instead
fix the optimization so that it does not apply inappropriately.

This turns out to be easy.  The optimization was misapplied to this
insn:
    (set (zero_extract:DI (subreg:DI (reg:SI 69) 0)
          (const_int 32 [0x20])
          (const_int 32 [0x20]))
         (reg:DI 68 [ u ]))
because combine reduced that insn to
    (set (reg:SI 69) (subreg:SI (reg:DI 68) 4))
This caused combine to assume that it was OK to eliminate the REG_DEAD
notice for reg 69, because it is set completely in the insn.  The
problem is that there was a previous insn which initialized register
69 prior to the set of the zero_extract, and there was now no REG_DEAD
note for that register setting.  In earlier versions of gcc that
caused an abort in register allocation.  The current register
allocator handles the situation correctly, but this is still
undesirable as we don't really want to allocate a register which is
not used.

Fixing this is simply a matter of detecting that reg 69 does not
appears to be completely set in the original insn--that is, before
combine's simplifications, reg 69 required some sort of
initialization.

This patch reverts Eric's patch to go back to the original code, and
adds some checks to avoid the erroneous case.  J. David Anglin tested
this for me on hppa64-hp-hpux11, and reported that it did not appear
to introduce any new problems, although unfortunately hppa64-hp-hpux11
is not currently bootstrapping.  I ran a bootstrap and testsuite run
on i686-pc-linux-gnu.

If nobody objects I plan to commit this patch tomorrow.

Ian


2005-10-10  Ian Lance Taylor  <ian@airs.com>

	PR rtl-optimization/13931
	* combine.c: Revert patch of 2003-05-14, and:
	(try_combine): Only set elim_i1 and elim_i2 if the destination is
	completely killed in the appropriate insn.
	(distribute_notes): Don't skip multiple hard register test for
	elim_i1 and elim_i2.


Index: combine.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/combine.c,v
retrieving revision 1.504
diff -p -u -r1.504 combine.c
--- combine.c	26 Sep 2005 17:42:16 -0000	1.504
+++ combine.c	9 Oct 2005 15:34:44 -0000
@@ -53,6 +53,10 @@ Software Foundation, 51 Franklin Street,
    flow.c aren't completely updated:
 
    - reg_live_length is not updated
+   - reg_n_refs is not adjusted in the rare case when a register is
+     no longer required in a computation
+   - there are extremely rare cases (see distribute_regnotes) when a
+     REG_DEAD note is lost
    - a LOG_LINKS entry that refers to an insn with multiple SETs may be
      removed because there is no way to know which register it was
      linking
@@ -410,7 +414,7 @@ static void reg_dead_at_p_1 (rtx, rtx, v
 static int reg_dead_at_p (rtx, rtx);
 static void move_deaths (rtx, rtx, int, rtx, rtx *);
 static int reg_bitfield_target_p (rtx, rtx);
-static void distribute_notes (rtx, rtx, rtx, rtx);
+static void distribute_notes (rtx, rtx, rtx, rtx, rtx, rtx);
 static void distribute_links (rtx);
 static void mark_used_regs_combine (rtx);
 static int insn_cuid (rtx);
@@ -1730,6 +1734,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
   rtx i2pat;
   /* Indicates if I2DEST or I1DEST is in I2SRC or I1_SRC.  */
   int i2dest_in_i2src = 0, i1dest_in_i1src = 0, i2dest_in_i1src = 0;
+  int i2dest_killed = 0, i1dest_killed = 0;
   int i1_feeds_i3 = 0;
   /* Notes that must be added to REG_NOTES in I3 and I2.  */
   rtx new_i3_notes, new_i2_notes;
@@ -1838,6 +1843,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 
 	      added_sets_2 = added_sets_1 = 0;
 	      i2dest = SET_SRC (PATTERN (i3));
+	      i2dest_killed = dead_or_set_p (i2, i2dest);
 
 	      /* Replace the dest in I2 with our dest and make the resulting
 		 insn the new pattern for I3.  Then skip to where we
@@ -1912,6 +1918,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
       subst_low_cuid = INSN_CUID (i2);
       added_sets_2 = added_sets_1 = 0;
       i2dest = SET_DEST (temp);
+      i2dest_killed = dead_or_set_p (i2, i2dest);
 
       SUBST (SET_SRC (temp),
 	     immed_double_const (lo, hi, GET_MODE (SET_DEST (temp))));
@@ -1982,6 +1989,8 @@ try_combine (rtx i3, rtx i2, rtx i1, int
   i2dest_in_i2src = reg_overlap_mentioned_p (i2dest, i2src);
   i1dest_in_i1src = i1 && reg_overlap_mentioned_p (i1dest, i1src);
   i2dest_in_i1src = i1 && reg_overlap_mentioned_p (i2dest, i1src);
+  i2dest_killed = dead_or_set_p (i2, i2dest);
+  i1dest_killed = i1 && dead_or_set_p (i1, i1dest);
 
   /* See if I1 directly feeds into I3.  It does if I1DEST is not used
      in I2SRC.  */
@@ -2737,7 +2746,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 	  REG_N_DEATHS (REGNO (XEXP (note, 0)))++;
 
       distribute_notes (new_other_notes, undobuf.other_insn,
-			undobuf.other_insn, NULL_RTX);
+			undobuf.other_insn, NULL_RTX, NULL_RTX, NULL_RTX);
     }
 #ifdef HAVE_cc0
   /* If I2 is the CC0 setter and I3 is the CC0 user then check whether
@@ -2812,6 +2821,17 @@ try_combine (rtx i3, rtx i2, rtx i1, int
     rtx i3links, i2links, i1links = 0;
     rtx midnotes = 0;
     unsigned int regno;
+    /* Compute which registers we expect to eliminate.  newi2pat may be setting
+       either i3dest or i2dest, so we must check it.  Also, i1dest may be the
+       same as i3dest, in which case newi2pat may be setting i1dest.  */
+    rtx elim_i2 = ((newi2pat && reg_set_p (i2dest, newi2pat))
+		   || i2dest_in_i2src || i2dest_in_i1src
+		   || !i2dest_killed
+		   ? 0 : i2dest);
+    rtx elim_i1 = (i1 == 0 || i1dest_in_i1src
+		   || (newi2pat && reg_set_p (i1dest, newi2pat))
+		   || !i1dest_killed
+		   ? 0 : i1dest);
 
     /* Get the old REG_NOTES and LOG_LINKS from all our insns and
        clear them.  */
@@ -2936,13 +2956,17 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 
     /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
     if (i3notes)
-      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL_RTX);
+      distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL_RTX,
+			elim_i2, elim_i1);
     if (i2notes)
-      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL_RTX);
+      distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL_RTX,
+			elim_i2, elim_i1);
     if (i1notes)
-      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL_RTX);
+      distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL_RTX,
+			elim_i2, elim_i1);
     if (midnotes)
-      distribute_notes (midnotes, NULL_RTX, i3, newi2pat ? i2 : NULL_RTX);
+      distribute_notes (midnotes, NULL_RTX, i3, newi2pat ? i2 : NULL_RTX,
+			elim_i2, elim_i1);
 
     /* Distribute any notes added to I2 or I3 by recog_for_combine.  We
        know these are REG_UNUSED and want them to go to the desired insn,
@@ -2955,7 +2979,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 	  if (REG_P (XEXP (temp, 0)))
 	    REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;
 
-	distribute_notes (new_i2_notes, i2, i2, NULL_RTX);
+	distribute_notes (new_i2_notes, i2, i2, NULL_RTX, NULL_RTX, NULL_RTX);
       }
 
     if (new_i3_notes)
@@ -2964,7 +2988,7 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 	  if (REG_P (XEXP (temp, 0)))
 	    REG_N_DEATHS (REGNO (XEXP (temp, 0)))++;
 
-	distribute_notes (new_i3_notes, i3, i3, NULL_RTX);
+	distribute_notes (new_i3_notes, i3, i3, NULL_RTX, NULL_RTX, NULL_RTX);
       }
 
     /* If I3DEST was used in I3SRC, it really died in I3.  We may need to
@@ -2982,11 +3006,12 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 	if (newi2pat && reg_set_p (i3dest_killed, newi2pat))
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i3dest_killed,
 					       NULL_RTX),
-			    NULL_RTX, i2, NULL_RTX);
+			    NULL_RTX, i2, NULL_RTX, elim_i2, elim_i1);
 	else
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i3dest_killed,
 					       NULL_RTX),
-			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX);
+			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX,
+			    elim_i2, elim_i1);
       }
 
     if (i2dest_in_i2src)
@@ -2996,10 +3021,11 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 
 	if (newi2pat && reg_set_p (i2dest, newi2pat))
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i2dest, NULL_RTX),
-			    NULL_RTX, i2, NULL_RTX);
+			    NULL_RTX, i2, NULL_RTX, NULL_RTX, NULL_RTX);
 	else
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i2dest, NULL_RTX),
-			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX);
+			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX,
+			    NULL_RTX, NULL_RTX);
       }
 
     if (i1dest_in_i1src)
@@ -3009,10 +3035,11 @@ try_combine (rtx i3, rtx i2, rtx i1, int
 
 	if (newi2pat && reg_set_p (i1dest, newi2pat))
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i1dest, NULL_RTX),
-			    NULL_RTX, i2, NULL_RTX);
+			    NULL_RTX, i2, NULL_RTX, NULL_RTX, NULL_RTX);
 	else
 	  distribute_notes (gen_rtx_EXPR_LIST (REG_DEAD, i1dest, NULL_RTX),
-			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX);
+			    NULL_RTX, i3, newi2pat ? i2 : NULL_RTX,
+			    NULL_RTX, NULL_RTX);
       }
 
     distribute_links (i3links);
@@ -11906,11 +11933,16 @@ reg_bitfield_target_p (rtx x, rtx body)
    as appropriate.  I3 and I2 are the insns resulting from the combination
    insns including FROM (I2 may be zero).
 
+   ELIM_I2 and ELIM_I1 are either zero or registers that we know will
+   not need REG_DEAD notes because they are being substituted for.  This
+   saves searching in the most common cases.
+
    Each note in the list is either ignored or placed on some insns, depending
    on the type of note.  */
 
 static void
-distribute_notes (rtx notes, rtx from_insn, rtx i3, rtx i2)
+distribute_notes (rtx notes, rtx from_insn, rtx i3, rtx i2, rtx elim_i2,
+		  rtx elim_i1)
 {
   rtx note, next_note;
   rtx tem;
@@ -12188,6 +12220,11 @@ distribute_notes (rtx notes, rtx from_in
 		   && reg_referenced_p (XEXP (note, 0), PATTERN (i2)))
 	    place = i2;
 
+	  if (place == 0
+	      && (rtx_equal_p (XEXP (note, 0), elim_i2)
+		  || rtx_equal_p (XEXP (note, 0), elim_i1)))
+	    break;
+
 	  if (place == 0)
 	    {
 	      basic_block bb = this_basic_block;
@@ -12249,7 +12286,8 @@ distribute_notes (rtx notes, rtx from_in
 			  PATTERN (tem) = pc_rtx;
 			  REG_NOTES (tem) = NULL;
 
-			  distribute_notes (old_notes, tem, tem, NULL_RTX);
+			  distribute_notes (old_notes, tem, tem, NULL_RTX,
+					    NULL_RTX, NULL_RTX);
 			  distribute_links (LOG_LINKS (tem));
 
 			  SET_INSN_DELETED (tem);
@@ -12263,7 +12301,8 @@ distribute_notes (rtx notes, rtx from_in
 			      REG_NOTES (cc0_setter) = NULL;
 
 			      distribute_notes (old_notes, cc0_setter,
-						cc0_setter, NULL_RTX);
+						cc0_setter, NULL_RTX,
+						NULL_RTX, NULL_RTX);
 			      distribute_links (LOG_LINKS (cc0_setter));
 
 			      SET_INSN_DELETED (cc0_setter);
@@ -12398,7 +12437,7 @@ distribute_notes (rtx notes, rtx from_in
 				= gen_rtx_EXPR_LIST (REG_DEAD, piece, NULL_RTX);
 
 			      distribute_notes (new_note, place, place,
-						NULL_RTX);
+						NULL_RTX, NULL_RTX, NULL_RTX);
 			    }
 			  else if (! refers_to_regno_p (i, i + 1,
 							PATTERN (place), 0)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]