This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RFA: another version of patch to solve the PR37535


Lin Weiliang reported a large degradation (about 20%) on SPEC2006
omnetpp after the previous version of the patch for PR37535 was
submitted.  That patch was safe, but in some cases it introduced
additional conflicts that were not necessary.  Here is one example:


(insn:HI 11 10 12 2 libs/sim/cenum.cc:157 (parallel [
          (set (reg:SI 72)
              (div:SI (reg:SI 69)
                  (reg:SI 64 [ D.26312 ])))
          (set (reg:SI 71)
              (mod:SI (reg:SI 69)
                  (reg:SI 64 [ D.26312 ])))
          (clobber (reg:CC 17 flags))
      ]) 354 {*divmodsi4_nocltd} (expr_list:REG_DEAD (reg:SI 69)
      (expr_list:REG_UNUSED (reg:SI 72)
          (expr_list:REG_UNUSED (reg:CC 17 flags)
              (nil)))))

and the insn description

(define_insn "*divmodsi4_nocltd"
[(set (match_operand:SI 0 "register_operand" "=&a,?a")
  (div:SI (match_operand:SI 2 "register_operand" "1,0")
      (match_operand:SI 3 "nonimmediate_operand" "rm,rm")))
 (set (match_operand:SI 1 "register_operand" "=&d,&d")
  (mod:SI (match_dup 2) (match_dup 3)))
 (clobber (reg:CC FLAGS_REG))]
"optimize_function_for_speed_p (cfun) && !TARGET_USE_CLTD"
"#"
[(set_attr "type" "multi")])

After the previous patch, IRA makes pseudos 71 and 72 conflict with
pseudo 69 because 71 and 72 are early clobbers.  That results in the
degradation.  The old RA checks conflicts more accurately: it compares
the register classes of the inputs and the early clobbers.  For the
first alternative with operands 0 and 2, or the second alternative
with operands 1 and 2, the classes are AREG and DREG, so the operands
cannot conflict.  For the first alternative with operands 1 and 2,
operand 2 is tied to operand 1 by its matching constraint, so they are
the same register and again cannot conflict.
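
To make the reasoning above concrete, here is a minimal standalone
sketch of the class check applied to *divmodsi4_nocltd.  It is not the
code from the patch: the bit-set register classes, struct op_alt, and
the functions operand_class, conflict_in_alt and needs_conflict are
hypothetical simplifications of recog_op_alt, and the sketch ignores
the cover class of the pseudos, which the real check also consults.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for register classes, modelled as bit sets of
   hard registers.  These are NOT GCC's enum reg_class values.  */
enum { NO_REGS = 0, AREG = 1 << 0, DREG = 1 << 1, GENERAL_REGS = 0xff };

#define N_OPS 4
#define N_ALTS 2

/* Simplified, hypothetical analogue of one recog_op_alt entry.  */
struct op_alt
{
  unsigned cl;        /* class named by the constraint, NO_REGS if tied */
  bool earlyclobber;  /* '&' present in this alternative */
  int matches;        /* tied operand for a digit constraint, else -1 */
};

/* *divmodsi4_nocltd: op0 "=&a,?a", op1 "=&d,&d", op2 "1,0", op3 "rm,rm".  */
static const struct op_alt alts[N_OPS][N_ALTS] = {
  { { AREG, true, -1 },          { AREG, false, -1 } },
  { { DREG, true, -1 },          { DREG, true, -1 } },
  { { NO_REGS, false, 1 },       { NO_REGS, false, 0 } },
  { { GENERAL_REGS, false, -1 }, { GENERAL_REGS, false, -1 } },
};

static const bool is_input[N_OPS] = { false, false, true, true };

/* Class operand OP can get in alternative ALT; a digit constraint
   takes the class of the operand it is tied to.  */
unsigned
operand_class (int op, int alt)
{
  int m = alts[op][alt].matches;
  return m >= 0 ? alts[m][alt].cl : alts[op][alt].cl;
}

/* True if early clobber DEF must conflict with input USE in ALT.  */
bool
conflict_in_alt (int def, int use, int alt)
{
  assert (def >= 0 && def < N_OPS && use >= 0 && use < N_OPS);
  assert (alt >= 0 && alt < N_ALTS);
  if (!alts[def][alt].earlyclobber || !is_input[use])
    return false;
  if (alts[use][alt].matches == def)
    return false;  /* tied to DEF: the same register by construction */
  return (operand_class (def, alt) & operand_class (use, alt)) != 0;
}

/* True if at least one alternative requires the conflict.  */
bool
needs_conflict (int def, int use)
{
  for (int alt = 0; alt < N_ALTS; alt++)
    if (conflict_in_alt (def, use, alt))
      return true;
  return false;
}
```

In this model needs_conflict (0, 2) and needs_conflict (1, 2) are both
false, which corresponds to pseudos 72 and 71 not conflicting with
pseudo 69.  needs_conflict (0, 3) and needs_conflict (1, 3) are true,
because "rm" admits a register from either AREG or DREG.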

The following patch fixes the omnetpp performance degradation by checking early-clobber conflicts as accurately as possible.  It checks the register classes in the insn constraints to avoid unnecessary conflicts.  The old register allocator performs an analogous check (see ra-conflicts.c), although I found that the old RA handles matching-operand constraints (0-9) incorrectly.

The patch was thoroughly tested in the same way as the previous version (x86, x86_64, ppc64, ppc64 in 32-bit mode, itanium, x86 under darwin), and no additional failures were introduced.

Ok to commit?

2008-10-14 Vladimir Makarov <vmakarov@redhat.com>

	PR middle-end/37535
	* ira-lives.c (mark_early_clobbers): Remove.
	(make_pseudo_conflict, check_and_make_def_use_conflict,
	check_and_make_def_conflict,
	make_early_clobber_and_input_conflicts,
	mark_hard_reg_early_clobbers): New functions.
	(process_bb_node_lives): Call
	make_early_clobber_and_input_conflicts and
	mark_hard_reg_early_clobbers.  Make hard register inputs live
	again.

	* doc/rtl.texi (clobber): Change description of RA behaviour for
	early clobbers of pseudo-registers.
	


Index: ira-lives.c
===================================================================
--- ira-lives.c	(revision 141047)
+++ ira-lives.c	(working copy)
@@ -349,39 +349,169 @@ mark_ref_dead (struct df_ref *def)
   mark_reg_dead (reg);
 }
 
-/* Mark early clobber registers of the current INSN as live (if
-   LIVE_P) or dead.  Return true if there are such registers.  */
+/* Make pseudo REG conflicting with pseudo DREG, if the 1st pseudo
+   class is intersected with class CL.  Advance the current program
+   point before making the conflict if ADVANCE_P.  Return TRUE if we
+   will need to advance the current program point.  */
 static bool
-mark_early_clobbers (rtx insn, bool live_p)
+make_pseudo_conflict (rtx reg, enum reg_class cl, rtx dreg, bool advance_p)
 {
-  int alt;
-  int def;
-  struct df_ref **def_rec;
-  bool set_p = false;
+  ira_allocno_t a;
 
-  for (def = 0; def < recog_data.n_operands; def++)
-    {
-      rtx dreg = recog_data.operand[def];
-      
-      if (GET_CODE (dreg) == SUBREG)
-	dreg = SUBREG_REG (dreg);
-      if (! REG_P (dreg))
-	continue;
+  if (GET_CODE (reg) == SUBREG)
+    reg = SUBREG_REG (reg);
+  
+  if (! REG_P (reg) || REGNO (reg) < FIRST_PSEUDO_REGISTER)
+    return advance_p;
+  
+  a = ira_curr_regno_allocno_map[REGNO (reg)];
+  if (! reg_classes_intersect_p (cl, ALLOCNO_COVER_CLASS (a)))
+    return advance_p;
 
-      for (alt = 0; alt < recog_data.n_alternatives; alt++)
-	if ((recog_op_alt[def][alt].earlyclobber)
-	    && (recog_op_alt[def][alt].cl != NO_REGS))
-	  break;
+  if (advance_p)
+    curr_point++;
 
-      if (alt >= recog_data.n_alternatives)
-	continue;
+  mark_reg_live (reg);
+  mark_reg_live (dreg);
+  mark_reg_dead (reg);
+  mark_reg_dead (dreg);
+
+  return false;
+}
 
-      if (live_p)
-	mark_reg_live (dreg);
+/* Check and make if necessary conflicts for pseudo DREG of class
+   DEF_CL of the current insn with input operand USE of class USE_CL.
+   Advance the current program point before making the conflict if
+   ADVANCE_P.  Return TRUE if we will need to advance the current
+   program point.  */
+static bool
+check_and_make_def_use_conflict (rtx dreg, enum reg_class def_cl,
+				 int use, enum reg_class use_cl,
+				 bool advance_p)
+{
+  if (! reg_classes_intersect_p (def_cl, use_cl))
+    return advance_p;
+  
+  advance_p = make_pseudo_conflict (recog_data.operand[use],
+				    use_cl, dreg, advance_p);
+  /* Reload may end up swapping commutative operands, so you
+     have to take both orderings into account.  The
+     constraints for the two operands can be completely
+     different.  (Indeed, if the constraints for the two
+     operands are the same for all alternatives, there's no
+     point marking them as commutative.)  */
+  if (use < recog_data.n_operands - 1
+      && recog_data.constraints[use][0] == '%')
+    advance_p
+      = make_pseudo_conflict (recog_data.operand[use + 1],
+			      use_cl, dreg, advance_p);
+  if (use >= 1
+      && recog_data.constraints[use - 1][0] == '%')
+    advance_p
+      = make_pseudo_conflict (recog_data.operand[use - 1],
+			      use_cl, dreg, advance_p);
+  return advance_p;
+}
+
+/* Check and make if necessary conflicts for definition DEF of class
+   DEF_CL of the current insn with input operands.  Process only
+   constraints of alternative ALT.  */
+static void
+check_and_make_def_conflict (int alt, int def, enum reg_class def_cl)
+{
+  int use, use_match;
+  ira_allocno_t a;
+  enum reg_class use_cl, acl;
+  bool advance_p;
+  rtx dreg = recog_data.operand[def];
+	
+  if (def_cl == NO_REGS)
+    return;
+  
+  if (GET_CODE (dreg) == SUBREG)
+    dreg = SUBREG_REG (dreg);
+  
+  if (! REG_P (dreg) || REGNO (dreg) < FIRST_PSEUDO_REGISTER)
+    return;
+  
+  a = ira_curr_regno_allocno_map[REGNO (dreg)];
+  acl = ALLOCNO_COVER_CLASS (a);
+  if (! reg_classes_intersect_p (acl, def_cl))
+    return;
+  
+  advance_p = true;
+  
+  for (use = 0; use < recog_data.n_operands; use++)
+    {
+      if (use == def || recog_data.operand_type[use] == OP_OUT)
+	continue;
+      
+      if (recog_op_alt[use][alt].anything_ok)
+	use_cl = ALL_REGS;
       else
-	mark_reg_dead (dreg);
-      set_p = true;
+	use_cl = recog_op_alt[use][alt].cl;
+      
+      advance_p = check_and_make_def_use_conflict (dreg, def_cl, use,
+						   use_cl, advance_p);
+      
+      if ((use_match = recog_op_alt[use][alt].matches) >= 0)
+	{
+	  if (use_match == def)
+	    return;
+	  
+	  if (recog_op_alt[use_match][alt].anything_ok)
+	    use_cl = ALL_REGS;
+	  else
+	    use_cl = recog_op_alt[use_match][alt].cl;
+	  advance_p = check_and_make_def_use_conflict (dreg, def_cl, use,
+						       use_cl, advance_p);
+	}
     }
+}
+
+/* Make conflicts of early clobber pseudo registers of the current
+   insn with its inputs.  Avoid introducing unnecessary conflicts by
+   checking classes of the constraints and pseudos because otherwise
+   significant code degradation is possible for some targets.  */
+static void
+make_early_clobber_and_input_conflicts (void)
+{
+  int alt;
+  int def, def_match;
+  enum reg_class def_cl;
+
+  for (alt = 0; alt < recog_data.n_alternatives; alt++)
+    for (def = 0; def < recog_data.n_operands; def++)
+      {
+	def_cl = NO_REGS;
+	if (recog_op_alt[def][alt].earlyclobber)
+	  {
+	    if (recog_op_alt[def][alt].anything_ok)
+	      def_cl = ALL_REGS;
+	    else
+	      def_cl = recog_op_alt[def][alt].cl;
+	    check_and_make_def_conflict (alt, def, def_cl);
+	  }
+	if ((def_match = recog_op_alt[def][alt].matches) >= 0
+	    && (recog_op_alt[def_match][alt].earlyclobber
+		|| recog_op_alt[def][alt].earlyclobber))
+	  {
+	    if (recog_op_alt[def_match][alt].anything_ok)
+	      def_cl = ALL_REGS;
+	    else
+	      def_cl = recog_op_alt[def_match][alt].cl;
+	    check_and_make_def_conflict (alt, def, def_cl);
+	  }
+      }
+}
+
+/* Mark early clobber hard registers of the current INSN as live (if
+   LIVE_P) or dead.  Return true if there are such registers.  */
+static bool
+mark_hard_reg_early_clobbers (rtx insn, bool live_p)
+{
+  struct df_ref **def_rec;
+  bool set_p = false;
 
   for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++)
     if (DF_REF_FLAGS_IS_SET (*def_rec, DF_REF_MUST_CLOBBER))
@@ -792,25 +922,36 @@ process_bb_node_lives (ira_loop_tree_nod
 		}
 	    }
 	  
+	  make_early_clobber_and_input_conflicts ();
+
 	  curr_point++;
 
 	  /* Mark each used value as live.  */
 	  for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
 	    mark_ref_live (*use_rec);
 
-	  set_p = mark_early_clobbers (insn, true);
-
 	  process_single_reg_class_operands (true, freq);
 	  
+	  set_p = mark_hard_reg_early_clobbers (insn, true);
+
 	  if (set_p)
 	    {
-	      mark_early_clobbers (insn, false);
+	      mark_hard_reg_early_clobbers (insn, false);
 
-	      /* Mark each used value as live again.  For example, a
+	      /* Mark each hard reg as live again.  For example, a
 		 hard register can be in clobber and in an insn
 		 input.  */
 	      for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++)
-		mark_ref_live (*use_rec);
+		{
+		  rtx ureg = DF_REF_REG (*use_rec);
+		  
+		  if (GET_CODE (ureg) == SUBREG)
+		    ureg = SUBREG_REG (ureg);
+		  if (! REG_P (ureg) || REGNO (ureg) >= FIRST_PSEUDO_REGISTER)
+		    continue;
+		  
+		  mark_ref_live (*use_rec);
+		}
 	    }
 
 	  curr_point++;
Index: doc/rtl.texi
===================================================================
--- doc/rtl.texi	(revision 140978)
+++ doc/rtl.texi	(working copy)
@@ -2930,11 +2930,12 @@ constituent instructions might not.
 When a @code{clobber} expression for a register appears inside a
 @code{parallel} with other side effects, the register allocator
 guarantees that the register is unoccupied both before and after that
-insn if it is a hard register clobber or the @samp{&} constraint
-is specified for at least one alternative (@pxref{Modifiers}) of the
-clobber.  However, the reload phase may allocate a register used for
-one of the inputs unless the @samp{&} constraint is specified for the
-selected alternative.  You can clobber either a specific hard
+insn if it is a hard register clobber.  For pseudo-register clobber,
+the register allocator and the reload pass do not assign the same hard
+register to the clobber and the input operands if there is an insn
+alternative containing the @samp{&} constraint (@pxref{Modifiers}) for
+the clobber and the hard register is in register classes of the
+clobber in the alternative.  You can clobber either a specific hard
 register, a pseudo register, or a @code{scratch} expression; in the
 latter two cases, GCC will allocate a hard register that is available
 there for use as a temporary.
