[PATCH v2] In the pipeline, USE or CLOBBER should delay execution if it starts a new live range.

Jin Ma jinma@linux.alibaba.com
Mon Aug 14 11:22:55 GMT 2023


CLOBBER and USE do not represent real instructions, but during
scheduling they wait in the ready list like other insns, and they
are issued without regard to resource conflicts or cycles. On a
multi-issue CPU this means they can be issued at any time, including
when regular insns have resource conflicts or cannot be issued for
other reasons. As a result, their position is advanced in the
generated insn sequence, which affects register allocation and
often leads to redundant mov instructions.

A simple example:
https://github.com/majin2020/gcc-test/blob/master/test.c
This is a function in the dhrystone benchmark.

https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
This is the log of the 'sched1' pass with -mtune=rocket but issue_rate == 2.

The pipeline is:
;; | insn | prio |
;; |  17  |  3   | r142=a0 alu
;; |  14  |  0   | clobber r136 nothing
;; |  13  |  0   | clobber a0 nothing
;; |  18  |  2   | r143=a1 alu
...
;; |  12  |  0   | a0=r136 alu
;; |  15  |  0   | use a0 nothing

In this log, insns 13 and 14 are scheduled far earlier than the insns that
depend on them (12 and 15), which risks generating redundant mov
instructions and seems unreasonable.

Therefore, I am submitting the patch again, revised according to the
comments from the last review, to try to solve this problem.

https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd#diff-805031b1be5092a2322852a248d0b0f92eef7cad5784a8209f4dfc6221407457L189
This is the diff of the sched1 log after the patch is applied.

The new pipeline is:
;; | insn | prio |
;; |  17  |  3   | r142=a0 alu
...
;; |  10  |  0   | [r144]=r141 alu
;; |  13  |  0   | clobber a0 nothing
;; |  14  |  0   | clobber r136 nothing
;; |  12  |  0   | a0=r136 alu
;; |  15  |  0   | use a0 nothing

gcc/ChangeLog:
	* haifa-sched.cc (use_or_clobber_starts_range_p): New.
	(prune_ready_list): USE or CLOBBER should delay execution
	if it starts a new live range.
---
 gcc/haifa-sched.cc | 55 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 5 deletions(-)
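
For readers who want to try the decision logic outside of GCC, here is a
minimal standalone sketch of the cost choice described above. It is not
GCC code: toy_insn, toy_starts_range_p and toy_cost are hypothetical names,
the dependency lists are modelled as plain counters, and the ready list is
a plain array. Insns that do not start a range simply get cost 0 here,
whereas the real scheduler computes their cost elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for rtx_insn: only what this sketch needs.  */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER };

struct toy_insn
{
  enum toy_code code;  /* Pattern code of the insn.  */
  int n_back_deps;     /* Models the number of entries in SD_LIST_BACK.  */
  int n_forw_deps;     /* Models the number of entries in SD_LIST_FORW.  */
};

/* Models use_or_clobber_starts_range_p: a USE or CLOBBER that other
   insns depend on, but that depends on nothing itself.  */
static bool
toy_starts_range_p (const struct toy_insn *insn)
{
  assert (insn);
  return ((insn->code == TOY_USE || insn->code == TOY_CLOBBER)
          && insn->n_forw_deps > 0
          && insn->n_back_deps == 0);
}

/* Models the cost chosen for ready[i]: 0 means issue now, 1 means
   delay by one cycle.  Assumption: anything that does not start a
   range gets cost 0 in this toy model.  */
static int
toy_cost (const struct toy_insn *ready, size_t n, size_t i,
          bool first_cycle_insn_p)
{
  if (!toy_starts_range_p (&ready[i]))
    return 0;

  /* Not the first insn of the cycle: always delay by one cycle.  */
  if (!first_cycle_insn_p)
    return 1;

  /* First insn of the cycle: delay only if some later entry in the
     ready list is something other than a range-starting USE/CLOBBER.  */
  for (size_t j = i + 1; j < n; j++)
    if (!toy_starts_range_p (&ready[j]))
      return 1;

  return 0;
}
```

With a ready list of { clobber, set, clobber }, the first clobber is
delayed in the first cycle because a regular insn follows it, while the
last clobber is issued immediately because nothing else is left to run
ahead of it. Only later entries are scanned, as in the loop in the patch.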

diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 8e8add709b3..47ad09457c7 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -765,6 +765,23 @@ real_insn_for_shadow (rtx_insn *insn)
   return pair->i1;
 }
 
+/* Return TRUE if INSN (a USE or CLOBBER) starts a new live
+    range, FALSE otherwise.  */
+
+static bool
+use_or_clobber_starts_range_p (rtx_insn *insn)
+{
+  gcc_assert (insn);
+
+  if ((GET_CODE (PATTERN (insn)) == CLOBBER
+       || GET_CODE (PATTERN (insn)) == USE)
+      && !sd_lists_empty_p (insn, SD_LIST_FORW)
+      && sd_lists_empty_p (insn, SD_LIST_BACK))
+    return true;
+
+  return false;
+}
+
 /* For a pair P of insns, return the fixed distance in cycles from the first
    insn after which the second must be scheduled.  */
 static int
@@ -6320,11 +6337,39 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
 	    }
 	  else if (recog_memoized (insn) < 0)
 	    {
-	      if (!first_cycle_insn_p
-		  && (GET_CODE (PATTERN (insn)) == ASM_INPUT
-		      || asm_noperands (PATTERN (insn)) >= 0))
-		cost = 1;
-	      reason = "asm";
+	      if (GET_CODE (PATTERN (insn)) == ASM_INPUT
+		  || asm_noperands (PATTERN (insn)) >= 0)
+		{
+		  reason = "asm";
+		  if (!first_cycle_insn_p)
+		    cost = 1;
+		}
+	      else if (use_or_clobber_starts_range_p (insn))
+		{
+		  /* If USE or CLOBBER opens an active range, its execution should
+		     be delayed so as to be closer to the relevant instructions and
+		     avoid the generation of some redundant mov instructions.
+		     Otherwise, it should be executed as soon as possible.  */
+		  reason = "unrecog insn";
+		  if (!first_cycle_insn_p)
+		    /* If USE or CLOBBER is not in the first cycle, simply delay it
+		       by one cycle.  */
+		    cost = 1;
+		  else
+		    {
+		      /* If the USE or CLOBBER is in the first cycle and there are no
+			 other non-USE or non-CLOBBER instructions after it, we need
+			 to execute it immediately, otherwise we need to execute the
+			 non-USE or non-CLOBBER instructions first and postpone the
+			 execution of the USE or CLOBBER instructions.  */
+		      int j = i;
+		      while (n > ++j)
+			if (!use_or_clobber_starts_range_p (ready_element (&ready, j)))
+			  break;
+
+		      cost = (j == n) ? 0 : 1;
+		    }
+		}
 	    }
 	  else if (sched_pressure != SCHED_PRESSURE_NONE)
 	    {

base-commit: c944ded09595946290778a26794074e69cc65f3e
-- 
2.17.1


