This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Modulo-scheduling improvements.
- From: "Alexander Monakov" <monoid at ispras dot ru>
- To: gcc-patches at gcc dot gnu dot org
- Cc: "Revital1 Eres" <ERES at il dot ibm dot com>, "Ayal Zaks" <ZAKS at il dot ibm dot com>, "Mircea Namolaru" <NAMOLARU at il dot ibm dot com>, "Andrey Belevantsev" <abel at ispras dot ru>, "Dmitry Zhurikhin" <zhur at ispras dot ru>, "Zdenek Dvorak" <rakdver at kam dot mff dot cuni dot cz>, dje at watson dot ibm dot com, andrew_pinski at playstation dot sony dot com
- Date: Mon, 20 Aug 2007 19:26:23 +0400
- Subject: Re: [PATCH] Modulo-scheduling improvements.
- References: <OFD551E32A.189B4CD3-ONC2257331.00526D01-C2257338.005EA442@il.ibm.com>
Please consider the following two patches.
1. Use MAX_ASAP in MAX_II estimation.
Currently SMS estimates maximum initiation interval by multiplying minimal
II estimation by hard-coded constant, MAXII_FACTOR, which is currently
defined as 2. This may sometimes forbid performing SMS on loops which
could benefit from modulo-scheduling. However, MAX_ASAP calculated in
calculate_order_params is equal to length of critical path for the basic
block, so this basic block cannot be scheduled in less than MAX_ASAP ticks
by scheduler not performing inter-iteration code motion. Therefore, we
can use MAX_ASAP as more accurate estimation of maximal II for SMS. The
patch below takes a hybrid approach, allowing II as great as MAXII_FACTOR
* min_ii, even if MAX_ASAP is smaller.
--- gcc/modulo-sched.c (/gcc-local/trunk) (revision 30596)
+++ gcc/modulo-sched.c (/gcc-local/export-ddg) (revision 30596)
@@ -793,6 +803,10 @@ canon_loop (struct loop *loop)
/* Used to calculate the upper bound of ii. */
#define MAXII_FACTOR 2
+/* Maximal ASAP saved for estimating maxII. */
+static int global_max_asap;
+"Revital1 Eres" <ERES@il.ibm.com>,
/* Main entry point, perform SMS scheduling on the loops of the function
that consist of single basic blocks. */
@@ -1019,9 +1033,10 @@ sms_schedule (void)
node_order = XNEWVEC (int, g->num_nodes);
mii = 1; /* Need to pass some estimate of mii. */
+ global_max_asap = 0;
rec_mii = sms_order_nodes (g, mii, node_order);
mii = MAX (res_MII (g), rec_mii);
- maxii = MAXII_FACTOR * mii;
+ maxii = MAX (global_max_asap, MAXII_FACTOR * mii);
fprintf (dump_file, "SMS iis %d %d %d (rec_mii, mii, maxii)\n",
@@ -1681,6 +1703,7 @@ calculate_order_params (ddg_ptr g, int m
+ global_max_asap = max_asap;
2. Change logic for MAX_PRECEDE, MAX_FOLLOW calculation.
These two sbitmaps are used to determine order of instructions that are
scheduled on the same row (and, therefore, in the same instruction
group). However, the way they are calculated is questionable: their
calculation is performed out of the loop that iterates over possible
positions in scheduling window, and uses different clock ticks for
incoming and outgoing arcs; also, all instructions from incoming edges are
marked in must_precede bitmap, and all insns from outgoing edges are
marked in must_follow bitmap, which may lead to same insn marked in both
bitmaps. We propose moving calculation of these bitmaps into loop that
iterates over positions in scheduling window, and fill the bitmaps in the
way so precedence is forced according to anti-dependencies. Patch
These two patches together with fix posted earlier by Andrey Belevantsev
allow fine modulo-scheduling of simple loops on Itanium, where it would
previously fail. To quantify, number of successfully modulo-scheduled
loops (without -fmodulo-sched-allow-regmoves) raises from 32 to 129 on
files from tree-vectorizer testsuite (number of failures lowers from 108
to 21; the difference in totals is due to ICE's in trunk's version of SMS,
which are fixed by Andrey's patch).
Alexander Monakov, Dmitry Zhurikhin
Description: Binary data
Description: Binary data