This is the mail archive of the
mailing list for the GCC project.
Re: Bug in ra-colorize.c:merge_moves?
- From: "Christian Ehrhardt" <ehrhardt at mathematik dot uni-ulm dot de>
- To: Michael Matz <matz at suse dot de>
- Cc: gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Date: Thu, 27 Feb 2003 17:39:35 +0100
- Subject: Re: Bug in ra-colorize.c:merge_moves?
- References: <email@example.com> <Pine.LNX.firstname.lastname@example.org>
On Tue, Feb 25, 2003 at 05:57:15PM +0100, Michael Matz wrote:
> > this is merge_moves from ra-colorize.c which has somewhat broken list
> > handling IMHO.
> Hmm, I must have been half asleep, when I wrote that. Although note, that
> with the default setting of flags noone will notice that error, as
> web->moves is empty with optimistic coalescing. I'll commit the below
> patch, once bootstrapping/regtesting on i686-linux is done (regalloc
> branch and HEAD). I'm unclear if I need permission for 3.3.
Thanks for the patch. Another mostly unrelated issue:
The behaviour of the register allocator with the following testcase
(and others) is IMHO suboptimal:
extern void byte_store_op2 (int op, unsigned char *loc, int arg1, int arg2);
byte_insert_op2 (op, loc, arg1, arg2, end)
unsigned char *loc;
int arg1, arg2;
unsigned char *end;
register unsigned char *pfrom = end;
register unsigned char *pto = end + 1 + 2 * 2;
while (pfrom != loc)
*--pto = *--pfrom;
byte_store_op2 (op, loc, arg1, arg2);
With -O2 this testcase gives 7 non-precolored webs that all conflict with
each other, i.e. the conflict graph is a complete graph with 7 nodes.
With the current code it takes no less than 8 passes to get this graph
colored. The problem is that the allocator allows coalescing between
spill slots and non spill slots.
With the testcase above (compile with -O2) we decide that we have to
spill some web (Web 0) and insert proper spill code. However, the
first thing that the next pass does is to coalesce all the moves to
and from the newly inserted spill slot. As a consequence we are faced
with almost the same graph again.
This is a huge compile time performance problem. Adding the following
test to aggressive coalesce reduced compile time dramatically (> 80%
on my favourite test case, see below):
RCS file: /cvsroot/gcc/gcc/gcc/ra-colorize.c,v
retrieving revision 188.8.131.52
diff -u -r184.108.40.206 ra-colorize.c
--- ra-colorize.c 1 Feb 2003 17:27:08 -0000 220.127.116.11
+++ ra-colorize.c 27 Feb 2003 16:16:13 -0000
@@ -2683,7 +2683,8 @@
if (s != t
&& t->type != PRECOLORED
&& !TEST_BIT (sup_igraph, s->id * num_webs + t->id)
- && !TEST_BIT (sup_igraph, t->id * num_webs + s->id))
+ && !TEST_BIT (sup_igraph, t->id * num_webs + s->id)
+ && (SPILL_SLOT_P (s->regno) == SPILL_SLOT_P (t->regno)))
if ((s->type == PRECOLORED && ok (t, s))
|| s->type != PRECOLORED)
Here are timeings as reported by -ftime-report with and without the patch.
Testcase is a preprocessed regex.c from libiberty compiled with -O2:
vanilla with patch saved
local alloc 31.21 2.23 92.8%
TOTAL 34.16 5.19 84.8%
I realize that this patch may change the generated code. Do you
have some kind of testsuite that can be used to measure the run
time performance impact of this patch? For those (admittedly few)
test cases that I looked in detail, the same code was generated with
and without the patch.
Finally let me say a few words about my rational:
* We do want to coalesce normal moves. In this case SPILL_SLOT_P is
zero for both webs involved.
* We do want to coalesce moves between different spill slots to avoid
unnecessary mem-mem moves, in this case SPILL_SLOT_P is one for both
* I really think we don't want to coalesce moves that result from spills.
These moves normally move a non SPILL_SLOT_P register to or from a
THAT'S ALL FOLKS!