This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[new-regalloc-branch] bunch of things [7/10]
- To: <gcc-patches at gcc dot gnu dot org>
- Subject: [new-regalloc-branch] bunch of things [7/10]
- From: Michael Matz <matzmich at cs dot tu-berlin dot de>
- Date: Tue, 17 Jul 2001 18:46:53 +0200 (MET DST)
- cc: Michael Matz <matzmich at cs dot tu-berlin dot de>
The seventh.
Kill the semi-duplication of visited[] and visit_trace[] by replacing
visited[] with per-block info attached (conceptually) to the first insn of
each basic block.
We need to enter every predecessor of a basic block with the same set of
undefined bits for a use, so save use->undefined before walking the
predecessors and restore it after each one (it gets changed in place).
A change in reload is required, because it handles REG_DEAD notes
incorrectly. In its main pass it first replaces all pseudo-reg rtx's with
their reg_renumber hardregs. This also affects REG_DEAD notes. Then, when
it tries to get a reg_rtx for an input reload, it tries to use hardregs
which are dead in this insn. Unfortunately, with the new regalloc the
REG_DEAD notes really only apply to the pseudo-regs, not to their
reg_renumber equivalents. I.e. a REG_DEAD note for hardreg 0 does not
mean that hardreg 0 really is dead in all cases. I investigated a
bit, but then simply disabled the code in push_reload() that does this.
This all makes 164.gzip (of SPECint2000) compile and run, which completes the
set of benchmarks (except eon, as explained in the first mail). Also,
186.crafty (evaluate.c specifically) needs to be compiled with
-fno-rename-registers (when -O3 is used in CFLAGS). I believe this to be
a bug in regrename.c, but it's only exposed with the new ra. I don't
remember right now what exactly the issue was, but it renamed one reg
chain to a hardreg which clearly was live at that point. So I didn't care
about this one too much.
2001-07-06 Michael Matz <matzmich@cs.tu-berlin.de>
* ra.c : (struct bb_begin_info): New.
(visited): Remove.
(live_in): Before processing predecessors of a block, check whether that
was already done.
Remember use->undefined across multiple predecessor blocks.
(build_web_parts_and_conflicts): Don't allocate/free visited.
Allocate and free a bb_begin_info entry for all basic blocks.
* reload.c : (push_reload): Don't use REG_DEAD notes for finding
a reg_rtx for input reloads.
* reload1.c : (scan_paradoxical_subregs): Add comment about the wrongness
of the reg_max_ref_width[] setting.
--
*** ra.c 2001/07/17 09:24:33 1.16
--- ra.c 2001/07/17 09:25:22 1.18
***************
*** 46,62 ****
* handle REG_NO_CONFLICTS blocks correctly (the current ad hoc approach
might miss some conflicts due to insns which only seem to be in a
REG_NO_CONLICTS block)
! * we really _need_ to handle SUBREGs as only taking one hardreg.
* create definitions of ever-life regs at the beginning of
the insn chain
* create webs for all hardregs, not just those actually defined
(so we later can use that to implement every constraint)
- * insert only one spill per insn and use/def
* insert loads as soon, stores as late as possile
* insert spill insns as outward as possible (either looptree, or LCM)
* reuse stack-slots
* use the frame-pointer, when we can
! * delete coalesced insns
* don't insert hard-regs, but a limited set of pseudo-reg
in emit_colors, and setup reg_renumber accordingly (done, but this
needs reload, which I want to go away)
--- 46,63 ----
* handle REG_NO_CONFLICTS blocks correctly (the current ad hoc approach
might miss some conflicts due to insns which only seem to be in a
REG_NO_CONLICTS block)
! -- Don't necessary anymore, I believe, because SUBREG tracking is
! implemented.
* create definitions of ever-life regs at the beginning of
the insn chain
* create webs for all hardregs, not just those actually defined
(so we later can use that to implement every constraint)
* insert loads as soon, stores as late as possile
* insert spill insns as outward as possible (either looptree, or LCM)
* reuse stack-slots
* use the frame-pointer, when we can
! * delete coalesced insns. Partly done. The rest can only go, when we get
! rid of reload.
* don't insert hard-regs, but a limited set of pseudo-reg
in emit_colors, and setup reg_renumber accordingly (done, but this
needs reload, which I want to go away)
***************
*** 64,76 ****
is possible, as we don't use global liveness
* don't destroy coalescing information completely when spilling
* use the constraints from asms
- * correctly handle SUBREG as being one hardreg on it's
- own, to handle such things:
- (set (subreg:SI (reg:DI 40) 0) (...))
- (set (reg:SI 41) (...))
- where it's clear from constraints, that 40 should go to
- 0 and 41 to 1. For now they conflict for the code below, although
- they don't in reality.
* implement spill coalescing/propagation
* implement optimistic coalescing
*/
--- 65,70 ----
*************** static sbitmap igraph;
*** 364,374 ****
conflicting webs, were only parts of them are in conflict. */
static sbitmap sup_igraph;
! /* XXX use Briggs sparse bitset, or eliminate visited alltogether (by
! marking only block ends; this would work, as we also use
! visit_trace[] for a similar thing. */
static unsigned int visited_pass;
- static unsigned int *visited;
static sbitmap move_handled;
static struct web_part *web_parts;
--- 358,370 ----
conflicting webs, were only parts of them are in conflict. */
static sbitmap sup_igraph;
! struct bb_begin_info
! {
! unsigned int pass;
! unsigned HOST_WIDE_INT undefined;
! void *old_aux;
! };
static unsigned int visited_pass;
static sbitmap move_handled;
static struct web_part *web_parts;
*************** live_out_1 (df, use, insn)
*** 952,958 ****
not be colored in a way which would conflict with
the USE. This is only true for partial overlap,
because only then the DEF and USE have bits in common,
! which makes the DEF move, if the USE moves.
If they have no bits in common (lap == -1), they are
really independent. Therefore we there make a
conflict. */
--- 948,955 ----
not be colored in a way which would conflict with
the USE. This is only true for partial overlap,
because only then the DEF and USE have bits in common,
! which makes the DEF move, if the USE moves, making them
! aligned.
If they have no bits in common (lap == -1), they are
really independent. Therefore we there make a
conflict. */
*************** live_in (df, use, insn)
*** 1035,1041 ****
rtx insn;
{
unsigned int loc_vpass = visited_pass;
- unsigned int *loc_v = visited;
/* Note, that, even _if_ we are called with use->wp a root-part, this might
become non-root in the for() loop below (due to live_out() unioning
--- 1032,1037 ----
*************** live_in (df, use, insn)
*** 1044,1068 ****
while (1)
{
int uid = INSN_UID (insn);
rtx p;
- if (loc_v[uid] == loc_vpass)
- return;
- loc_v[uid] = loc_vpass;
number_seen[uid]++;
p = prev_real_insn (insn);
if (!p)
return;
! if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (p))
{
edge e;
! /* All but the last predecessor are handled recursively. */
! for (e = BLOCK_FOR_INSN (insn)->pred; e && e->pred_next;
! e = e->pred_next)
! if (live_out (df, use, e->src->end))
! live_in (df, use, e->src->end);
! if (!e)
return;
p = e->src->end;
}
if (live_out (df, use, p))
--- 1040,1075 ----
while (1)
{
int uid = INSN_UID (insn);
+ basic_block bb = BLOCK_FOR_INSN (insn);
rtx p;
number_seen[uid]++;
p = prev_real_insn (insn);
if (!p)
return;
! if (bb != BLOCK_FOR_INSN (p))
{
edge e;
! unsigned HOST_WIDE_INT undef = use->undefined;
! struct bb_begin_info *info = (struct bb_begin_info *)bb->aux;
! if ((e = bb->pred) == NULL)
! return;
! /* We now check, if we already traversed the predecessors of this
! block for the current pass and the current set of undefined
! bits. If yes, we don't need to check the predecessors again.
! I.e. conceptually this information is tagged to the first
! insn of a basic block. */
! if (info->pass == loc_vpass && (undef & ~info->undefined) == 0)
return;
+ info->pass = loc_vpass;
+ info->undefined = undef;
+ /* All but the last predecessor are handled recursively. */
+ for (; e->pred_next; e = e->pred_next)
+ {
+ if (live_out (df, use, e->src->end))
+ live_in (df, use, e->src->end);
+ use->undefined = undef;
+ }
p = e->src->end;
}
if (live_out (df, use, p))
*************** build_web_parts_and_conflicts (df)
*** 1115,1120 ****
--- 1122,1128 ----
{
struct df_link *link;
struct curr_use use;
+ int b;
/* Setup copy cache, for copy_insn_p (). */
copy_cache = (struct copy_p_cache *)
*************** build_web_parts_and_conflicts (df)
*** 1122,1129 ****
number_seen = (int *) xcalloc (get_max_uid (), sizeof (int));
visit_trace = (struct visit_trace *) xcalloc (get_max_uid (),
sizeof (visit_trace[0]));
- visited = (unsigned int *) xcalloc (get_max_uid (), sizeof (unsigned int));
/* Here's the main loop.
It goes through all insn's, connects web parts along the way, notes
conflicts between webparts, and remembers move instructions. */
--- 1130,1146 ----
number_seen = (int *) xcalloc (get_max_uid (), sizeof (int));
visit_trace = (struct visit_trace *) xcalloc (get_max_uid (),
sizeof (visit_trace[0]));
+ for (b = 0; b < n_basic_blocks + 2; b++)
+ {
+ basic_block bb = (b == n_basic_blocks) ? ENTRY_BLOCK_PTR :
+ (b == n_basic_blocks + 1) ? EXIT_BLOCK_PTR :
+ BASIC_BLOCK (b);
+ struct bb_begin_info *info = (struct bb_begin_info *) xmalloc (sizeof
+ *info);
+ info->old_aux = bb->aux;
+ bb->aux = (void *)info;
+ }
/* Here's the main loop.
It goes through all insn's, connects web parts along the way, notes
conflicts between webparts, and remembers move instructions. */
*************** build_web_parts_and_conflicts (df)
*** 1145,1157 ****
}
dump_number_seen ();
! free (visited);
free (visit_trace);
free (number_seen);
free (copy_cache);
/* Catch prohibited uses of copy_insn_p () early. */
copy_cache = NULL;
- visited = NULL;
}
/* Handle tricky asm insns. */
--- 1162,1181 ----
}
dump_number_seen ();
! for (b = 0; b < n_basic_blocks + 2; b++)
! {
! basic_block bb = (b == n_basic_blocks) ? ENTRY_BLOCK_PTR :
! (b == n_basic_blocks + 1) ? EXIT_BLOCK_PTR :
! BASIC_BLOCK (b);
! struct bb_begin_info *info = (struct bb_begin_info *) bb->aux;
! bb->aux = info->old_aux;
! free (info);
! }
free (visit_trace);
free (number_seen);
free (copy_cache);
/* Catch prohibited uses of copy_insn_p () early. */
copy_cache = NULL;
}
/* Handle tricky asm insns. */
Index: reload.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/reload.c,v
retrieving revision 1.142
diff -u -c -p -r1.142 reload.c
*** reload.c 2001/01/24 23:50:56 1.142
--- reload.c 2001/07/17 12:07:35
*************** push_reload (in, out, inloc, outloc, cla
*** 1461,1466 ****
--- 1461,1475 ----
if (REG_NOTE_KIND (note) == REG_DEAD
&& GET_CODE (XEXP (note, 0)) == REG
&& (regno = REGNO (XEXP (note, 0))) < FIRST_PSEUDO_REGISTER
+ /* We can't do this with the new regalloc. A REG_DEAD note
+ does not mean, that the hardreg really dies here. It meant,
+ that the pseudo dies here. Still reg_renumber[] was set up
+ for this pseudo, so it was included in the REG_DEAD note
+ and now it looks like the hardreg dies. I'm not sure, what
+ the old allocator did. Either for those pseudos it was
+ reg_renumber[]==-1, or there were no REG_DEAD notes for these.
+ */
+ && 0
&& reg_mentioned_p (XEXP (note, 0), in)
&& ! refers_to_regno_for_reload_p (regno,
(regno
Index: reload1.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/reload1.c,v
retrieving revision 1.255.2.4
diff -u -c -p -r1.255.2.4 reload1.c
*** reload1.c 2001/02/19 16:50:38 1.255.2.4
--- reload1.c 2001/07/17 12:07:41
*************** scan_paradoxical_subregs (x)
*** 3772,3777 ****
--- 3772,3779 ----
case SUBREG:
if (GET_CODE (SUBREG_REG (x)) == REG
&& GET_MODE_SIZE (GET_MODE (x)) > GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))))
+ /* XXX this is not calculating the max width, but instead simply
+ overwriting it. (matz) */
reg_max_ref_width[REGNO (SUBREG_REG (x))]
= GET_MODE_SIZE (GET_MODE (x));
return;