This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[new-regalloc-branch] bunch of things [7/10]


The seventh.

Kill the semi-duplication of visited[] and visit_trace[] by using per-block
info attached (conceptually) to the first insn of each block.

We need to enter all predecessors of a basic-block with the same set of
undefined bits for a use, so remember use->undefined (because it gets
changed in place).
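
To illustrate the two points above, here is a minimal standalone sketch, not
the ra.c code itself.  The names block_info, use_state, need_visit_preds,
visit_all_preds and toy_visit are made up for this example, and uint64_t
stands in for unsigned HOST_WIDE_INT.  Unlike live_in(), which handles the
last predecessor by iteration, the sketch simply restores the saved value
after every predecessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-block info of the patch.  */
struct block_info
{
  unsigned int pass;    /* Pass in which the predecessors were last entered.  */
  uint64_t undefined;   /* Undefined bits of the use at that time.  */
};

/* Hypothetical stand-in for the part of the use that changes in place.  */
struct use_state
{
  uint64_t undefined;
};

/* Return true if the predecessors of the block still have to be traversed
   for PASS and UNDEF, and record the visit.  A repeated visit in the same
   pass is skipped only if UNDEF has no bit outside the recorded set.  */
static bool
need_visit_preds (struct block_info *info, unsigned int pass, uint64_t undef)
{
  assert (info != NULL);
  if (info->pass == pass && (undef & ~info->undefined) == 0)
    return false;
  info->pass = pass;
  info->undefined = undef;
  return true;
}

typedef void (*visit_fn) (struct use_state *use, int pred);

/* Enter every predecessor with the same set of undefined bits: save the
   set once, and restore it after each visit, because the visit may change
   it in place.  */
static void
visit_all_preds (struct use_state *use, const int *preds, size_t n,
                 visit_fn visit)
{
  const uint64_t undef = use->undefined;
  for (size_t i = 0; i < n; i++)
    {
      visit (use, preds[i]);
      use->undefined = undef;
    }
}

/* Toy visitor: records what it saw on entry, then clobbers the bits.  */
static uint64_t seen_on_entry[8];

static void
toy_visit (struct use_state *use, int pred)
{
  seen_on_entry[pred] = use->undefined;
  use->undefined = 0;
}
```

Without the restore in visit_all_preds(), the second and third predecessor in
this toy setup would be entered with an empty set of undefined bits.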

A change in reload is required, because it handles REG_DEAD notes
incorrectly.  In the main pass it first replaces every pseudo-reg rtx with
its reg_renumber hardreg.  This also includes REG_DEAD notes.  Then, when
it tries to get a reg_rtx for an input reload, it tries to use hardregs
which are dead in this insn.  Unfortunately, with the new regalloc the
REG_DEAD notes really only apply to the pseudo-regs, not to their
reg_renumber equivalents.  I.e. a REG_DEAD note for hardreg 0 does not
mean that hardreg 0 really is dead in all cases.  I did investigate a
bit, but then simply disabled the code in push_reload() doing this.
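
As a toy model of why a rewritten note can be misleading, consider one
hypothetical scenario: two pseudos that were assigned the same hardreg, where
only one of them dies at the insn.  This is an assumption made for
illustration; the exact situation that triggers the problem was not pinned
down.  The names toy_pseudo and hardreg_really_dead are made up and do not
exist in reload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a pseudo after allocation: the hardreg it was renumbered
   to, and whether its value is still needed after the insn in question.  */
struct toy_pseudo
{
  int hardreg;
  bool live_after_insn;
};

/* A hardreg is really dead after the insn only if no pseudo mapped to it
   is still live there.  A REG_DEAD note rewritten from a single pseudo
   cannot establish that on its own.  */
static bool
hardreg_really_dead (const struct toy_pseudo *p, size_t n, int hardreg)
{
  assert (p != NULL);
  for (size_t i = 0; i < n; i++)
    if (p[i].hardreg == hardreg && p[i].live_after_insn)
      return false;
  return true;
}
```

In this model, a REG_DEAD note for the first pseudo turns into a note for
hardreg 0 once reg_renumber is applied, although hardreg 0 may still carry the
second pseudo.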

This all makes 164.gzip (of SPECint2000) compile and run, which completes the
set of benchmarks (besides eon, as explained in the first mail).  Also,
186.crafty (evaluate.c specifically) needs to be compiled with
-fno-rename-registers (when -O3 is used in CFLAGS).  I believe this to be
a bug in regrename.c, but it's only exposed with the new ra.  I don't
remember right now what exactly the issue was, but it renamed one reg
chain to a hardreg which clearly was live at that point.  So I didn't care
about this one too much.


2001-07-06  Michael Matz <matzmich@cs.tu-berlin.de>

	* ra.c : (struct bb_begin_info): New.
	(visited): Remove.
	(live_in): Before processing predecessors of a block, check if that
	wasn't already done.
	Remember use->undefined across multiple predecessor blocks.
	(build_web_parts_and_conflicts): Don't allocate/free visited.
	Allocate and free a bb_begin_info entry for all basic blocks.

	* reload.c : (push_reload): Don't use REG_DEAD notes for finding
	a reg_rtx for input reloads.

	* reload1.c : (scan_paradoxical_subregs): Comment on the wrongness of
	the reg_max_ref_width[] setting.

-- 

*** ra.c	2001/07/17 09:24:33	1.16
--- ra.c	2001/07/17 09:25:22	1.18
***************
*** 46,62 ****
     * handle REG_NO_CONFLICTS blocks correctly (the current ad hoc approach
       might miss some conflicts due to insns which only seem to be in a
       REG_NO_CONLICTS block)
!    * we really _need_ to handle SUBREGs as only taking one hardreg.
     * create definitions of ever-life regs at the beginning of
       the insn chain
     * create webs for all hardregs, not just those actually defined
       (so we later can use that to implement every constraint)
-    * insert only one spill per insn and use/def
     * insert loads as soon, stores as late as possile
     * insert spill insns as outward as possible (either looptree, or LCM)
     * reuse stack-slots
     * use the frame-pointer, when we can
!    * delete coalesced insns
     * don't insert hard-regs, but a limited set of pseudo-reg
       in emit_colors, and setup reg_renumber accordingly (done, but this
       needs reload, which I want to go away)
--- 46,63 ----
     * handle REG_NO_CONFLICTS blocks correctly (the current ad hoc approach
       might miss some conflicts due to insns which only seem to be in a
       REG_NO_CONLICTS block)
!      -- Don't necessary anymore, I believe, because SUBREG tracking is
!      implemented.
     * create definitions of ever-life regs at the beginning of
       the insn chain
     * create webs for all hardregs, not just those actually defined
       (so we later can use that to implement every constraint)
     * insert loads as soon, stores as late as possile
     * insert spill insns as outward as possible (either looptree, or LCM)
     * reuse stack-slots
     * use the frame-pointer, when we can
!    * delete coalesced insns.  Partly done.  The rest can only go, when we get
!      rid of reload.
     * don't insert hard-regs, but a limited set of pseudo-reg
       in emit_colors, and setup reg_renumber accordingly (done, but this
       needs reload, which I want to go away)
***************
*** 64,76 ****
       is possible, as we don't use global liveness
     * don't destroy coalescing information completely when spilling
     * use the constraints from asms
-    * correctly handle SUBREG as being one hardreg on it's
-      own, to handle such things:
-      (set (subreg:SI (reg:DI 40) 0) (...))
-      (set (reg:SI 41) (...))
-      where it's clear from constraints, that 40 should go to
-      0 and 41 to 1.  For now they conflict for the code below, although
-      they don't in reality.
     * implement spill coalescing/propagation
     * implement optimistic coalescing
    */
--- 65,70 ----
*************** static sbitmap igraph;
*** 364,374 ****
     conflicting webs, were only parts of them are in conflict.  */
  static sbitmap sup_igraph;

! /* XXX use Briggs sparse bitset, or eliminate visited alltogether (by
!    marking only block ends; this would work, as we also use
!    visit_trace[] for a similar thing.  */
  static unsigned int visited_pass;
- static unsigned int *visited;
  static sbitmap move_handled;

  static struct web_part *web_parts;
--- 358,370 ----
     conflicting webs, were only parts of them are in conflict.  */
  static sbitmap sup_igraph;

! struct bb_begin_info
! {
!   unsigned int pass;
!   unsigned HOST_WIDE_INT undefined;
!   void *old_aux;
! };
  static unsigned int visited_pass;
  static sbitmap move_handled;

  static struct web_part *web_parts;
*************** live_out_1 (df, use, insn)
*** 952,958 ****
  		       not be colored in a way which would conflict with
  		       the USE.  This is only true for partial overlap,
  		       because only then the DEF and USE have bits in common,
! 		       which makes the DEF move, if the USE moves.
  		       If they have no bits in common (lap == -1), they are
  		       really independent.  Therefore we there make a
  		       conflict.  */
--- 948,955 ----
  		       not be colored in a way which would conflict with
  		       the USE.  This is only true for partial overlap,
  		       because only then the DEF and USE have bits in common,
! 		       which makes the DEF move, if the USE moves, making them
! 		       aligned.
  		       If they have no bits in common (lap == -1), they are
  		       really independent.  Therefore we there make a
  		       conflict.  */
*************** live_in (df, use, insn)
*** 1035,1041 ****
       rtx insn;
  {
    unsigned int loc_vpass = visited_pass;
-   unsigned int *loc_v = visited;

    /* Note, that, even _if_ we are called with use->wp a root-part, this might
       become non-root in the for() loop below (due to live_out() unioning
--- 1032,1037 ----
*************** live_in (df, use, insn)
*** 1044,1068 ****
    while (1)
      {
        int uid = INSN_UID (insn);
        rtx p;
-       if (loc_v[uid] == loc_vpass)
- 	return;
-       loc_v[uid] = loc_vpass;
        number_seen[uid]++;

        p = prev_real_insn (insn);
        if (!p)
  	return;
!       if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (p))
  	{
  	  edge e;
! 	  /* All but the last predecessor are handled recursively.  */
! 	  for (e = BLOCK_FOR_INSN (insn)->pred; e && e->pred_next;
! 	       e = e->pred_next)
! 	    if (live_out (df, use, e->src->end))
! 	      live_in (df, use, e->src->end);
! 	  if (!e)
  	    return;
  	  p = e->src->end;
  	}
        if (live_out (df, use, p))
--- 1040,1075 ----
    while (1)
      {
        int uid = INSN_UID (insn);
+       basic_block bb = BLOCK_FOR_INSN (insn);
        rtx p;
        number_seen[uid]++;

        p = prev_real_insn (insn);
        if (!p)
  	return;
!       if (bb != BLOCK_FOR_INSN (p))
  	{
  	  edge e;
! 	  unsigned HOST_WIDE_INT undef = use->undefined;
! 	  struct bb_begin_info *info = (struct bb_begin_info *)bb->aux;
! 	  if ((e = bb->pred) == NULL)
! 	    return;
! 	  /* We now check, if we already traversed the predecessors of this
! 	     block for the current pass and the current set of undefined
! 	     bits.  If yes, we don't need to check the predecessors again.
! 	     I.e. conceptually this information is tagged to the first
! 	     insn of a basic block.  */
! 	  if (info->pass == loc_vpass && (undef & ~info->undefined) == 0)
  	    return;
+ 	  info->pass = loc_vpass;
+ 	  info->undefined = undef;
+ 	  /* All but the last predecessor are handled recursively.  */
+ 	  for (; e->pred_next; e = e->pred_next)
+ 	    {
+ 	      if (live_out (df, use, e->src->end))
+ 	        live_in (df, use, e->src->end);
+ 	      use->undefined = undef;
+ 	    }
  	  p = e->src->end;
  	}
        if (live_out (df, use, p))
*************** build_web_parts_and_conflicts (df)
*** 1115,1120 ****
--- 1122,1128 ----
  {
    struct df_link *link;
    struct curr_use use;
+   int b;

    /* Setup copy cache, for copy_insn_p ().  */
    copy_cache = (struct copy_p_cache *)
*************** build_web_parts_and_conflicts (df)
*** 1122,1129 ****
    number_seen = (int *) xcalloc (get_max_uid (), sizeof (int));
    visit_trace = (struct visit_trace *) xcalloc (get_max_uid (),
  					      sizeof (visit_trace[0]));
-   visited = (unsigned int *) xcalloc (get_max_uid (), sizeof (unsigned int));

    /* Here's the main loop.
       It goes through all insn's, connects web parts along the way, notes
       conflicts between webparts, and remembers move instructions.  */
--- 1130,1146 ----
    number_seen = (int *) xcalloc (get_max_uid (), sizeof (int));
    visit_trace = (struct visit_trace *) xcalloc (get_max_uid (),
  					      sizeof (visit_trace[0]));

+   for (b = 0; b < n_basic_blocks + 2; b++)
+     {
+       basic_block bb = (b == n_basic_blocks) ? ENTRY_BLOCK_PTR :
+ 	  (b == n_basic_blocks + 1) ? EXIT_BLOCK_PTR :
+ 	  BASIC_BLOCK (b);
+       struct bb_begin_info *info = (struct bb_begin_info *) xmalloc (sizeof
+ 								     *info);
+       info->old_aux = bb->aux;
+       bb->aux = (void *)info;
+     }
    /* Here's the main loop.
       It goes through all insn's, connects web parts along the way, notes
       conflicts between webparts, and remembers move instructions.  */
*************** build_web_parts_and_conflicts (df)
*** 1145,1157 ****
  	}

    dump_number_seen ();
!   free (visited);
    free (visit_trace);
    free (number_seen);
    free (copy_cache);
    /* Catch prohibited uses of copy_insn_p () early.  */
    copy_cache = NULL;
-   visited = NULL;
  }

  /* Handle tricky asm insns.  */
--- 1162,1181 ----
  	}

    dump_number_seen ();
!   for (b = 0; b < n_basic_blocks + 2; b++)
!     {
!       basic_block bb = (b == n_basic_blocks) ? ENTRY_BLOCK_PTR :
! 	  (b == n_basic_blocks + 1) ? EXIT_BLOCK_PTR :
! 	  BASIC_BLOCK (b);
!       struct bb_begin_info *info = (struct bb_begin_info *) bb->aux;
!       bb->aux = info->old_aux;
!       free (info);
!     }
    free (visit_trace);
    free (number_seen);
    free (copy_cache);
    /* Catch prohibited uses of copy_insn_p () early.  */
    copy_cache = NULL;
  }

  /* Handle tricky asm insns.  */
Index: reload.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/reload.c,v
retrieving revision 1.142
diff -u -c -p -r1.142 reload.c
*** reload.c	2001/01/24 23:50:56	1.142
--- reload.c	2001/07/17 12:07:35
*************** push_reload (in, out, inloc, outloc, cla
*** 1461,1466 ****
--- 1461,1475 ----
  	if (REG_NOTE_KIND (note) == REG_DEAD
  	    && GET_CODE (XEXP (note, 0)) == REG
  	    && (regno = REGNO (XEXP (note, 0))) < FIRST_PSEUDO_REGISTER
+ 	    /* We can't do this with the new regalloc.  A REG_DEAD note
+ 	       does not mean, that the hardreg really dies here.  It meant,
+ 	       that the pseudo dies here.  Still reg_renumber[] was set up
+ 	       for this pseudo, so it was included in the REG_DEAD note
+ 	       and now it looks like the hardreg dies.  I'm not sure, what
+ 	       the old allocator did.  Either for those pseudos it was
+ 	       reg_renumber[]==-1, or there were no REG_DEAD notes for these.
+ 	       */
+             && 0
  	    && reg_mentioned_p (XEXP (note, 0), in)
  	    && ! refers_to_regno_for_reload_p (regno,
  					       (regno
Index: reload1.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/reload1.c,v
retrieving revision 1.255.2.4
diff -u -c -p -r1.255.2.4 reload1.c
*** reload1.c	2001/02/19 16:50:38	1.255.2.4
--- reload1.c	2001/07/17 12:07:41
*************** scan_paradoxical_subregs (x)
*** 3772,3777 ****
--- 3772,3779 ----
      case SUBREG:
        if (GET_CODE (SUBREG_REG (x)) == REG
  	  && GET_MODE_SIZE (GET_MODE (x)) > GET_MODE_SIZE (GET_MODE (SUBREG_REG (x))))
+ 	/* XXX this is not calculating the max width, but instead simply
+ 	   overwriting it.  (matz) */
  	reg_max_ref_width[REGNO (SUBREG_REG (x))]
  	  = GET_MODE_SIZE (GET_MODE (x));
        return;

