This is the mail archive of the egcs@egcs.cygnus.com mailing list for the EGCS project. See the EGCS home page for more information.


[Date Prev][Date Next][Thread Prev][Thread Next]
[Date Index] [Subject Index] [Author Index] [Thread Index]

Re: strength reduction example



> The current code always does this:
>             VARRAY_CHAR (may_not_optimize, new_regno) = 0;
> The patched code only does this if an increment is eliminated.  I don't
> understand how that can be right unless the current code is actually wrong.

Yes, this is really only important when the increment is converted but
not eliminated (because only then is the register used at all), but best
done unconditionally.  I changed the code accordingly.

> The patch calls validate_subst for every instruction, regardless of whether
> it uses old_reg.  validate_subst calls subst and recog for every instruction
> regardless of whether any change was made.  This seems wasteful.  Shouldn't
> we call reg_mentioned_p first, and if it returns true, only then call
> validate_subst?  That might not work if there is a REG_EQUAL note that
> refers to old_reg but the insn doesn't.  Perhaps the reg_mentioned_p check
> should be in validate_subst?

Ok.

> Since we really want to create address givs, maybe we should only do this
> substitution inside MEMs?  Will something useful happen if it gets replaced
> elsewhere?  I guess that will result in creation of a DEST_REG giv, so it
> should still work.

It can change DEST_REG givs using add or lea, and on processors like the
m68k it might even create the odd pea.

> The code in validate_subst to set previous_undos from undos looks unnecessary.
> previous_undos is only used if we call subst multiple times.  If we want
> to undo the effect of the last subst call, then we can restore undos from
> previous_undos.  This doesn't apply to validate_subst.

It is necessary because gen_rtx_combine looks at previous_undos.

> validate_subst assumes that undobuf.undos has some reasonable value at
> the beginning.  I don't think that is a safe assumption.  You should set
> undobuf.undos to zero at the beginning.

Yes.

> Why do you call validate_change in validate_subst?  It would be simpler to
> call recog_insn.  You could then get rid of 3 other statements that fiddle
> wtih PATTERN (insn).  The only reason I see for this is because validate_change
> tries adding/removing clobbers.  In that case, you should call
> recog_for_combine.  It looks safe, and will probably fix a bug, since
> recog_for_combine looks for the (clobber (const_int 0)) stuff that
> combine sometimes creates for invalid substitutions.  These clobbers would
> just be ignored by validate_change.

Originally the ida was that you can use the in_group mechanism of
validate_change - but that wouldn't actually work since subst does
destructive changes.  But yes, the clobber handling is better than calling
recog directly.
I can't use recog_for_combine because it uses data flow information.
In oparticular, reg_dead_at_p looks at basic_block_live_at_start[block].

> validate_subst only tries to fix a REG_EQUAL note if validate_change
> succeeded.  But what if we have an insn that doesn't use the register
> but has a REG_EQUAL note that does?  This could happen for a libcall
> sequence.  It might be possible in other cases.  In loop, the
> DEST_REG giv case immediately after the validate_subst call handles
> reg notes independently.  Either the code in validate_subst is wrong,
> or the code in loop is unnecessarily inefficient.  I think the validate_subst
> code is wrong.

validate_change will succeed also if the register is not mentioned,
since the pattern was forced to be different.

The code in loop handles a different substitution, namely substituting
the biv with a DEST_REG giv.

In the meantime Jeffrey Law has sent me a testcase that pointed out a bug
in the code that I was patching: if a DEST_REG giv is created, we need
v->insn and last_use_insn to have valid luids.  There was no provision to
guarantee this, so that we can get failures when an inner loop has been
unrolled, and we then process the outer loop which contains a number of
insns with uids > max_uid_for_loop.

This has made a change of plans necessary: it must be possible to undo
all substitutions if we'd otherwise end up with a DEST_REG giv without
valid LUIDs.

Sat Feb 27 00:56:34 1999  J"orn Rennecke <amylaar@cygnus.co.uk>

        * rtl.h (validate_subst, validate_subst_start, validate_subst_undo):
	Declare.
        * loop.c (strength_reduce): In code to convert biv increments to givs,
	try to create address givs.
        * combine.c (undo_last): New functions.
        (validate_subst, validate_subst_start, validate_subst_undo): Likewise.
        (combine_instructions): Before returning, clear reg_last_set_value.
        (nonzero_bits, num_sign_bit_copies, reg_last_value):
        Handle case when reg_last_set_value is not set.

Index: rtl.h
===================================================================
RCS file: /cvs/cvsfiles/devo/gcc/rtl.h,v
retrieving revision 1.105.2.3
diff -p -r1.105.2.3 rtl.h
*** rtl.h	1999/02/25 19:28:24	1.105.2.3
--- rtl.h	1999/02/27 00:56:26
*************** extern void add_clobbers		PROTO ((rtx, i
*** 1369,1374 ****
--- 1369,1377 ----
  extern void combine_instructions	PROTO ((rtx, int));
  extern int extended_count		PROTO ((rtx, enum machine_mode, int));
  extern rtx remove_death			PROTO ((int, rtx));
+ extern int validate_subst		PROTO((rtx, rtx, rtx));
+ extern void validate_subst_start	PROTO((void));
+ extern void validate_subst_undo		PROTO((void));
  #ifdef BUFSIZ
  extern void dump_combine_stats		PROTO ((FILE *));
  extern void dump_combine_total_stats	PROTO ((FILE *));
Index: loop.c
===================================================================
RCS file: /cvs/cvsfiles/devo/gcc/loop.c,v
retrieving revision 1.166.2.37
diff -p -r1.166.2.37 loop.c
*** loop.c	1999/02/25 19:28:25	1.166.2.37
--- loop.c	1999/02/27 00:56:28
*************** strength_reduce (scan_start, end, loop_t
*** 4162,4168 ****
  	  for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;)
  	    {
  	      HOST_WIDE_INT offset;
! 	      rtx set, add_val, old_reg, dest_reg, last_use_insn;
  	      int old_regno, new_regno;
  
  	      if (! v->always_executed
--- 4195,4201 ----
  	  for (vp = &bl->biv, next = *vp; v = next, next = v->next_iv;)
  	    {
  	      HOST_WIDE_INT offset;
! 	      rtx set, src, add_val, old_reg, dest_reg, last_use_insn;
  	      int old_regno, new_regno;
  
  	      if (! v->always_executed
*************** strength_reduce (scan_start, end, loop_t
*** 4182,4187 ****
--- 4215,4222 ----
  	      add_val = plus_constant (next->add_val, offset);
  	      old_reg = v->dest_reg;
  	      dest_reg = gen_reg_rtx (v->mode);
+ 	      old_regno = REGNO (old_reg);
+ 	      new_regno = REGNO (dest_reg);
      
  	      /* Unlike reg_iv_type / reg_iv_info, the other three arrays
  		 have been allocated with some slop space, so we may not
*************** strength_reduce (scan_start, end, loop_t
*** 4194,4208 ****
  		  VARRAY_GROW (n_times_set, nregs);
  		  VARRAY_GROW (may_not_optimize, nregs);
  		}
      
! 	      validate_change (v->insn, &SET_DEST (set), dest_reg, 1);
! 	      validate_change (next->insn, next->location, add_val, 1);
! 	      if (! apply_change_group ())
  		{
  		  vp = &v->next_iv;
  		  continue;
  		}
  	      next->add_val = add_val;
  	      v->dest_reg = dest_reg;
  	      v->giv_type = DEST_REG;
  	      v->location = &SET_SRC (set);
--- 4229,4306 ----
  		  VARRAY_GROW (n_times_set, nregs);
  		  VARRAY_GROW (may_not_optimize, nregs);
  		}
+ 	      VARRAY_CHAR (may_not_optimize, new_regno) = 0;
      
! 	      if (! validate_change (next->insn, next->location, add_val, 0))
  		{
  		  vp = &v->next_iv;
  		  continue;
  		}
+ 
+ 	      src = SET_SRC (set);
+ 	      /* Try to replace all uses of OLD_REG with SRC.  This will
+ 		 mostly win when it generates / changes address givs, but it
+ 		 might also change some DEST_REG givs or create the odd
+ 		 PEA on an 68k.  */
+ 	      last_use_insn = NULL_RTX;
+ 	      validate_subst_start ();
+ 	      for (p = NEXT_INSN (v->insn); p != next->insn; p = NEXT_INSN (p))
+ 		{
+ 		  rtx newpat;
+ 
+ 		  if (GET_RTX_CLASS (GET_CODE (p)) != 'i')
+ 		    continue;
+ 		  if (! validate_subst (p, old_reg, src))
+ 		    last_use_insn = p;
+ 		}
+ 	      /* If some uses remain, we'd like to make this a DEST_REG
+ 		 giv.  However, after loop unrolling, V->INSN or LAST_USE_INSN
+ 		 might have no valid luid.  We need these not only for
+ 		 calculating the lifetime now, but also in recombine_givs when
+ 		 doing giv derivation, to find givs with non-overlapping
+ 		 lifetimes.  So if we don't have LUIDs available, or if we
+ 		 can't calculate the giv, leave the biv increment alone.  */
+ 	      if (last_use_insn
+ 		  && (INSN_UID (v->insn) >= max_uid_for_loop
+ 		      || INSN_UID (last_use_insn) >= max_uid_for_loop
+ 		      || ! validate_change (v->insn, &SET_DEST (set),
+ 					    dest_reg, 0)))
+ 		{
+ 		  /* Change the increment at NEXT back to what it was.  */
+ 		  if (! validate_change (next->insn, next->location,
+ 		      next->add_val, 0))
+ 		    abort ();
+ 
+ 		  /* Undo all the substitutions made by validate_subst above,
+ 		     since the biv does hold the incremented value after
+ 		     all.  */
+ 		  validate_subst_undo ();
+ 
+ 		  vp = &v->next_iv;
+ 		  continue;
+ 		}
+ 
+ 	      /* Remove the increment from the list of biv increments.  */
+ 	      *vp = next;
+ 	      bl->biv_count--;
+ 	      VARRAY_INT (set_in_loop, old_regno)--;
+ 	      VARRAY_INT (n_times_set, old_regno)--;
  	      next->add_val = add_val;
+ 
+ 	      if (! last_use_insn)
+ 		{
+ 		  if (loop_dump_stream)
+ 		    fprintf (loop_dump_stream,
+ 			     "Increment %d of biv %d eliminated.\n\n",
+ 			 INSN_UID (v->insn), old_regno);
+ 		  PUT_CODE (v->insn, NOTE);
+ 		  NOTE_LINE_NUMBER (v->insn) = NOTE_INSN_DELETED;
+ 		  NOTE_SOURCE_FILE (v->insn) = 0;
+ 		  VARRAY_INT (set_in_loop, new_regno) = 0;
+ 		  VARRAY_INT (n_times_set, new_regno) = 0;
+ 		  continue;
+ 		}
+ 
  	      v->dest_reg = dest_reg;
  	      v->giv_type = DEST_REG;
  	      v->location = &SET_SRC (set);
*************** strength_reduce (scan_start, end, loop_t
*** 4224,4244 ****
  	      v->replaceable = 1;
  	      v->no_const_addval = 0;
      
- 	      old_regno = REGNO (old_reg);
- 	      new_regno = REGNO (dest_reg);
- 	      VARRAY_INT (set_in_loop, old_regno)--;
  	      VARRAY_INT (set_in_loop, new_regno) = 1;
- 	      VARRAY_INT (n_times_set, old_regno)--;
  	      VARRAY_INT (n_times_set, new_regno) = 1;
- 	      VARRAY_CHAR (may_not_optimize, new_regno) = 0;
      
  	      REG_IV_TYPE (new_regno) = GENERAL_INDUCT;
  	      REG_IV_INFO (new_regno) = v;
      
! 	      /* Remove the increment from the list of biv increments,
! 		 and record it as a giv.  */
! 	      *vp = next;
! 	      bl->biv_count--;
  	      v->next_iv = bl->giv;
  	      bl->giv = v;
  	      bl->giv_count++;
--- 4322,4334 ----
  	      v->replaceable = 1;
  	      v->no_const_addval = 0;
      
  	      VARRAY_INT (set_in_loop, new_regno) = 1;
  	      VARRAY_INT (n_times_set, new_regno) = 1;
      
  	      REG_IV_TYPE (new_regno) = GENERAL_INDUCT;
  	      REG_IV_INFO (new_regno) = v;
      
! 	      /* Record V as a giv.  */
  	      v->next_iv = bl->giv;
  	      bl->giv = v;
  	      bl->giv_count++;
Index: combine.c
===================================================================
RCS file: /cvs/cvsfiles/devo/gcc/combine.c,v
retrieving revision 1.193
diff -p -r1.193 combine.c
*** combine.c	1999/01/20 17:50:18	1.193
--- combine.c	1999/02/27 00:56:32
*************** static int sets_function_arg_p	PROTO((rt
*** 397,402 ****
--- 397,403 ----
  static int combinable_i3pat	PROTO((rtx, rtx *, rtx, rtx, int, rtx *));
  static rtx try_combine		PROTO((rtx, rtx, rtx));
  static void undo_all		PROTO((void));
+ static void undo_last		PROTO((void));
  static rtx *find_split_point	PROTO((rtx *, rtx));
  static rtx subst		PROTO((rtx, rtx, rtx, int, int));
  static rtx simplify_rtx		PROTO((rtx, enum machine_mode, int, int));
*************** combine_instructions (f, nregs)
*** 674,679 ****
--- 675,681 ----
    total_successes += combine_successes;
  
    nonzero_sign_valid = 0;
+   reg_last_set_value = 0;
  
    /* Make recognizer allow volatile MEMs again.  */
    init_recog ();
*************** undo_all ()
*** 2682,2687 ****
--- 2727,2758 ----
       affected.  */
    subst_prev_insn = NULL_RTX;
  }
+ 
+ /* Undo all the modifications recorded in undobuf after previous_undos.  */
+ 
+ static void
+ undo_last ()
+ {
+   struct undo *undo, *next;
+ 
+   for (undo = undobuf.undos; undo != undobuf.previous_undos; undo = next)
+     {
+       next = undo->next;
+       if (undo->is_int)
+ 	*undo->where.i = undo->old_contents.i;
+       else
+ 	*undo->where.r = undo->old_contents.r;
+ 
+       undo->next = undobuf.frees;
+       undobuf.frees = undo;
+     }
+ 
+   undobuf.undos = undobuf.previous_undos;
+ 
+   /* Clear this here, so that subsequent get_last_value calls are not
+      affected.  */
+   subst_prev_insn = NULL_RTX;
+ }
  
  /* Find the innermost point within the rtx at LOC, possibly LOC itself,
     where we have an arithmetic expression and return that point.  LOC will
*************** nonzero_bits (x, mode)
*** 7623,7628 ****
--- 7694,7703 ----
  	}
  #endif
  
+       /* If called from loop, the reg_last_* arrays are not set.  */
+       if (! reg_last_set_value)
+ 	return nonzero;
+ 
        /* If X is a register whose nonzero bits value is current, use it.
  	 Otherwise, if X is a register whose value we can find, use that
  	 value.  Otherwise, use the previously-computed global nonzero bits
*************** num_sign_bit_copies (x, mode)
*** 8018,8023 ****
--- 8093,8102 ----
  	return GET_MODE_BITSIZE (Pmode) - GET_MODE_BITSIZE (ptr_mode) + 1;
  #endif
  
+       /* If called from loop, the reg_last_* arrays are not set.  */
+       if (! reg_last_set_value)
+ 	return 1;
+ 
        if (reg_last_set_value[REGNO (x)] != 0
  	  && reg_last_set_mode[REGNO (x)] == mode
  	  && (REG_N_SETS (REGNO (x)) == 1
*************** get_last_value (x)
*** 10924,10930 ****
        && (value = get_last_value (SUBREG_REG (x))) != 0)
      return gen_lowpart_for_combine (GET_MODE (x), value);
  
!   if (GET_CODE (x) != REG)
      return 0;
  
    regno = REGNO (x);
--- 11003,11010 ----
        && (value = get_last_value (SUBREG_REG (x))) != 0)
      return gen_lowpart_for_combine (GET_MODE (x), value);
  
!   /* If called from loop, the reg_last_* arrays are not set.  */
!   if (GET_CODE (x) != REG || ! reg_last_set_value)
      return 0;
  
    regno = REGNO (x);
*************** insn_cuid (insn)
*** 12086,12091 ****
--- 12166,12251 ----
      abort ();
  
    return INSN_CUID (insn);
+ }
+ 
+ void
+ validate_subst_start ()
+ {
+   undobuf.undos = 0;
+ 
+   /* Save the current high-water-mark so we can free storage if we didn't
+      accept this set of combinations.  */
+   undobuf.storage = (char *) oballoc (0);
+ }
+ 
+ void
+ validate_subst_undo ()
+ {
+   undo_all ();
+ }
+ 
+ /* Replace FROM with to throughout in INSN, and make simplifications.
+    Return nonzero for success.  */
+ int
+ validate_subst (insn, from, to)
+      rtx insn, from, to;
+ {
+   rtx pat, new_pat;
+   int i;
+ 
+   pat = PATTERN (insn);
+ 
+   /* If from is not mentioned in PAT, we don't need to grind it through
+      subst.  This can safe some time, and also avoids suprious failures
+      when a simplification is not recognized as a valid insn.  */
+   if (! reg_mentioned_p (from, pat))
+     {
+       /* But we still have to make sure a REG_EQUAL note gets updated.  */
+       rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
+ 
+       if (note)
+ 	XEXP (note, 0) = subst (XEXP (note, 0), from, to, 0, 1);
+       return 1;
+     }
+ 
+   /* We have to set previous_undos to prevent gen_rtx_combine from re-using
+      some piece of shared rtl.  */
+   undobuf.previous_undos = undobuf.undos;
+ 
+   subst_insn = insn;
+ 
+   new_pat = subst (pat, from, to, 0, 1);
+ 
+   /* If PAT is a PARALLEL, check to see if it contains the CLOBBER
+      we use to indicate that something didn't match.  If we find such a
+      thing, force rejection.
+      This is the same test as in recog_for_combine; we can't use the that
+      function here because it tries to use data flow information.  */
+   if (GET_CODE (pat) == PARALLEL)
+     for (i = XVECLEN (pat, 0) - 1; i >= 0; i--)
+       if (GET_CODE (XVECEXP (pat, 0, i)) == CLOBBER
+ 	  && XEXP (XVECEXP (pat, 0, i), 0) == const0_rtx)
+     {
+       undo_last ();
+       return 0;
+     }
+ 
+   /* Change INSN to a nop so that validate_change is forced to re-recognize.  */
+   PATTERN (insn) = const0_rtx;
+   if (validate_change (insn, &PATTERN (insn), new_pat, 0))
+     {
+       rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
+ 
+       if (note)
+ 	XEXP (note, 0) = subst (XEXP (note, 0), from, to, 0, 1);
+       return 1;
+     }
+   else
+     {
+       PATTERN (insn) = pat;
+       undo_last ();
+       return 0;
+     }
  }
  
  void