moving constants out of loops

Jeffrey A Law law@cygnus.com
Tue Jun 23 01:32:00 GMT 1998


First, note that some of the giv/invariant issues you're raising here
are already handled by egcs.  Furthermore, we found that exposing more
givs and invariants wasn't always a win.  In fact, it was a huge loss
on the PA for a while (see the alias discussion below).

You might want to try looking at simplify_giv_expr in egcs before
spending too much more time on this code :-) :-)

  > -	    tem = 0;
  > -	    if (CONSTANT_P (arg0) && GET_CODE (arg1) == CONST_INT)
  > +	    if (GET_CODE (arg1) == CONST_INT)
  >  	      {
  > +		int size = GET_MODE_SIZE (GET_MODE (arg0));
  > +		
  > +		if (!CONSTANT_P (arg0))
  > +		  return 0;
  >  		tem = plus_constant (arg0, INTVAL (arg1));
  >  		if (GET_CODE (tem) != CONST_INT)
  >  		  tem = gen_rtx (USE, mode, tem);
  > +	        return tem;
  >  	      }
  >  
  > -	    return tem;
  > +	    /* Both invariant. Generate invariant and try to move this
  > +	       out of the loop. */
  > +	    if (GET_CODE (arg1) == USE)
  > +	      arg1 = XEXP (arg1, 0);
  > +	    return gen_rtx (USE, mode, gen_rtx (PLUS, mode, arg0, arg1));

egcs already does something very similar to this code.  I think egcs
actually handles one case your code doesn't:

  (plus (invariant) (const_int))

Seems to me like you "return 0" for that case.  egcs will wrap that
into (use (plus (invariant) (const_int))), which is then available as
an invariant itself in other potential giv/invariant expressions.
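
To make the wrapping concrete, here is a minimal sketch on a toy
expression tree.  Everything in it (toy_rtx, toy_make, toy_simplify_plus)
is a hypothetical stand-in, not GCC's or egcs's actual RTL API, and
stripping the inner USE before re-wrapping is my assumption:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; these are NOT GCC's real definitions.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_USE };

struct toy_rtx
{
  enum toy_code code;
  long value;                 /* constant value or register number */
  struct toy_rtx *op0, *op1;  /* operands, NULL when unused */
};

/* Small fixed pool so the sketch needs no dynamic allocation.  */
static struct toy_rtx toy_pool[32];
static size_t toy_used;

static struct toy_rtx *
toy_make (enum toy_code code, long value,
          struct toy_rtx *op0, struct toy_rtx *op1)
{
  struct toy_rtx *x;

  assert (toy_used < sizeof toy_pool / sizeof toy_pool[0]);
  x = &toy_pool[toy_used++];
  x->code = code;
  x->value = value;
  x->op0 = op0;
  x->op1 = op1;
  return x;
}

/* An invariant is modelled as (use X).  For invariant + const_int,
   strip the USE and wrap the sum in a fresh USE so the result is
   itself usable as an invariant.  Anything else returns NULL, which
   plays the role of "return 0" (not handled) in this toy.  */
static struct toy_rtx *
toy_simplify_plus (struct toy_rtx *arg0, struct toy_rtx *arg1)
{
  if (arg0->code == TOY_USE && arg1->code == TOY_CONST_INT)
    return toy_make (TOY_USE, 0,
                     toy_make (TOY_PLUS, 0, arg0->op0, arg1), NULL);
  return NULL;
}
```

Passing (use (reg 5)) and (const_int 4) to toy_simplify_plus yields
(use (plus (reg 5) (const_int 4))) rather than NULL.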

I also don't see how you're using "size".  If you don't use the
variable there's no sense in setting it :-)



  >  	  case REG:
  >  	  case MULT:
  > @@ -5386,8 +5394,12 @@ simplify_giv_expr (x, benefit)
  >  	  return GEN_INT (INTVAL (arg0) * INTVAL (arg1));
  >  
  >  	case USE:
  > -	  /* invar * invar.  Not giv.  */
  > -	  return 0;
  > +	  /* Both invariant. Generate invariant and try to move this
  > +	     out of the loop. */
  > +	  if (GET_CODE (arg1) == USE)
  > +	    arg1 = XEXP (arg1, 0);
  > +	  return gen_rtx (USE, mode,
  > +			  gen_rtx (MULT, mode, XEXP (arg0, 0), arg1));

I think you need to make sure to strip any USE off both arg0 and arg1
since both arg0 and arg1 could be invariant registers/expressions
(at least that's the case for egcs).

Oh wait, I guess you're catching that by stripping it off arg0 in
the gen_rtx (MULT (... )).

I think something like this looks better:

          /* invar * invar must be an invariant.  */
          if (GET_CODE (arg0) == USE)
            arg0 = XEXP (arg0, 0);
          if (GET_CODE (arg1) == USE)
            arg1 = XEXP (arg1, 0);

          return gen_rtx_USE (mode, gen_rtx_MULT (mode, arg0, arg1));

Interestingly enough, adding that code makes the loop one instruction
longer on my hppa, but according to the scheduler we can actually schedule
things better (enough to make it two cycles faster).  Take that with
a grain of salt :-)

Three issues you need to be aware of when exposing more givs and
loop invariants:

	* Register pressure.  It's kind of neat to see gcc finding
	all the givs and invariants until you start looking at
	regressions and realize that by finding all the givs/invariants
	the register pressure in the loop has skyrocketed and blown
	out the entire register file on your risc machine.  Just
	imagine what it can do on an x86 :(  egcs has punted this
	issue temporarily, but the giv/invariant code needs to be
	more register pressure aware in the future.

	* loop/cse2 will not clean up the loop pre-header very well.
	Basically it'll leave lots of redundant operations in the
	pre-header because they weren't emitted in such a way as to
	expose the common subexpressions to cse2.  This may be addressed
	by some changes rth is working on for egcs.  While these are
	outside the loop, cleaning them up can net you another percent 
	or two in some cases.

	* Exposing more complicated givs really confuses the alias
	code, particularly if you've got a machine which does
	scheduling *and* has autoincrement address modes (PA).  We
	were actually seeing some pretty significant regressions
	on the PA because of the alias issues.  

	We (I) beat on the alias code in egcs to address these
	issues.  I doubt the existing alias code in gcc2 is up to
	the task.  So you might actually be hurting performance on
	some platforms.
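
A source-level analogy for the register pressure point, as a rough
sketch only.  The function names and the particular expressions are
made up for illustration; the real effect happens on RTL inside loop.c
and depends on the target:

```c
#include <assert.h>
#include <stddef.h>

/* Invariants recomputed inside the loop: only a and b have to stay
   live across iterations (besides v, n, i and sum).  */
static unsigned long
sum_inline (const unsigned long *v, size_t n,
            unsigned long a, unsigned long b)
{
  unsigned long sum = 0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += v[i] * (a * b) + i * (a + b) + ((v[i] ^ (a ^ b)) - (a - b));
  return sum;
}

/* Same computation with every invariant hoisted by hand.  Four
   temporaries are now live across the whole loop instead of two
   inputs, which is the kind of growth that can exhaust the register
   file once many givs/invariants are exposed.  */
static unsigned long
sum_hoisted (const unsigned long *v, size_t n,
             unsigned long a, unsigned long b)
{
  const unsigned long t0 = a * b;
  const unsigned long t1 = a + b;
  const unsigned long t2 = a ^ b;
  const unsigned long t3 = a - b;
  unsigned long sum = 0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += v[i] * t0 + i * t1 + ((v[i] ^ t2) - t3);
  return sum;
}
```

Both functions compute the same value; the only difference is how
many values have to be kept around for the duration of the loop.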

The expr.c changes seem pretty reasonable.  There's a similar hunk
of code in store_constructor that you might want to fix, though
I doubt it matters as much as the two cases you did address.

It made no difference on my PA.  I'm not enough of an x86 expert to
know if the new code is better than the old code or not :-)

jeff


