This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Improve unrolled size estimates


> 
> Hum.  I find 1/3 future optimization a bit gross, but read on...

Well, me too, but it is only constant that works with current testsuite
:)
> > +      /* First make fast look if we see constant array inside.  */
> > +      while (handled_component_p (base) || TREE_CODE (base) == ARRAY_REF)
> 
> ARRAY_REF is in handled_component_p.
Even for non-constant offsets?
> 
> > +	base = TREE_OPERAND (base, 0);
> > +      if ((DECL_P (base) && TREE_CONSTANT (base) && DECL_INITIAL (base))
> > +	  || TREE_CODE (base) == STRING_CST)
> > +	{
> > +	  /* If so, see if we understand all the indices.  */
> > +	  base = op;
> > +	  while (handled_component_p (base)
> > +		 || (TREE_CODE (base) == ARRAY_REF
> > +		     && constant_after_peeling (TREE_OPERAND (base, 1), stmt, loop)))
> > +	    base = TREE_OPERAND (base, 0);
> > +	  return (DECL_P (base) || TREE_CODE (base) == STRING_CST);
> > +	}
> > +      return false;
> > +    }
> 
> Is it worth handling this very special case?  Note that you are 
> recursively asking for stuff again, so this may become quadratic, 
> especially as simple_iv isn't exactly very cheap.

Well, I visit every occurence of SSA_NAME once.  I added it because
there was couple testcases in testsuite that relied on loop reading from
constant array/string to be fully unrolled.

The problem I run into here was that with my fix to estimate_num_insns
those loads/stores are no longer estimated to cost of 0 (they are not
:).  In this we prevent unrolling of loop body then.  I can also simply
ignore the testcases and add PR as discussed earlier.
> 
> So you are basically doing constant propagation on the loop body with
> initial values from the entry edge.  Why not use CCP for this if you
> go through all this hassle?
> 
> IMHO the right thing to do is to enhance the SSA propagator to be
> able to work on SESE or SEME regions.  You should be able to seed
> SSA names before iterating (in this case with the entry edge values)
> and not call substitute_and_fold.

Hmm, I am doing very stupid cprop, I estimate size of the last iteration
based on dominance of exit edge and I ignore exit condition.  The later
actually makes more difference in practice, the cprop is primarily there
to not count increment of induction variable(s).

Feeding in CCP here seems to me like quite some work, but in addition to
that also noticeably more expensive - if I add interface to pass bitmap
representing loop body, still we will need to build lattices for all SSA
name in function and walk whole body to discover dominator order etc.

One option would be also to drop the cprop bits from patch, it will make
2 or 3 more testsuite failures with the cost patch as long as I recall.
Honza
> 
> If we want to go this way making heuristics fancy.
> 
> Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]