This is the mail archive of the
mailing list for the GCC project.
Re: [RFC] Fix PR19401: always completely peel loops
> > > This minimal patch unconditionally enables complete loop peeling
> > > at the tree level.
> > I do not see any numbers backing up this change. Ideally, I would like to
> > see how this affects the bootstrap time, how it affects the SPEC scores, how it
> > affects compile/run time on POOMA. Also, it should be tested at -Os too, like
> > on CSiBE (or similar), since it should not make GCC regress when optimizing for
> > size.
> Well, it's motivated two-ways - for one, we currently generate very
> much worse code for std::pow(x, 2) than for std::pow(x, 2.0) if you
> do not specify -funroll-loops -- for another, in gcc 3.4 I could get
> the loops from PR19401 unrolled by specifying -fpeel-loops without
> being hurt by the side-effects of -funroll-loops. For the tree peeler, I
> cannot do this, as its use is guarded by flag_unroll_loops, not
> flag_peel_loops (changing that would be an alternative patch I'd propose
> if complete unrolling cannot be enabled by default).
I am not persuaded that enabling complete loop unrolling unconditionally
as done by this patch is a good idea.
I would prefer:
1) Unrolling loops completely whenever this does not cause code growth.
2) Completely unrolling all possible loops at -O3.
3) Possibly having a separate flag controlling complete loop unrolling.
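
For readers following the std::pow(x, 2) point, here is a rough sketch of what
complete peeling means for a loop whose trip count is a small constant. The
names pow_loop, pow2_peeled and same_result are hypothetical and only for
illustration; this is not the libstdc++ implementation of std::pow, nor the
code from PR19401.

```cpp
// Hypothetical helper: integer power by repeated multiplication.
// When n is a compile-time constant (e.g. 2), the trip count is known.
static double pow_loop(double x, int n) {
    double r = 1.0;
    for (int i = 0; i < n; ++i)
        r *= x;
    return r;
}

// Roughly what complete peeling of pow_loop(x, 2) amounts to:
// both iterations are copied out and the loop control disappears,
// leaving straight-line code that later passes can simplify further.
static double pow2_peeled(double x) {
    double r = 1.0;
    r *= x;  // iteration 0
    r *= x;  // iteration 1
    return r;
}

// The peeled form computes the same value as the loop form for n == 2.
static bool same_result(double x) {
    return pow_loop(x, 2) == pow2_peeled(x);
}
```

Whether the peeled form is actually smaller or faster depends on the loop body
and the trip count, which is why the size and timing measurements asked for
earlier in the thread matter.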