This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

From: Albert Cohen <Albert dot Cohen at inria dot fr>
To: gcc at gcc dot gnu dot org
Date: Wed, 19 Aug 2009 13:53:22 +0200
Subject: complete_unrolli / complete_unroll
References: <E1MdhYW-0005WQ-00.aserg2004-list-ru@f143.mail.ru>

When debugging Graphite, we ran into code bloat issues due to
pass_complete_unrolli being called very early in the non-IPA
optimization sequence. The full-blown pass_complete_unroll is
scheduled much later, and that one does not do any harm.

Strangely, this early unrolling pass (tuned to only unroll inner loops)
is only enabled at -O3, independently of the -funroll-loops flag.

Does anyone remember why it is there, for which platform it is useful,
and what performance regressions we would see if we removed it?

My guess is that it can only do harm: disabling or reducing the
effectiveness of the (loop-level) vectorizer and increasing compilation
time.
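
To make the concern concrete, here is a minimal sketch of the kind of loop
nest in question. The function name scale_rows and the constant INNER are
hypothetical, chosen only for illustration. The inner loop has a small
constant trip count, so an early complete-unrolling pass restricted to
inner loops may flatten it before the loop-level vectorizer runs. Whether
that actually hurts depends on the GCC version and the target; this is an
illustration of the question, not a measured regression.

```c
#include <stddef.h>

/* Hypothetical kernel, for illustration only. */
enum { INNER = 4 };  /* small constant trip count of the inner loop */

void scale_rows(float *dst, const float *src, size_t rows, float k)
{
    for (size_t i = 0; i < rows; i++) {
        /* Candidate for early complete unrolling: once this loop is
           replaced by INNER straight-line statements, the loop-level
           vectorizer no longer sees an inner loop here. */
        for (size_t j = 0; j < INNER; j++)
            dst[i * INNER + j] = src[i * INNER + j] * k;
    }
}
```

Compiling this with -O3 -fdump-tree-all and comparing the dumps against a
build at -O2 -ftree-vectorize is one way to see what each pass did to the
nest.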

Thanks,
Albert

PS: Once this question is settled, it would also be interesting to start a
serious discussion on how to make pass ordering and pass parameterization
more flexibly customizable per target. Grigori Fursin's work shows the
strong benefits and already provides a working prototype. This question
is independent of whether the customization is done by experts or by
machine-learning/statistical techniques.

Follow-Ups:
- Re: complete_unrolli / complete_unroll
  - From: Richard Guenther

References:
- mips64 gcc 3.3.6 problem
  - From: Sergey Anosov
