This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Loop peeling
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Tejas Belagod <tejas dot belagod at arm dot com>
- Cc: Evandro Menezes <e dot menezes at samsung dot com>, Jan Hubicka <hubicka at ucw dot cz>, GCC Development <gcc at gcc dot gnu dot org>
- Date: Wed, 29 Oct 2014 12:57:58 +0100
- Subject: Re: Loop peeling
- Authentication-results: sourceware.org; auth=none
- References: <033101cff2c7$96bff550$c43fdff0$ at samsung dot com> <CAFiYyc23OhPmV3DJa9z62DB5jwR1JWKowqQDKRyCbrBiKai0xA at mail dot gmail dot com> <5450D521 dot 1060500 at arm dot com>
On Wed, Oct 29, 2014 at 12:53 PM, Tejas Belagod <tejas dot belagod at arm dot com> wrote:
> On 29/10/14 09:32, Richard Biener wrote:
>> On Tue, Oct 28, 2014 at 4:55 PM, Evandro Menezes <e dot menezes at samsung dot com> wrote:
>>> While doing some benchmark flag mining on AArch64, I noticed that
>>> -fpeel-loops was a mined option often. As a matter of fact, when using
>>> it always, even without FDO, it seemed to raise most benchmarks and to leave
>>> almost all of the rest flat, with a barely noticeable cost in code-size.
>>> It seems to me that it might be safe enough to be implied perhaps at -O3.
>>> Is there any reason why this never came into being?
> Loop peeling is done by default on AArch64 unless, IIRC,
> -fvect-cost-model=cheap is specified which switches it off. There was a
> general thread on loop peeling around the same time last year
> (https://gcc.gnu.org/ml/gcc/2013-11/msg00307.html) where Richard suggested
> that peeling vs. non-peeling should be factored into the vector cost model
> and is a more generic improvement.
Oh, you are talking about the vectorizer pro-/epilogue loops where we
know a (low) upper bound for the number of iterations. I think that
is enabled by default at -O3 as it is a "completely peeling" operation.
Only regular peeling which looks at the _estimated_ loop trip count
(peeling that number of times) is guarded by -fpeel-loops.
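
To illustrate the distinction, here is a hand-written sketch in C of what
peeling by an estimated trip count amounts to. It is not GCC's output: the
function names and the peel factor of 2 are assumptions made up for this
example, and the real pass picks the factor from its trip count estimate.

```c
#include <stddef.h>

/* Original loop: the trip count n is only known at run time. */
long sum_plain(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-peeled by two iterations, as if the estimated trip count were 2
   (an assumed value). The first two iterations become straight-line
   code; the residual loop stays behind so larger n is still handled. */
long sum_peeled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    if (i < n) { s += a[i]; i++; }   /* peeled copy 1 */
    if (i < n) { s += a[i]; i++; }   /* peeled copy 2 */

    for (; i < n; i++)               /* residual loop */
        s += a[i];
    return s;
}
```

Both functions return the same result for every n; peeling only changes the
code shape. When the bound is a known small constant, as in the vectorizer
pro-/epilogue case, the residual loop can be dropped entirely, which is
what makes that a complete peel.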
>> Not sure, but peeling is/was very stupid (peeling 8 times unconditionally
>> or not at all). At least without FDO (and with -fprofile-use it is
>> enabled by default). Similar case for -funroll-loops.
>> For GCC 5 peeling now moved to GIMPLE, so maybe things changed
>> for that (but I'd doubt that). Honza?