This is the mail archive of the
mailing list for the GCC project.
Re: Tweak loop peeling limits
- From: Joern Rennecke <joern dot rennecke at superh dot com>
- To: rakdver at atrey dot karlin dot mff dot cuni dot cz (Zdenek Dvorak)
- Cc: joern dot rennecke at superh dot com (Joern Rennecke), jh at suse dot cz (Jan Hubicka), gcc-patches at gcc dot gnu dot org, rth at redhat dot com, zack at codesourcery dot com
- Date: Fri, 20 Feb 2004 15:07:26 +0000 (GMT)
- Subject: Re: Tweak loop peeling limits
> I don't know exactly why, but just peeling the loops showed up to
> be almost as effective as unrolling them in some tests on x86_64.
> I call it a magic :-).
I suppose that must have to do with peculiarities of the x86_64
microarchitecture - that it can do non-taken branches in otherwise
straight-line code quickly, but can't predict taken branches in a loop.
If that is the case, naive unrolling should also work fine for x86_64.
But for processors where the actual instruction count is still an issue,
there is no substitute for getting rid of some compares and branches.
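
To make the distinction concrete, here is a hand-written sketch in C of the
three shapes under discussion. It is only an illustration: the function
names, the peel count of 2 and the unroll factor of 4 are my own assumptions,
and this is not what the GCC loop optimizer actually emits. Peeling and naive
unrolling both keep one exit test per element, so they only change which
branches are taken; the last variant is the one that actually drops compares
and branches.

```c
#include <stddef.h>

/* Baseline: one compare and one taken backward branch per element. */
long sum_plain(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Peeled (hypothetical peel count of 2): the first two iterations are
   copied out in front of the loop.  Each copy keeps its exit test, so
   the compare count is unchanged, but short trip counts run as
   straight-line code whose branches are not taken until the exit. */
long sum_peeled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    if (i >= n) return s;
    s += a[i++];
    if (i >= n) return s;
    s += a[i++];
    for (; i < n; i++)
        s += a[i];
    return s;
}

/* Naive unrolling by 4: the body is replicated, but every copy still
   carries its own exit test.  Only the backward branch gets rarer. */
long sum_unrolled_naive(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    while (i < n) {
        s += a[i++];
        if (i >= n) break;
        s += a[i++];
        if (i >= n) break;
        s += a[i++];
        if (i >= n) break;
        s += a[i++];
    }
    return s;
}

/* Unrolling by 4 with the intermediate tests removed: one compare and
   branch per four elements, plus a remainder loop for the leftovers. */
long sum_unrolled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; n - i >= 4; i += 4) {   /* i <= n holds here, so n - i cannot wrap */
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

All four functions return the same sum for any n; they differ only in how
many compares and branches are executed per element and in which of those
branches are taken.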