[Bug middle-end/29256] [4.2 regression] loop unrolling performance regression
rakdver at gcc dot gnu dot org
gcc-bugzilla@gcc.gnu.org
Thu Sep 28 11:34:00 GMT 2006
------- Comment #5 from rakdver at gcc dot gnu dot org 2006-09-28 11:34 -------
(In reply to comment #4)
> On x86_64 4.2 decides to unroll 9 times while on 4.1 it unrolls 8 times. This
> is a code-size regression, but other than that? The 4.2 version runs slightly
> faster than the 4.1 version, though the difference may be in the noise.
Choosing 9 instead of 8 looks weird, though :-). The reason is as follows:
jump threading in the vrp2 pass peels one iteration of the loop. With this change,
unrolling by a factor of 9 creates smaller code (only one extra iteration needs
to be peeled to make the number of iterations divisible by 9, while one would
need to peel 7 more iterations to make it divisible by 8).
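
To make the arithmetic concrete, here is a rough sketch in Python. It is not
GCC's actual unroll heuristic. The trip count of 56 is purely an assumption,
picked only because it reproduces the remainders mentioned above (1 for a
factor of 9, 7 for a factor of 8); the real count is whatever the test case in
the bug uses. The helper name extra_peel is likewise made up for illustration.

```python
def extra_peel(remaining_iterations: int, factor: int) -> int:
    """Iterations that must be peeled so the rest is divisible by `factor`."""
    return remaining_iterations % factor


# Hypothetical trip count, chosen only so the remainders match the comment.
ORIGINAL_ITERATIONS = 56

# Jump threading in vrp2 peels one iteration before the unroller runs.
after_threading = ORIGINAL_ITERATIONS - 1

for factor in (8, 9):
    peeled = extra_peel(after_threading, factor)
    print(f"unroll by {factor}: peel {peeled} more iteration(s)")
```

Under that assumption, 55 iterations are left after threading, so a factor of 9
leaves a remainder of 1 and a factor of 8 leaves a remainder of 7. Without the
peeled iteration, 56 would have been divisible by 8 with nothing left to peel.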
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256