This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

[Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure

From: "solar-gcc at openwall dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Tue, 17 Feb 2015 03:11:11 +0000
Subject: [Bug tree-optimization/51017] GCC 4.6 performance regression (vs. 4.4/4.5), PRE increases register pressure
Auto-submitted: auto-generated
References: <bug-51017-4 at http dot gcc dot gnu dot org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51017

--- Comment #14 from Alexander Peslyak <solar-gcc at openwall dot com> ---
For completeness, here are the results for 4.7.x, 4.8.x, and 4.9.0:

4.7.0o - 2142K c/s, 29692 bytes, 1267 movaps, 465 movups
4.7.0h - 2823K c/s, 29692 bytes, 1732 movaps, 0 movups
4.7.4o - 2144K c/s, 29692 bytes, 1267 movaps, 465 movups
4.7.4h - 2827K c/s, 29692 bytes, 1732 movaps, 0 movups
4.8.0o - 1825K c/s, 27813 bytes, 1341 movaps, 721 movups
4.8.0h - 2792K c/s, 27813 bytes, 2062 movaps, 0 movups
4.8.4o - 1827K c/s, 27807 bytes, 1341 movaps, 721 movups
4.8.4h - 2786K c/s, 27807 bytes, 2062 movaps, 0 movups
4.9.0o - 1852K c/s, 28262 bytes, 1319 movaps, 721 movups
4.9.0h - 2685K c/s, 28262 bytes, 2040 movaps, 0 movups

4.8 produces the smallest code so far, but even with the aligned loads hack is
still 6% slower than 4.3.

All of these are with "-O2 -fomit-frame-pointer -Os -funroll-loops
-finline-functions", like similar results I had posted before.  Xeon E5420,
x86_64.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]