This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/38306] [4.4 Regression] 15% slowdown of computational kernel

From: "jv244 at cam dot ac dot uk" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: 30 Nov 2008 16:17:19 -0000
Subject: [Bug target/38306] [4.4 Regression] 15% slowdown of computational kernel
References: <bug-38306-6642@http.gcc.gnu.org/bugzilla/>
Reply-to: gcc-bugzilla at gcc dot gnu dot org


------- Comment #4 from jv244 at cam dot ac dot uk  2008-11-30 16:17 -------
(In reply to comment #2)
> Due to the high density of branches in the code this is easily a code layout
> and/or padding issue.  Different architectures have different constraints on
> their decoders and branch predictors related to branch density.  Core
> introduces other branch limitations for loops that engage the loop stream
> detector.
> We do not at all try to properly optimize (or even model) this apart
> from inserting nops.  YMMV with -fschedule-insns.

I'm not expert enough to understand this, but you have it right. However, it
remains a regression (on Opteron), with or without -fschedule-insns:

4.4:
-O3 -march=native -funroll-loops  -ffast-math                  ==> 5.064s
-O3 -march=native -funroll-loops  -ffast-math -fschedule-insns ==> 4.396s

4.3:
-O3 -march=native -funroll-loops  -ffast-math                  ==> 4.376s
-O3 -march=native -funroll-loops  -ffast-math -fschedule-insns ==> 3.372s

-fno-tree-reassoc has no effect.
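
For reference, the size of the regression implied by the timings above can be
worked out with a few lines of Python. This is only a sketch: the numbers are
copied from the measurements above, while the dictionary layout and the helper
name slowdown_percent are illustrative choices, not part of any GCC tooling.

```python
# Timings in seconds, copied from the measurements above.
# "base"  = -O3 -march=native -funroll-loops -ffast-math
# "sched" = the same flags plus -fschedule-insns
timings = {
    "4.4": {"base": 5.064, "sched": 4.396},
    "4.3": {"base": 4.376, "sched": 3.372},
}


def slowdown_percent(new, old):
    """Return how much slower `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Compare 4.4 against 4.3 for each flag set.
base_regression = slowdown_percent(timings["4.4"]["base"], timings["4.3"]["base"])
sched_regression = slowdown_percent(timings["4.4"]["sched"], timings["4.3"]["sched"])

print(f"without -fschedule-insns: {base_regression:.1f}% slower")
print(f"with    -fschedule-insns: {sched_regression:.1f}% slower")
```

With these numbers, 4.4 comes out roughly 16% slower than 4.3 without
-fschedule-insns and roughly 30% slower with it.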


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38306

