This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: 100x performance regression between gcc 3.4.5 and gcc 4.X


> 
> On 3/12/06, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> > > Yes, why is the benchmark not valid?
> >
> > It is valid.  We should understand why this behavior has changed so drastically.
> This benchmark may be useless, but it still exposes a weakness of gcc4. At
> least it's not news to me:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21195
> 
> That PR was closed when the gcc devs marked all those intrinsics
> as force_inline. That's also the kludge I use in my code. The real
> problem is that once you start marking some functions as force_inline,
> you upset the inlining heuristic even more, creating even more silly
> inlining misses; rinse, repeat.
> At the end of the day, everything is marked either force_inline or
> noinline and you'd be better off without a heuristic at all.

Actually, the best way of improving the inline heuristics is to get
a real testcase (and not some benchmark) where the inline heuristics
are messed up.  Now, SSE intrinsics are special in that they should
always be inlined, and that fact should be hidden from the user.  Maybe
they should be rewritten so that they are just like the altivec
intrinsics: a plain #define, nothing special to the user, and no
worrying about the inlining heuristic.  I should note that always
inline was added for the altivec intrinsics in the first place, and
they have since been rewritten.  Also, the kernel uses always inline,
but I and others feel that is a mistake.
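
As a rough sketch of the two styles contrasted above (an always-inline
wrapper function versus a plain #define), the following uses GCC's
generic vector extension instead of the real SSE or altivec headers.
The names v4sf, v4_add_fn, V4_ADD_MACRO and v4_demo are made up for
illustration and are not taken from GCC's intrinsic headers.

```c
#include <assert.h>

/* GCC/Clang generic vector type: four packed floats. This is an
   assumption of the sketch, standing in for a real SSE or altivec type. */
typedef float v4sf __attribute__((vector_size(16)));

/* Style 1: a wrapper function carrying the always_inline attribute.
   The compiler still sees a function, so the attribute is needed to
   take the decision away from the inlining heuristic. */
static inline __attribute__((always_inline))
v4sf v4_add_fn(v4sf a, v4sf b)
{
    return a + b;
}

/* Style 2: a plain macro. There is no call left for the inliner to
   consider, so the heuristic never gets involved. */
#define V4_ADD_MACRO(a, b) ((a) + (b))

/* Hypothetical demo: checks that both styles agree lane by lane and
   returns the first lane of the sum. */
static float v4_demo(void)
{
    v4sf a = { 1.0f, 2.0f, 3.0f, 4.0f };
    v4sf b = { 10.0f, 20.0f, 30.0f, 40.0f };
    v4sf f = v4_add_fn(a, b);
    v4sf m = V4_ADD_MACRO(a, b);

    for (int i = 0; i < 4; i++)
        assert(f[i] == m[i]);

    return f[0];
}
```

The macro form gives up argument type checking at the wrapper level and
evaluates each argument exactly as written, which is part of the
trade-off between the two styles.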

-- Pinski

