This is the mail archive of the
mailing list for the GCC project.
Re: 100x performance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Paolo Bonzini <email@example.com> wrote:
> Wait wait. PR/21195 is about inlining
> the SSE builtins.
No. PR/21195 was really about the inline heuristic going ballistic.
Those intrinsics are thin wrappers around builtins, and ultimately
resolve to a couple of operations. Typical C++ (accessors/ctors) also
presents lots of such small functions.
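
To illustrate the kind of code meant here, a minimal sketch (the Vec2 type and its functions are hypothetical, not taken from the code discussed in this thread). Each function is a thin wrapper that should boil down to a couple of arithmetic operations once inlined:

```cpp
// Hypothetical example: small ctor/accessor functions typical of C++.
// None of these names come from the original thread.
struct Vec2 {
    float x_, y_;

    Vec2(float x, float y) : x_(x), y_(y) {}  // trivial ctor
    float x() const { return x_; }            // trivial accessors
    float y() const { return y_; }
};

// Thin wrapper: two additions once the ctor and accessors are inlined.
inline Vec2 add(const Vec2& a, const Vec2& b) {
    return Vec2(a.x() + b.x(), a.y() + b.y());
}

// Thin wrapper: two multiplications and one addition once inlined.
inline float dot(const Vec2& a, const Vec2& b) {
    return a.x() * b.x() + a.y() * b.y();
}
```

If the inliner declines to inline any layer of such wrappers, each of those few operations ends up behind a real function call.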
And guess what: same cause, same symptom.
There's no sensible metric by which the code i've quoted in my previous mail
makes sense. Size? Nope. Execution time? Certainly not.
Again, whether or not SSE ops are involved was and still is irrelevant.
> Your case seems to be different, because it involves inlining user
> routines. Again, you need to give us the preprocessed source code for
> us to look at your bug effectively.
Thanks for the tip, but i'll pass. I've done my duty already.
Months ago there were two options for fixing PR/21195:
a) Fix the inlining heuristic.
b) Kludge all intrinsics with always_inline.
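
For reference, option b) amounts to something like the following sketch. It uses GCC's always_inline and noinline function attributes; the macro and function names are made up for illustration and are not from any real header:

```cpp
// GCC-specific attribute syntax. FORCE_INLINE, NO_INLINE and the
// functions below are hypothetical names used only for this sketch.
#define FORCE_INLINE inline __attribute__((always_inline))
#define NO_INLINE __attribute__((noinline))

// Hot, tiny helper: force inlining regardless of the heuristic.
FORCE_INLINE float madd(float a, float b, float c) {
    return a * b + c;
}

// Cold path: keep it out of line so it does not bloat the caller.
NO_INLINE float handle_empty() {
    return 0.0f;
}

float sum_scaled(const float* data, int n, float scale) {
    if (n <= 0)
        return handle_empty();
    float acc = 0.0f;
    for (int i = 0; i < n; ++i)
        acc = madd(data[i], scale, acc);  // expected to inline into the loop
    return acc;
}
```

This overrides the heuristic function by function; it does nothing for code that has not been tagged.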
I've tried to argue a bit but to no avail. So, while you remain
convinced everything's fine with the inliner, i'll keep tagging every
function in my code with always_inline/noinline where performance