This is the mail archive of the
mailing list for the GCC project.
Re: 100x performance regression between gcc 3.4.5 and gcc 4.X
On 3/13/06, Andrew Pinski <email@example.com> wrote:
Ah, you mean a brand new testcase because PR-21195 wasn't good enough?
Actually the best way of improving the inline heuristics is to get
a real testcase (and not some benchmark) where the inline heuristics
is messed up.
Wait wait. PR/21195 is about inlining
the SSE builtins. These are special because, for example, you probably
would prefer GDB to not step into them, but just execute them. As
Andrew said, it is only an implementation choice (subject to revision)
that they are implemented as inline functions at all. For example, if
an older GCC had a similar bug with Altivec intrinsics, it would have
shown up only in C++ (because Altivec intrinsics were never implemented
as inlines in C) and would not show up anymore in GCC 4.1 except for a
handful of intrinsics (because most Altivec intrinsics are not inlines
at all anymore).
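
To illustrate what "implemented as inline functions" means here, a minimal sketch follows. It is not taken from GCC's headers: my_popcount, MY_POPCOUNT and count_bits_in_pair are hypothetical names, and __builtin_popcount merely stands in for a target-specific builtin. The macro form is shown only for contrast, as one way an intrinsic can avoid being a function at all.

```c
/* Hypothetical sketch, not GCC's actual intrinsic headers.
   An "intrinsic" written as an always-inline wrapper around a builtin:
   the inliner has to fold it into every caller, and a debugger may
   treat it as a function that can be stepped into. */
static inline __attribute__((__always_inline__)) int
my_popcount(unsigned int x)
{
    return __builtin_popcount(x);
}

/* The same operation as a macro (for contrast only): there is no
   function body here, so the inline heuristics are never consulted. */
#define MY_POPCOUNT(x) __builtin_popcount(x)

/* A caller that uses both forms. */
int count_bits_in_pair(unsigned int a, unsigned int b)
{
    return my_popcount(a) + MY_POPCOUNT(b);
}
```

With the wrapper form, a bug in the inliner can leave a real call behind; with the macro form it cannot, which is why the two kinds of intrinsics can behave differently across GCC versions.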
memset/memcpy is different from SSE builtins because the choice of
whether to inline or not is target dependent, and because glibc also
decides whether or not to provide its own inlining, depending on the GCC
version you're using. So the best way to report the problem is to file
a *preprocessed* testcase into Bugzilla (i.e. the output of "gcc -E
testcase.c > testcase.i" or equivalently "gcc -save-temps testcase.c"),
and to include the output of
    gcc -v testcase.c -O2
as part of the bug report. Using preprocessed source code at least makes sure
that the glibc choices are not influencing the comparison between 3.4.x
and 4.0.x. This information is present in the "how to file a bug"
chapter of the manual.
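
As a rough sketch of the kind of reduced testcase meant here, something like the following could be saved as testcase.c and run through "gcc -E" as described above. The struct and function names are made up for illustration, and this is not the code from the original report. Because it includes <string.h>, the preprocessed testcase.i records exactly which memset/memcpy definitions glibc supplied on the reporter's machine.

```c
#include <string.h>

/* Hypothetical reduced testcase: the names below are illustrative only. */
struct packet {
    unsigned char payload[64];
};

/* Clears the destination, copies the payload, and sums the bytes so the
   copy cannot be optimized away entirely. Whether memset and memcpy are
   expanded inline depends on the target, the GCC version and the glibc
   headers, which is what the preprocessed source pins down. */
unsigned int copy_and_sum(struct packet *dst, const struct packet *src)
{
    unsigned int sum = 0;
    size_t i;

    memset(dst, 0, sizeof *dst);
    memcpy(dst->payload, src->payload, sizeof dst->payload);

    for (i = 0; i < sizeof dst->payload; i++)
        sum += dst->payload[i];
    return sum;
}
```

The file deliberately has no main, so it is suited to preprocessing or compile-only runs rather than linking.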
Your case seems to be different, because it involves inlining user
routines. Again, you need to give us the preprocessed source code for
us to look at your bug effectively.