This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug middle-end/21628] GCC much slower than ICL. Lack of inlining?
- From: "rguenth at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 16 Nov 2007 18:00:32 -0000
- Subject: [Bug middle-end/21628] GCC much slower than ICL. Lack of inlining?
- References: <bug-21628-3458@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #3 from rguenth at gcc dot gnu dot org 2007-11-16 18:00 -------
Note that to inline kernels completely you can put
__attribute__((flatten))
on the *calling* function. With expression templates that is usually the
function containing the loops, like
void __attribute__((flatten)) doit()
{
  for (;;)
    lots_of_calls_to_inline ();
}
and it will make sure to inline all calls made in doit (recursively, so no
calls will be left in the final version). Also, starting with GCC 4.2 (and
much improved on trunk, which will become 4.3), using profile feedback will
improve inlining performance a lot (use -fprofile-generate, run, -fprofile-use).
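
For illustration, a slightly fuller sketch of the flatten idea might look like
the following. The names add_scaled and axpy, the file name kernel.cpp, and the
exact build lines in the comments are assumptions made for this example and are
not taken from the bug report. Note also that only calls whose bodies are
visible to the compiler can actually be inlined.

```cpp
// Sketch only: add_scaled, axpy and kernel.cpp are hypothetical names.
#include <cassert>
#include <cstddef>
#include <vector>

// Small helper standing in for one of the many tiny functions that
// expression-template code produces.
static double add_scaled(double a, double x, double y)
{
    return a * x + y;
}

// flatten goes on the *calling* function, the one that contains the loop.
// GCC then tries to inline every call made in its body, recursively,
// as long as the callee's definition is visible.
__attribute__((flatten))
void axpy(double a, const std::vector<double>& x, std::vector<double>& y)
{
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = add_scaled(a, x[i], y[i]);
}

// Assumed profile-feedback workflow (three steps, as described above):
//   g++ -O2 -fprofile-generate kernel.cpp -o kernel   # instrumented build
//   ./kernel                                          # training run
//   g++ -O2 -fprofile-use kernel.cpp -o kernel        # rebuild using the profile
```

Whether the helper really disappears can be checked by looking at the generated
assembly (for example with -S) for remaining call instructions in axpy.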
I'll close this bug as WORKSFORME, as it doesn't have a useful testcase, and
in my experience with tramp3d-v4 the performance of ICC sucks compared to
GCC because ICC inlines too little ;)
--
rguenth at gcc dot gnu dot org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu dot org
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |WORKSFORME
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21628