This is the mail archive of the
mailing list for the GCC project.
Re: 50% slowdown with LTO
On Mon, Aug 13, 2012 at 8:27 AM, <Paul_Koning@dell.com> wrote:
> I'm not sure what LTO is supposed to do -- the documentation is not exactly clear. But I assumed it should make things faster and/or smaller.
> So I tried using it on an application -- a processor emulator, CPU-intensive code, a lot of 64-bit integer arithmetic.
> Using a compile/assembler run on the emulated system as a benchmark, I compared the code on x86_64-linux, gcc 4.7.0, -O2 plain, -O2 -fprofile-use (after having done -fprofile-generate), and -O2 -fprofile-use -flto (using a separate set of profile data files from -fprofile-generate -flto).
> Results: profiling speeds things up about 8%, but LTO is 50% (!) slower than without.
> Any suggestions of what to look at for this?
LTO lets the compiler see all the code at once, enabling optimizations
like inlining function calls across different source files. Like any
optimization, there are cases where it will cause code to slow down
rather than speed up. A 50% slowdown is certainly unusual, and
suggests some systematic error.
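
As a rough illustration of the cross-file inlining mentioned above,
here is a minimal single-file sketch. Everything in it is hypothetical
and not taken from the emulator in question: the STEP macro, the
function names, and the multiplier constant are made up for the
example. The noinline attribute (a GCC extension) stands in for a
helper that lives in another source file and so cannot be inlined
without -flto; the static inline version stands in for what the
compiler is free to do once it can see the helper's body.

```c
#include <stdint.h>

/* Hypothetical helper body: a little 64-bit integer arithmetic.
 * Unsigned overflow wraps, so the result is well defined. */
#define STEP(acc, i) ((acc) * UINT64_C(6364136223846793005) + (uint64_t)(i))

/* Assumption: noinline mimics a helper defined in another source file
 * in a build without -flto, so every use remains a real call. */
__attribute__((noinline))
static uint64_t step_outlined(uint64_t acc, uint64_t i)
{
    return STEP(acc, i);
}

/* Same body, but the compiler is allowed to inline it, as it might
 * across files when -flto lets it see the definition. */
static inline uint64_t step_inlined(uint64_t acc, uint64_t i)
{
    return STEP(acc, i);
}

/* Hot loop that calls the helper n times through a real call. */
uint64_t run_outlined(uint64_t n)
{
    uint64_t acc = 1;
    for (uint64_t i = 0; i < n; i++)
        acc = step_outlined(acc, i);
    return acc;
}

/* Same loop, with the helper available for inlining. */
uint64_t run_inlined(uint64_t n)
{
    uint64_t acc = 1;
    for (uint64_t i = 0; i < n; i++)
        acc = step_inlined(acc, i);
    return acc;
}
```

Both loops compute the same value; only the call overhead and the
optimizations available around the call differ. Timing the two from a
small driver gives a feel for what inlining can buy, though it says
nothing about whether a particular inlining decision in a real
program helps or hurts.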
Figuring out what has gone wrong is like optimizing any program. Get
a profile for your program, e.g., by building with -pg and running
gprof. Build the program with and without -flto, run it, and compare
the resulting profiles. A 50% slowdown should be fairly obvious in
the comparison. I would guess that GCC has made a
poor inlining decision, but the profile should show the problem for