This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



missed IPA/whopr optimization?



Hello all,


In the work I'm doing on my new book, I'm trying to show how modern compiler optimizations can eliminate a good deal of the overhead introduced by a modular/unit-testable design. In verifying some of my text, I found that GCC 4.4 and 4.5 (20091018, Ubuntu 9.10 package) aren't performing an optimization that I expected them to perform:

#include <stdio.h>

class Calculable
{
public:
        virtual unsigned char calculate() = 0;
};

class X : public Calculable
{
public:
        unsigned char calculate() { return 1; }
};

class Y : public Calculable
{
public:
        unsigned char calculate() { return 2; }
};

static void print(Calculable& c)
{
        printf("%d\n", c.calculate());
        printf("+1: %d\n", c.calculate() + 1);
}

int main()
{
        X x;
        Y y;

        print(x);
        print(y);

        return 0;
}

GCC 4.5 (and 4.4.1) generates approximately this code:

~/src $ /usr/lib/gcc-snapshot/bin/g++ -O3 -ftree-loop-ivcanon -fivopts -ftree-loop-im -fwhole-program -fipa-struct-reorg -fipa-matrix-reorg -fgcse-sm -fgcse-las -fgcse-after-reload --param max-gcse-memory=100000000 --param max-pending-list-length=100000 folding-test-interface.cpp -o folding-test-interface_gcc450_20091018-O3-kitchen-sink

~/src$ objdump -Mintel -S folding-test-interface_gcc450_20091018-O3-kitchen-sink | less -p \<main

0000000000400310 <main>:
400310: 53 push rbx
400311: 48 83 ec 20 sub rsp,0x20
400315: 48 8d 5c 24 10 lea rbx,[rsp+0x10]
40031a: 48 c7 44 24 10 c0 04 mov QWORD PTR [rsp+0x10],0x4004c0
400321: 40 00
400323: 48 c7 04 24 00 05 40 mov QWORD PTR [rsp],0x400500
40032a: 00
40032b: 48 89 df mov rdi,rbx
40032e: ff 15 8c 01 00 00 call QWORD PTR [rip+0x18c] # 4004c0 <_ZTV1X+0x10>
400334: bf ac 04 40 00 mov edi,0x4004ac
400339: 0f b6 f0 movzx esi,al
40033c: 31 c0 xor eax,eax
40033e: e8 a5 03 00 00 call 4006e8 <printf@plt>
400343: 48 8b 44 24 10 mov rax,QWORD PTR [rsp+0x10]
400348: 48 89 df mov rdi,rbx
40034b: ff 10 call QWORD PTR [rax]
40034d: 0f b6 f0 movzx esi,al
400350: bf a4 04 40 00 mov edi,0x4004a4
400355: 31 c0 xor eax,eax
400357: 83 c6 01 add esi,0x1
40035a: e8 89 03 00 00 call 4006e8 <printf@plt>
[...]


As seen here, GCC isn't folding or inlining the constants returned across the virtual function boundary, even though they are visible in the compilation unit and -O3 -fwhole-program is being used. (Note that I started with just those two flags and added the others in an attempt to induce the optimization I was hoping for.)

I was able to induce the optimization by removing a level of indirection in two ways: 1) by having two print() functions, one overloaded to accept X& and a second to accept Y&; and 2) by replacing the classes with single-level function pointers:
--
#include <stdio.h>


typedef unsigned char(*Calculable)(void);

unsigned char one() { return 1; }
unsigned char two() { return 2; }

static void print(Calculable calculate)
{
        printf("%d\n", calculate());
        printf("+1: %d\n", calculate() + 1);
}

int main()
{
        print(one);
        print(two);

        return 0;
}
--
For completeness, the code generated from the function-pointer example is optimized in the way I expect:
0000000000400390 <main>:
400390: 48 83 ec 08 sub rsp,0x8
400394: ba 01 00 00 00 mov edx,0x1
400399: be e4 04 40 00 mov esi,0x4004e4
40039e: bf 01 00 00 00 mov edi,0x1
4003a3: 31 c0 xor eax,eax
4003a5: e8 c6 02 00 00 call 400670 <__printf_chk@plt>
4003aa: ba 02 00 00 00 mov edx,0x2
4003af: be dc 04 40 00 mov esi,0x4004dc
4003b4: bf 01 00 00 00 mov edi,0x1
4003b9: 31 c0 xor eax,eax
4003bb: e8 b0 02 00 00 call 400670 <__printf_chk@plt>




Modifying this last example to include two function pointer indirections once again causes the optimization to be missed.

So, my questions are:
0) Am I missing some existing command-line parameter that would induce the optimization? (e.g. a bad connection between my chair and keyboard)
1) Is this a missed optimization bug, or is this a missing feature?
2) Either way, what are the steps to correct the issue?


Thanks in advance for insights and/or help!



PS: I would test with a newer 4.5.0 build, but I'm having trouble bootstrapping. Any help with that email (sent yesterday) would be appreciated as well.

--
tangled strands of DNA explain the way that I behave.
http://www.clock.org/~matt

