Bug 55477 - [devirt] trunk fails inline-devirt tests #2 and and #3 whereas they pass in google/4_7
Summary: [devirt] trunk fails inline-devirt tests #2 and and #3 whereas they pass in g...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.9.0
: P3 normal
Target Milestone: ---
Assignee: Jan Hubicka
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2012-11-26 22:04 UTC by Matt Hargett
Modified: 2014-01-28 01:11 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-11-27 00:00:00


Attachments
test case that fails in trunk but passes with google/gcc-4_7 (386 bytes, application/octet-stream)
2012-11-26 22:04 UTC, Matt Hargett
Details
diff against trunk adding new testcases (1.82 KB, application/octet-stream)
2013-02-14 21:28 UTC, Matt Hargett
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Matt Hargett 2012-11-26 22:04:16 UTC
Created attachment 28782 [details]
test case that fails in trunk but passes with google/gcc-4_7

When compiling Maxim's inline-devirt-2.C and inline-devirt-3.C test cases with both current trunk (r193828) and google/gcc-4_7, the Google branch correctly optimizes and trunk does not. Test cases are here:
http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02589.html

My trunk configure line:
$ ../gcc-trunk/configure --program-suffix=-4.8 --prefix=/u/mhargett --disable-bootstrap --enable-lto --with-fpmath=sse --disable-libmudflap --disable-libssp --enable-gold=yes --with-mpc=/u/mhargett --with-cloog=/u/mhargett/ --with-ppl=/u/mhargett/ --with-gmp=/u/mhargett/ --with-isl=/u/mhargett --with-mpfr=/u/mhargett/ --enable-cloog-backend=isl --enable-languages=c,c++,lto

$ g++-4.8 -O2 inline-devirt-2.C

trunk objdump for the second testcase:
000000000400630 <main>:
  400630:       48 83 ec 28             sub    rsp,0x28
  400634:       48 89 e7                mov    rdi,rsp
  400637:       48 c7 04 24 b0 09 40    mov    QWORD PTR [rsp],0x4009b0
  40063e:       00 
  40063f:       48 c7 44 24 10 f0 09    mov    QWORD PTR [rsp+0x10],0x4009f0
  400646:       40 00 
  400648:       e8 23 01 00 00          call   400770 <_ZL5printR10Calculable>
  40064d:       48 8d 7c 24 10          lea    rdi,[rsp+0x10]
  400652:       e8 19 01 00 00          call   400770 <_ZL5printR10Calculable>
  400657:       31 c0                   xor    eax,eax
  400659:       48 83 c4 28             add    rsp,0x28
  40065d:       c3                      ret    
  40065e:       66 90                   xchg   ax,ax

and for google/gcc-4_7:
0000000000400630 <main>:
  400630:       48 83 ec 08             sub    rsp,0x8
  400634:       be 01 00 00 00          mov    esi,0x1
  400639:       bf c8 08 40 00          mov    edi,0x4008c8
  40063e:       31 c0                   xor    eax,eax
  400640:       e8 ab ff ff ff          call   4005f0 <printf@plt>
  400645:       be 02 00 00 00          mov    esi,0x2
  40064a:       bf c4 08 40 00          mov    edi,0x4008c4
  40064f:       31 c0                   xor    eax,eax
  400651:       e8 9a ff ff ff          call   4005f0 <printf@plt>
  400656:       be 02 00 00 00          mov    esi,0x2
  40065b:       bf c8 08 40 00          mov    edi,0x4008c8
  400660:       31 c0                   xor    eax,eax
  400662:       e8 89 ff ff ff          call   4005f0 <printf@plt>
  400667:       be 03 00 00 00          mov    esi,0x3
  40066c:       bf c4 08 40 00          mov    edi,0x4008c4
  400671:       31 c0                   xor    eax,eax
  400673:       e8 78 ff ff ff          call   4005f0 <printf@plt>
  400678:       31 c0                   xor    eax,eax
  40067a:       48 83 c4 08             add    rsp,0x8
  40067e:       c3                      ret    
  40067f:       90                      nop
Comment 1 Jan Hubicka 2012-11-26 22:39:17 UTC
Martin,
can you take a look?
Matt, did you identify the patch on google branch enabling the devirtualization here?
Comment 2 Richard Biener 2012-11-27 10:19:18 UTC
I suppose pristine 4.7 branch fails to devirtualize as well, right?
Comment 3 Jan Hubicka 2012-11-27 11:08:45 UTC
Hmm, devirt-2 is also the case we do not inline because function is called once.
With main renamed to main2 I get at -O3

int main2() ()
{
<bb 2>:
  printf ("%d\n", 1);
  printf ("+1: %d\n", 2);
  printf ("%d\n", 2);
  printf ("+1: %d\n", 3);
  return 0;

}

from GCC-4.7 as well as from the mainline.
Similarly for devirt-3.C.  I will check why this is not matched by the heavy heuristic I invented for PR55478, but otherwise I think it is non-bug.
Also in devirt-3 we no longer unroll the loop of 3 printf calls, that is IMO fine.

We should add the updated testcases (with renamed main) to the testsuite IMO.

Honza
Comment 4 Matt Hargett 2012-11-27 17:37:01 UTC
I'll add a loop to the test that hopefully triggers the inlining (and does the unrolling).

Adding both variants (renamed main and with loop) to the test suite would be fantastic.

Thanks for the prompt attention!
Comment 5 Martin Jambor 2012-11-28 14:43:20 UTC
(In reply to comment #4)
> I'll add a loop to the test that hopefully triggers the inlining (and does the
> unrolling).

Unrolling?

> 
> Adding both variants (renamed main and with loop) to the test suite would be
> fantastic.

Does this mean you are going to do it or is it a request?  FWIW, I
think that having just one variant is sufficient.
Comment 6 Matt Hargett 2013-02-11 01:55:51 UTC
I just tested with latest trunk (4.8.0 20130210). inline-devirt-2.C does indeed pass when adding an outer loop, but only at -O3. That is probably fine, but I could have sworn it used to work with the outer loop with just -O2.

inline-devirt-3.C has seemingly regressed further since my last test, for some reason. I *have* to supply both -O3 *and* -funroll-loops now to see the correct number of inlined printf statements in main():

#include <stdio.h>

class Calculable
{
public:
        virtual unsigned char calculate() const = 0;
        virtual ~Calculable() {}
};

class X : public Calculable
{
public:
        virtual unsigned char calculate() const { return 0; }
};

class Y : public Calculable
{
public:
        virtual unsigned char calculate() const { return 3; }
};

static void print(const Calculable& c)
{
        for (int i = 0; i < c.calculate(); i++)
        {
                printf("%d\n", c.calculate());
        }
}

int main()
{
        X x;
        Y y;

        for (int i=3; i > 0; i--)
        {
                print(x);
                print(y);
        }

        return 0;
}
Comment 7 Matt Hargett 2013-02-14 21:28:33 UTC
Created attachment 29455 [details]
diff against trunk adding new testcases
Comment 8 Matt Hargett 2014-01-28 01:11:41 UTC
this is resolved, except for the test cases being added. it looks like there are variants that may cover the cases, but someone should verify the attached testcases don't exercise slightly different things.