Bug 93115 - gcc fails to emit inline function on llvm-roc project: -O1 -fPIC -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fvisibility-inlines-hidden
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa
Version: 9.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jan Hubicka
URL:
Keywords: link-failure, visibility, wrong-code
Depends on:
Blocks: 100424 visibility
Reported: 2020-01-01 22:12 UTC by Sergei Trofimovich
Modified: 2024-04-08 21:16 UTC
CC List: 5 users

See Also:
Host:
Target: x86_64-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-01-02 00:00:00


Attachments
bug.cpp (656 bytes, text/x-csrc)
2020-01-01 22:12 UTC, Sergei Trofimovich

Description Sergei Trofimovich 2020-01-01 22:12:43 UTC
Created attachment 47577
bug.cpp

The original build failure was found and diagnosed by Jan Ziak in https://bugs.gentoo.org/704252. There, gcc-9.2.0 fails to link an LLVM library because an inline function definition is missing.

It looks like the two main triggers are -fdevirtualize-speculatively and -fipa-cp-clone, which trick gcc into not emitting the inline function body. Only the presence of -fvisibility-inlines-hidden makes it noticeable.

I've managed to reproduce it on gcc-master and on gcc-9.2.0:

This works:
$ g++-9.2.0 -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fPIC -shared -o libbug.so bug.cpp -DLIB_FILE
$ g++-9.2.0 -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone               -o main      bug.cpp -DMAIN_FILE -L. -lbug

This fails:
$ g++-9.2.0 -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fvisibility-inlines-hidden -fPIC -shared -o libbug.so bug.cpp -DLIB_FILE
$ g++-9.2.0 -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fvisibility-inlines-hidden               -o main      bug.cpp -DMAIN_FILE -L. -lbug
/usr/bin/ld: /tmp/ccQYknNI.o: in function `p()':
bug.cpp:(.text+0x25): undefined reference to `m::av() const'
/usr/bin/ld: bug.cpp:(.text+0x4e): undefined reference to `m::av() const'
collect2: error: ld returned 1 exit status

Note: we build both the library libbug.so and the main executable to observe that main lacks a definition of the inlinable 'm::av()'.
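
For context, -fvisibility-inlines-hidden effectively gives inline member functions hidden visibility, so a shared library that emits a copy does not export it and the executable has to carry its own. Below is a minimal sketch of that effect, written out with an explicit attribute. The struct name `m_like`, the helper `call_av` and the return value are hypothetical and are not taken from bug.cpp.

```cpp
// Hypothetical stand-in for the class from bug.cpp; only the shape matters.
struct m_like {
  // Roughly what -fvisibility-inlines-hidden does implicitly: the inline
  // member gets hidden visibility, so a shared library that emits a copy
  // of it does not export that copy to other modules.
  __attribute__((visibility("hidden"))) int av() const { return 42; }
};

// An out-of-line call site. If the compiler emits a call here but drops
// the body of av() from this object, nothing else can supply the symbol
// and the link fails with an undefined reference.
int call_av(const m_like &obj) { return obj.av(); }
```

Without the flag the library's copy of the inline function is exported, which is presumably why the first pair of commands links: the missing body in main is resolved against libbug.so.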
Comment 1 Sergei Trofimovich 2020-01-01 22:16:20 UTC
I've built gcc-master as:

$ ./xg++ -v
Using built-in specs.
COLLECT_GCC=./xg++
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --enable-languages=c,c++ --disable-bootstrap --with-multilib-list=m64 --prefix=/home/slyfox/dev/git/gcc-clean/../gcc-native-quick-installed --disable-nls --without-isl --disable-libsanitizer --disable-libvtv --disable-libgomp --disable-libstdcxx-pch --disable-libunwind-exceptions CFLAGS='-O1 ' CXXFLAGS='-O1 ' --with-sysroot=/usr/x86_64-HEAD-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20200101 (experimental) (GCC)
Comment 2 Sergei Trofimovich 2020-01-01 22:18:21 UTC
bug.cpp is a trimmed down version of llvm-roc's codebase with creduce.
Comment 3 Martin Liška 2020-01-02 07:34:13 UTC
I can confirm it. If I see correctly, it started on trunk with r276416.
I can't reproduce it on the GCC 9 branch, nor with the GCC 9.2.0 release.
Comment 4 Jan Hubicka 2020-01-02 15:57:19 UTC
The problem here is that we produce an ipa-cp clone to devirtualize v::av, which also leads to devirtualization of m::av, but we miss this optimization. After inlining we remove m::av, and while producing the ipa-cp clone we devirtualize to it, which leads to an undefined reference.
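
As a source-level analogy only (GCC does this on its internal representation, not on C++ source), a speculatively devirtualized call can be pictured as a guarded direct call with a virtual fallback. The types `Base` and `Derived` and the helper `call_av_speculatively` are hypothetical and do not come from bug.cpp.

```cpp
#include <typeinfo>

// Hypothetical hierarchy; the names are illustrative only.
struct Base {
  virtual ~Base() = default;
  virtual int av() const { return 1; }
};

struct Derived : Base {
  int av() const override { return 2; }
};

// Hand-written picture of a speculatively devirtualized call:
// guess the dynamic type, call the guessed target directly,
// and fall back to the ordinary virtual call otherwise.
int call_av_speculatively(const Base &obj) {
  if (typeid(obj) == typeid(Derived)) {
    // Qualified call: direct, no vtable lookup. This is the kind of
    // reference that needs a definition of Derived::av at link time.
    return static_cast<const Derived &>(obj).Derived::av();
  }
  return obj.av();  // regular virtual dispatch
}
```

The direct call is the important part: once a pass introduces one, the target's body must still exist somewhere, which is the requirement that breaks when the body was already removed as unreachable.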

I am testing the following:
Index: ipa.c
===================================================================
--- ipa.c       (revision 279810)
+++ ipa.c       (working copy)
@@ -187,6 +187,7 @@ walk_polymorphic_call_targets (hash_set<
       for (i = 0; i < targets.length (); i++)
        {
          struct cgraph_node *n = targets[i];
+         bool added = false;
 
          /* Do not bother to mark virtual methods in anonymous namespace;
             either we will find use of virtual table defining it, or it is
@@ -212,11 +213,18 @@ walk_polymorphic_call_targets (hash_set<
                    && symtab->state < IPA_SSA_AFTER_INLINING)
                  reachable->add (body);
               reachable->add (n);
+              added = true;
             }
          /* Even after inlining we want to keep the possible targets in the
             boundary, so late passes can still produce direct call even if
-            the chance for inlining is lost.  */
-         enqueue_node (n, first, reachable);
+            the chance for inlining is lost.
+            Do not keep references to comdat groups - removing their definition
+            first and adding references later is going to give undefined
+            reference errors.  */
+         if (added || (!DECL_COMDAT (n->decl)
+                       || DECL_EXTERNAL (n->decl)
+                       || !TREE_PUBLIC (n->decl)))
+           enqueue_node (n, first, reachable);
        }
     }
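
The new condition in the patch can be read as a small predicate. The sketch below models it outside of GCC; `node_flags` and `keep_in_boundary` are hypothetical names, and the three booleans merely stand in for DECL_COMDAT, DECL_EXTERNAL and TREE_PUBLIC on the node's decl.

```cpp
// Hypothetical stand-ins for the decl flags tested in the patch.
struct node_flags {
  bool comdat;     // models DECL_COMDAT (n->decl)
  bool external;   // models DECL_EXTERNAL (n->decl)
  bool is_public;  // models TREE_PUBLIC (n->decl)
};

// Mirrors the patched condition:
//   added || (!DECL_COMDAT || DECL_EXTERNAL || !TREE_PUBLIC)
// i.e. a possible target stays in the boundary unless it is a public,
// locally defined comdat that was not just added as reachable.
bool keep_in_boundary(bool added, const node_flags &n) {
  return added || !n.comdat || n.external || !n.is_public;
}
```

Under this reading the only nodes no longer enqueued are public, non-external comdat functions that were not marked reachable in this walk, which matches the case described in the patch comment.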
Comment 5 Jan Hubicka 2020-01-02 16:30:23 UTC
OK, the missed optimization follows from the following:

1) ipa-cp creates a specialized node for o. It is called only once, from fn3.
   p calls the unspecialized o. I wonder why this happens, since both the call in p and the call in fn3 lead to devirtualization.
2) The inliner inlines k into o. This does not enable devirtualization because o is not specialized.
3) At the end of inlining, remove_unreachable_nodes removes the offline copy of m::av.
4) We inline o into p, enabling devirtualization, but it is too late.

Adding the inline keyword to p makes the inliner inline it early, but we still miss the devirtualization. So we have two issues:

a) For some reason ipa-cp rules out a reasonable specialization.
It is decided here:
Evaluating opportunities for void o(j&)/11.                                     
 - considering value &g for param #0 p1 (caller_count: 1)                       
     good_cloning_opportunity_p (time: 1, size: 36, freq_sum: 1000) -> evaluation: 27, threshold: 500
     good_cloning_opportunity_p (time: 199, size: 120, freq_sum: 1000) -> evaluation: 1658, threshold: 500
  Creating a specialized node of void o(j&)/11.                                      
     the new node is o.constprop/37.                                            
     known ctx 0 is     Outer type:struct j offset 0                            
 - considering value &e.D.2397 for param #0 p1 (caller_count: 1)                
     good_cloning_opportunity_p (time: 1, size: 36, freq_sum: 202) -> evaluation: 5, threshold: 500
     good_cloning_opportunity_p (time: 103, size: 120, freq_sum: 202) -> evaluation: 173, threshold: 500
I assume it is because freq_sum is 202 instead of 1000, since the call is conditional, but that is really way too strict... The 20% is the outcome of:

Predictions for bb 2                                                            
  DS theory heuristics: 20.24%                                                  
  combined heuristics: 20.24%                                                   
  call heuristics of edge 2->3: 33.00%                                          
  early return (on trees) heuristics of edge 2->3: 34.00%                       
Predictions for bb 3  
                                                          
which seems reasonable.
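
The figures in the two dumps can be reproduced by hand. The sketch below assumes that the two edge predictions are combined with the usual Dempster-Shafer rule and that the evaluation is time * freq_sum / size with integer truncation. Both formulas are inferred from the printed numbers, not copied from the GCC sources, and the function names are hypothetical.

```cpp
#include <cstdint>

// Dempster-Shafer combination of two independent probability estimates.
constexpr double combine_ds(double p1, double p2) {
  return (p1 * p2) / (p1 * p2 + (1.0 - p1) * (1.0 - p2));
}

// Evaluation that matches every line of the ipa-cp dump above
// (assumed formula, truncating integer division).
constexpr std::int64_t evaluation(std::int64_t time, std::int64_t size,
                                  std::int64_t freq_sum) {
  return time * freq_sum / size;
}

// Threshold printed in the dump. Using >= is an assumption; the dump
// only shows that 1658 passes and 173 does not.
constexpr std::int64_t kThreshold = 500;
constexpr bool clears_threshold(std::int64_t eval) { return eval >= kThreshold; }

// 33% (call heuristics) combined with 34% (early return) gives ~20.24%.
static_assert(combine_ds(0.33, 0.34) > 0.2023 && combine_ds(0.33, 0.34) < 0.2025, "");

// Value &g, freq_sum 1000: the second evaluation passes, so a clone is made.
static_assert(evaluation(1, 36, 1000) == 27, "");
static_assert(evaluation(199, 120, 1000) == 1658, "");
static_assert(clears_threshold(evaluation(199, 120, 1000)), "");

// Value &e.D.2397, freq_sum 202 (about 20.24% of 1000): rejected.
static_assert(evaluation(1, 36, 202) == 5, "");
static_assert(evaluation(103, 120, 202) == 173, "");
static_assert(!clears_threshold(evaluation(103, 120, 202)), "");
```

With these assumed formulas, the rejected candidate would need a freq_sum of roughly 583 or more to reach the threshold of 500 at time 103 and size 120, so a 20% branch probability falls well short.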

b) We do not devirtualize after inlining. We combine the contexts correctly:

Polymorphic call context combine:    Speculative outer type:struct j (or a derived type) at offset 0
With context:                        Outer type:struct m offset 0               
Updated as:                          Outer type:struct m offset 0 Speculative outer type:struct j (or a derived type) at offset 0

but I am not sure why it does not trigger devirtualization at this stage. We also do not need a speculative outer type when we know the outer type precisely, but that is a cosmetic issue.
Comment 6 Sergei Trofimovich 2020-10-03 09:12:41 UTC
Still happens with gcc-11.
Comment 7 Sergei Trofimovich 2021-02-08 13:30:50 UTC
Looks like the original test does not trigger the bug as is (probably due to test fragility). Here is something shorter (but with warnings):

```c++
struct a {
  char at;
  char au;
  int d() { return av() + au - at; }
  virtual void f() {}
  virtual int av() { }
};
struct g : a {
  void f();
  char b;
  char c() { return b; }
} b;
#ifdef MAIN_FILE
g e;
void h() {
  if (b.c()) {
    e.d();
    return;
  }
}
int main() {}
#endif
#ifdef LIB_FILE
void g::f() {}
#endif
```

$ g++-11.0.0 -Wall -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fvisibility-inlines-hidden -fPIC -shared -o libbug.so bug.cpp -DLIB_FILE
bug.cpp: In member function 'virtual int a::av()':
bug.cpp:6:22: warning: no return statement in function returning non-void [-Wreturn-type]
    6 |   virtual int av() { }
      |                      ^

$ g++-11.0.0 -Wall -fPIC -O1 -fdevirtualize -fdevirtualize-speculatively -fipa-cp -fipa-cp-clone -fvisibility-inlines-hidden -o main bug.cpp -DMAIN_FILE -L. -lbug
bug.cpp: In member function 'virtual int a::av()':
bug.cpp:6:22: warning: no return statement in function returning non-void [-Wreturn-type]
    6 |   virtual int av() { }
      |                      ^
/usr/lib/gcc/x86_64-pc-linux-gnu/11.0.0/../../../../x86_64-pc-linux-gnu/bin/ld: /tmp/ccbaHYef.o: in function `h()':
bug.cpp:(.text+0x1a): undefined reference to `a::av()'
collect2: error: ld returned 1 exit status
Comment 8 Sergei Trofimovich 2021-02-08 17:16:21 UTC
A slightly better example without warnings (I used `__builtin_trap();` to work around the use of an uninitialized value):

```c++
struct a {
  char ac1;
  char ac2;
  int d() { return av() + ac1 + ac2; }
  virtual void f() {}
  virtual int av() { __builtin_trap(); }

  a(){}
};

struct g : a {
  virtual void f2();

  g():a(){}
};

#ifdef LIB_FILE
void g::f2() {}
#endif
#ifdef MAIN_FILE
static g b;
static char bb;
char cc() { return bb; }

void h() {
  if (cc()) {
    b.d();
    return;
  }
}
int main() {}
#endif
```