[PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]
Jakub Jelinek
jakub@redhat.com
Wed Apr 17 12:32:35 GMT 2024
On Wed, Apr 17, 2024 at 11:04:26AM +0200, Jan Hubicka wrote:
> > The reason the optimization doesn't trigger when the constructor is
> > constexpr is that expand_or_defer_fn is called in that case much earlier
> > than when it is not constexpr; in the former case it is called when we
> > try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN
> > is false in that case and comdat_linkage hasn't been called either
> > (I think it is desirable, because comdat group is stored in the cgraph
> > node and am not sure it is a good idea to create cgraph nodes for
> > something that might not be needed later on at all), so maybe_clone_body
> > clones the bodies, but doesn't make them as aliases.
>
> Thanks for working this out! Creating cgraph node with no body is
> harmless as it will be removed later.
That is actually for functions with bodies, maybe_clone_body creates the
bodies for those, but still when it happens early, the cdtors have
tentative_decl_linkage linkage, which in many cases means DECL_EXTERNAL,
DECL_NOT_REALLY_EXTERN (C++ FE special thing), DECL_DEFER_OUTPUT etc.
> > + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
> > + n1 = cgraph_node::get (fns[1]);
> > + n0->set_comdat_group (comdat_group);
> > + if (n1->same_comdat_group)
> > + n1->remove_from_same_comdat_group ();
> > + n1->add_to_same_comdat_group (n0);
> > + if (fns[2])
> > + n2->add_to_same_comdat_group (n0);
> > + import_export_decl (fns[1]);
>
> So this is handling the case where symbol was already inserted into one
> comdat group and later we need to move it into the C5 group? As long as
> we move everythingf rom old comdat group, this should be fine.
The above is pretty much an adjusted copy of what maybe_clone_body does,
except it doesn't call cgraph_node::get{,_create} all the time and uses
import_export_decl rather than expand_or_defer_fn{,_1}.
> > + /* Remove the body now that it is an alias. */
> > + DECL_SAVED_TREE (fns[1]) = NULL_TREE;
> Maybe using release_function_body since it also knows how to remove
> DECL_STRUCT_FUNCTION that exists at this stage?
Guess I could try that, clearing of DECL_SAVED_TREE was what was done
in maybe_clone_body too.
> I was thinking how to solve the problem on cgrpah side. We definitely
> have long lasting bug where aliases are handled incorrectly for which I
> made WIP patch (but probably would like to hold it after release branch is
> created). When foo has alias bar and foo is praviled then the alias
> target is prevailed too. This is what causes the ICE about cross comdat
> section alias. However fixing this is not enough as I do not think we
> can handle incremental linking correctly (as discussed briefly on IRC
> technically we should keep both sections but that would require two
> symbols of same name in single .o file).
>
> With the aliasing fixed we turn the other symbol into static but keep
> alias, so we end up with one comdat group having the non-aliased
> constructor and second comdat group (C5) exporting only the alias, which
> is not quite reight either.
I've tried to see what actually happens during linking without LTO, so compiled
pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk
(so it has those 2 separate comdats, one for C2 and one for C1), though I've
changed the
void m(k);
line to
__attribute__((noipa)) void m(k) {}
in the testcase, then compiled
pr113208_1.C with -O2 -fkeep-inline-functions -std=c++20 -fno-omit-frame-pointer
so that one can clearly differentiate from where the implementation was
picked and finally added
template <typename _Tp> struct _Vector_base {
int g() const;
_Vector_base(int, int);
};
struct QualityValue;
template <>
_Vector_base<QualityValue>::_Vector_base(int, int) {}
template <>
int _Vector_base<QualityValue>::g() const { return 0; }
int main () {}
If I link this, I see _ZN6vectorI12QualityValueEC2ERKS1_ and
_ZN6vectorI12QualityValueEC1ERKS1_ as separate functions with the
omitted frame pointer bodies, so clearly the pr113208_0.C versions prevailed
in both cases. It is unclear why that isn't the case for LTO.
Jakub
More information about the Gcc-patches
mailing list