Tested on g++ (GCC) 13.1.1 20230429 on x86-64 Linux The following code doesn't compile without also enabling the target using compile flags (like -mavx2). It did compile in GCC 12.2.0. #pragma GCC target("avx2") #include <vector> int main() { std::vector<int> a; }
reducing ...
Reduced: ``` #pragma GCC target("avx2") struct aa { __attribute__((__always_inline__)) aa() noexcept {} }; aa _M_impl; ```
(In reply to Andrew Pinski from comment #2) > Reduced: > ``` > #pragma GCC target("avx2") > struct aa { > __attribute__((__always_inline__)) aa() noexcept {} > }; > aa _M_impl; > ``` This started to give an error with commit bef8491a658de9e8920acaeff6cb76ef4e946e2c Author: Sriraman Tallam <tmsriram@google.com> Date: Tue Jun 18 22:45:03 2013 +0000 Emit errors when always_inline functions cannot be inlined in -O0 mode. * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. * gcc.c-torture/compile/pr44043.c: Fix test to expect an error. * gcc.c-torture/compile/pr43791.c: Fix test to expect an error. From-SVN: r200179 I think the original problem must have been triggered by some libstdc++ change.
(In reply to Marek Polacek from comment #3) > I think the original problem must have been triggered by some libstdc++ > change. Yes r13-3816-g9d549401ae8ab3 .
I suspect the generated constructor for the clones is not getting the target options on it like it should be ...
That would be a C++ frontend issue then, no? Maybe it's also only some of the aliases? And the actual issue would be latent then.
*** Bug 110675 has been marked as a duplicate of this bug. ***
In local-fnsummary2: __attribute__((always_inline, target ("avx2"))) void aa::aa (struct aa * const this) { <bb 2> [local count: 1073741824]: return; } this seems correct. But: void __static_initialization_and_destruction_0 () { <bb 2> : aa::aa (&_M_impl); return; } Note that __static_initialization_and_destruction_0 is not attributed with avx2.
(In reply to Xi Ruoyao from comment #8) > In local-fnsummary2: > > __attribute__((always_inline, target ("avx2"))) > void aa::aa (struct aa * const this) > { > <bb 2> [local count: 1073741824]: > return; > > } > > this seems correct. But: > > void __static_initialization_and_destruction_0 () > { > <bb 2> : > aa::aa (&_M_impl); > return; > > } > > Note that __static_initialization_and_destruction_0 is not attributed with > avx2. I think that's correct. Maybe we need multiple CTOR/DTOR functions in such case.
GCC 13.2 is being released, retargeting bugs to GCC 13.3.
It seems there is nothing to bisect any more, please re-add the keyword if I am wrong.
I think this is a problem with two meanings of always_inline. One is "it must be inlined or otherwise we will not be able to generate code" other is "disregard inline limits". I guess practical solution here would be to ingore always inline for functions called from static construction wrappers (since they only optimize around array of function pointers). Question is how to communicate this down from FE to ipa-inline...
I think the issue might be that whoever is creating __static_initialization_and_destruction_0 fails to honor the active target pragma. Which means back to my suggestion to have multiple ones when different target options are on the individual CTORs and any of them have always-inline (with always-inline we can't rely on an out-of-line copy to exist). Yes, for libstdc++ purposes which seems to get more and more "always-inline" it would be good to have a different attribute that would only cause to disregard inline limits and not have "always-inline" semantics. [[inline_without_limits]] or [[inline_no_limits]]
> I think the issue might be that whoever is creating > __static_initialization_and_destruction_0 fails to honor the active > target pragma. Which means back to my suggestion to have multiple ones > when different target options are on the individual CTORs and any of them > have always-inline (with always-inline we can't rely on an out-of-line copy > to exist). If I remember right, __static_initialization_and_destruction_0 call all static constructors of priority 0. So it has really no active pragma since it may change across translation unit. We also have similar code in IPA where we collect constructors across whole program. Motivation was to get them inlined and obtain better code locality. Back then Firefox had iostram constructor in every object file and those ctors made whole text segment to be loaded on demand during startup. Probably correct solution would be to group construtor into groups by compatible optimization/target pgramas in the inliner's definition. A quick hack would be to generate wrapper calls which will "eat" the always inline...
Created attachment 57706 [details] one approach I tried just making implicit functions respect #pragma target, but that regresses pr105306 due to seeming internal confusion over whether -Ofast or #pragma optimize apply to the implicit ~C. I haven't tracked that down yet.
Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650439.html
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>: https://gcc.gnu.org/g:eff00046409a7289bfdc1861e68b532895f91c0e commit r15-902-geff00046409a7289bfdc1861e68b532895f91c0e Author: Jason Merrill <jason@redhat.com> Date: Wed Feb 14 17:18:17 2024 -0500 c++: pragma target and static init [PR109753] #pragma target and optimize should also apply to implicitly-generated functions like static initialization functions and defaulted special member functions. The handle_optimize_attribute change is necessary to avoid regressing g++.dg/opt/pr105306.C; maybe_clone_body creates a cgraph_node for the ~B alias before handle_optimize_attribute, and the alias never goes through finalize_function, so we need to adjust semantic_interposition somewhere else. PR c++/109753 gcc/c-family/ChangeLog: * c-attribs.cc (handle_optimize_attribute): Set cgraph_node::semantic_interposition. gcc/cp/ChangeLog: * decl.cc (start_preparsed_function): Call decl_attributes. gcc/testsuite/ChangeLog: * g++.dg/opt/always_inline1.C: New test.
Fixed for 15 so far.
*** Bug 115556 has been marked as a duplicate of this bug. ***
PR115403 needs to be addressed before backporting?
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>: https://gcc.gnu.org/g:e397f8524a7982004eb616217477434ce350e80f commit r15-2332-ge397f8524a7982004eb616217477434ce350e80f Author: Jason Merrill <jason@redhat.com> Date: Thu Jul 25 17:36:09 2024 -0400 c++: #pragma target and deferred instantiation [PR115403] My patch for 109753 applies the current #pragma target/optimize to a function when we compile it, which was a problem for a template instantiation deferred until EOF, where different #pragmas are active. So let's only do this for artificial functions. PR c++/115403 PR c++/109753 gcc/cp/ChangeLog: * decl.cc (start_preparsed_function): Only call decl_attributes for artificial functions. gcc/testsuite/ChangeLog: * g++.dg/ext/pragma-target1.C: New test.