Bug 109753 - [13/14 Regression] pragma GCC target causes std::vector not to compile (always_inline on constructor)
Summary: [13/14 Regression] pragma GCC target causes std::vector not to compile (alway...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 13.1.1
: P2 normal
Target Milestone: 13.4
Assignee: Jason Merrill
URL:
Keywords: rejects-valid
: 110675 115556 (view as bug list)
Depends on: 115403
Blocks: 115556
  Show dependency treegraph
 
Reported: 2023-05-05 21:30 UTC by Magnus Hokland Hegdahl
Modified: 2024-08-28 14:38 UTC (History)
12 users (show)

See Also:
Host:
Target: x86_64-linux-gnu
Build:
Known to work: 12.2.0
Known to fail: 13.1.0
Last reconfirmed: 2023-05-05 00:00:00


Attachments
one approach (839 bytes, patch)
2024-03-14 20:34 UTC, Jason Merrill
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Magnus Hokland Hegdahl 2023-05-05 21:30:43 UTC
Tested on g++ (GCC) 13.1.1 20230429 on x86-64 Linux

The following code doesn't compile without also enabling the target using compile flags (like -mavx2). It did compile in GCC 12.2.0.

#pragma GCC target("avx2")

#include <vector>

int main() {
  std::vector<int> a;
}
Comment 1 Andrew Pinski 2023-05-05 21:43:43 UTC
reducing ...
Comment 2 Andrew Pinski 2023-05-05 22:38:31 UTC
Reduced:
```
#pragma GCC target("avx2")
struct aa {
    __attribute__((__always_inline__)) aa() noexcept {}
};
aa _M_impl;
```
Comment 3 Marek Polacek 2023-05-05 22:44:50 UTC
(In reply to Andrew Pinski from comment #2)
> Reduced:
> ```
> #pragma GCC target("avx2")
> struct aa {
>     __attribute__((__always_inline__)) aa() noexcept {}
> };
> aa _M_impl;
> ```

This started to give an error with

commit bef8491a658de9e8920acaeff6cb76ef4e946e2c
Author: Sriraman Tallam <tmsriram@google.com>
Date:   Tue Jun 18 22:45:03 2013 +0000

    Emit errors when always_inline functions cannot be inlined in -O0 mode.
    
            * tree-inline.c (expand_call_inline): Allow the error to be flagged
            in early inline pass.
            * ipa-inline.c (inline_always_inline_functions): Pretend always_inline
            functions are inlined during failures to flag an error.
            * gcc.target/i386/inline_error.c: New test.
            * gcc.c-torture/compile/pr44043.c: Fix test to expect an error.
            * gcc.c-torture/compile/pr43791.c: Fix test to expect an error.
    
    From-SVN: r200179

I think the original problem must have been triggered by some libstdc++ change.
Comment 4 Andrew Pinski 2023-05-05 22:48:01 UTC
(In reply to Marek Polacek from comment #3)
> I think the original problem must have been triggered by some libstdc++
> change.

Yes r13-3816-g9d549401ae8ab3 .
Comment 5 Andrew Pinski 2023-05-06 00:12:05 UTC
I suspect the generated constructor for the clones is not getting the target options on it like it should be ...
Comment 6 Richard Biener 2023-05-08 07:04:19 UTC
That would be a C++ frontend issue then, no?  Maybe it's also only some of the aliases?  And the actual issue would be latent then.
Comment 7 Xi Ruoyao 2023-07-15 22:41:17 UTC
*** Bug 110675 has been marked as a duplicate of this bug. ***
Comment 8 Xi Ruoyao 2023-07-15 22:43:22 UTC
In local-fnsummary2:

__attribute__((always_inline, target ("avx2")))
void aa::aa (struct aa * const this)
{
  <bb 2> [local count: 1073741824]:
  return;

}

this seems correct.  But:

void __static_initialization_and_destruction_0 ()
{
  <bb 2> :
  aa::aa (&_M_impl);
  return;

}

Note that __static_initialization_and_destruction_0 is not attributed with avx2.
Comment 9 Richard Biener 2023-07-17 08:33:03 UTC
(In reply to Xi Ruoyao from comment #8)
> In local-fnsummary2:
> 
> __attribute__((always_inline, target ("avx2")))
> void aa::aa (struct aa * const this)
> {
>   <bb 2> [local count: 1073741824]:
>   return;
> 
> }
> 
> this seems correct.  But:
> 
> void __static_initialization_and_destruction_0 ()
> {
>   <bb 2> :
>   aa::aa (&_M_impl);
>   return;
> 
> }
> 
> Note that __static_initialization_and_destruction_0 is not attributed with
> avx2.

I think that's correct.  Maybe we need multiple CTOR/DTOR functions
in such case.
Comment 10 Richard Biener 2023-07-27 09:26:08 UTC
GCC 13.2 is being released, retargeting bugs to GCC 13.3.
Comment 11 Martin Jambor 2024-01-10 17:35:22 UTC
It seems there is nothing to bisect any more, please re-add the keyword if I am wrong.
Comment 12 Jan Hubicka 2024-01-10 19:02:48 UTC
I think this is a problem with two meanings of always_inline. One is "it must be inlined or otherwise we will not be able to generate code" other is "disregard inline limits".

I guess practical solution here would be to ingore always inline for functions called from static construction wrappers (since they only optimize around array of function pointers). Question is how to communicate this down from FE to ipa-inline...
Comment 13 Richard Biener 2024-01-11 08:09:53 UTC
I think the issue might be that whoever is creating __static_initialization_and_destruction_0 fails to honor the active
target pragma.  Which means back to my suggestion to have multiple ones
when different target options are on the individual CTORs and any of them
have always-inline (with always-inline we can't rely on an out-of-line copy
to exist).

Yes, for libstdc++ purposes which seems to get more and more "always-inline"
it would be good to have a different attribute that would only cause to
disregard inline limits and not have "always-inline" semantics.

[[inline_without_limits]] or [[inline_no_limits]]
Comment 14 Jan Hubicka 2024-01-11 12:59:24 UTC
> I think the issue might be that whoever is creating
> __static_initialization_and_destruction_0 fails to honor the active
> target pragma.  Which means back to my suggestion to have multiple ones
> when different target options are on the individual CTORs and any of them
> have always-inline (with always-inline we can't rely on an out-of-line copy
> to exist).

If I remember right, __static_initialization_and_destruction_0 call all
static constructors of priority 0. So it has really no active pragma
since it may change across translation unit.

We also have similar code in IPA where we collect constructors
across whole program.  Motivation was to get them inlined and obtain
better code locality. Back then Firefox had iostram constructor in every
object file and those ctors made whole text segment to be loaded on
demand during startup.

Probably correct solution would be to group construtor into groups by
compatible optimization/target pgramas in the inliner's definition.
A quick hack would be to generate wrapper calls which will "eat" the
always inline...
Comment 15 Jason Merrill 2024-03-14 20:34:16 UTC
Created attachment 57706 [details]
one approach

I tried just making implicit functions respect #pragma target, but that regresses pr105306 due to seeming internal confusion over whether -Ofast or #pragma optimize apply to the implicit ~C.  I haven't tracked that down yet.
Comment 16 Jason Merrill 2024-05-08 13:46:16 UTC
Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650439.html
Comment 17 Jakub Jelinek 2024-05-21 09:14:58 UTC
GCC 13.3 is being released, retargeting bugs to GCC 13.4.
Comment 18 GCC Commits 2024-05-29 13:51:49 UTC
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>:

https://gcc.gnu.org/g:eff00046409a7289bfdc1861e68b532895f91c0e

commit r15-902-geff00046409a7289bfdc1861e68b532895f91c0e
Author: Jason Merrill <jason@redhat.com>
Date:   Wed Feb 14 17:18:17 2024 -0500

    c++: pragma target and static init [PR109753]
    
     #pragma target and optimize should also apply to implicitly-generated
     functions like static initialization functions and defaulted special member
     functions.
    
    The handle_optimize_attribute change is necessary to avoid regressing
    g++.dg/opt/pr105306.C; maybe_clone_body creates a cgraph_node for the ~B
    alias before handle_optimize_attribute, and the alias never goes through
    finalize_function, so we need to adjust semantic_interposition somewhere
    else.
    
            PR c++/109753
    
    gcc/c-family/ChangeLog:
    
            * c-attribs.cc (handle_optimize_attribute): Set
            cgraph_node::semantic_interposition.
    
    gcc/cp/ChangeLog:
    
            * decl.cc (start_preparsed_function): Call decl_attributes.
    
    gcc/testsuite/ChangeLog:
    
            * g++.dg/opt/always_inline1.C: New test.
Comment 19 Jason Merrill 2024-05-29 13:55:16 UTC
Fixed for 15 so far.
Comment 20 Andrew Pinski 2024-06-20 15:07:53 UTC
*** Bug 115556 has been marked as a duplicate of this bug. ***
Comment 21 Richard Biener 2024-07-18 10:29:52 UTC
PR115403 needs to be addressed before backporting?
Comment 22 GCC Commits 2024-07-25 23:55:27 UTC
The trunk branch has been updated by Jason Merrill <jason@gcc.gnu.org>:

https://gcc.gnu.org/g:e397f8524a7982004eb616217477434ce350e80f

commit r15-2332-ge397f8524a7982004eb616217477434ce350e80f
Author: Jason Merrill <jason@redhat.com>
Date:   Thu Jul 25 17:36:09 2024 -0400

    c++: #pragma target and deferred instantiation [PR115403]
    
    My patch for 109753 applies the current #pragma target/optimize to a
    function when we compile it, which was a problem for a template
    instantiation deferred until EOF, where different #pragmas are active.  So
    let's only do this for artificial functions.
    
            PR c++/115403
            PR c++/109753
    
    gcc/cp/ChangeLog:
    
            * decl.cc (start_preparsed_function): Only call decl_attributes for
            artificial functions.
    
    gcc/testsuite/ChangeLog:
    
            * g++.dg/ext/pragma-target1.C: New test.