Bug 102554 - [11 Regression] Inlining missed at -O3 with non-default --param=early-inlining-insns and pragma optimize
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: ipa
Version: 10.2.0
Importance: P2 normal
Target Milestone: 11.5
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2021-10-01 15:52 UTC by John S
Modified: 2023-07-07 10:41 UTC

See Also:
Host:
Target:
Build:
Known to work: 9.3.0
Known to fail: 10.1.0, 10.2.0, 10.3.0, 11.1.0, 11.2.0
Last reconfirmed: 2021-10-04 00:00:00


Description John S 2021-10-01 15:52:17 UTC
GNU C++14 (GCC) version 10.2.0 (x86_64-pc-linux-gnu)
        compiled by GNU C version 10.2.0, GMP version 6.0.0, MPFR version 3.1.1, MPC version 1.0.1, isl version isl-0.16.1-GMP

Target: x86_64-pc-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.2.0 (GCC)

===========
=TEST CODE=
===========
cat test.cpp
#pragma GCC push_options
#pragma GCC optimize ("no-lifetime-dse")
class TestClass
{
public:
  static inline int should_inline() {
    return 10;
  }
};
#pragma GCC pop_options

int main() {
  return TestClass::should_inline() + 1;
}

===========
=cmd      =
===========
gcc-10 test.cpp -S --param=early-inlining-insns=30 -O3 -fno-lifetime-dse -Wall -Wextra

===========
=BAD ASM  =
===========
cat test.s
        .file   "test.cpp"
        .text
        .section        .text._ZN9TestClass13should_inlineEv,"axG",@progbits,_ZN9TestClass13should_inlineEv,comdat
        .p2align 4
        .weak   _ZN9TestClass13should_inlineEv
        .type   _ZN9TestClass13should_inlineEv, @function
_ZN9TestClass13should_inlineEv:
.LFB0:
        .cfi_startproc
        movl    $10, %eax
        ret
        .cfi_endproc
.LFE0:
        .size   _ZN9TestClass13should_inlineEv, .-_ZN9TestClass13should_inlineEv
        .section        .text.startup,"ax",@progbits
        .p2align 4
        .globl  main
        .type   main, @function
main:
.LFB1:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        call    _ZN9TestClass13should_inlineEv
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        addl    $1, %eax
        ret
        .cfi_endproc
.LFE1:
        .size   main, .-main
        .ident  "GCC: (GNU) 10.2.0"
        .section        .note.GNU-stack,"",@progbits

===========
=info     =
===========
cat test.cpp.079i.inline
...
Deciding on inlining of small functions.  Starting with size 9.
Enqueueing calls in int main()/1.
test.cpp:13:34: missed:   not inlinable: int main()/1 -> static int TestClass::should_inline()/0, optimization level attribute mismatch

  param_early_inlining_insns (0x1e/0xe)
Enqueueing calls in static int TestClass::should_inline()/0.
node context cache: 0 hits, 0 misses, 1 initializations
...

===========
=GOOD ASM =
===========
gcc-10 test.cpp -S --param=early-inlining-insns=14 -O3 -fno-lifetime-dse -Wall -Wextra
        .file   "test.cpp"
        .text
        .section        .text.startup,"ax",@progbits
        .p2align 4
        .globl  main
        .type   main, @function
main:
.LFB1:
        .cfi_startproc
        movl    $11, %eax
        ret
        .cfi_endproc
.LFE1:
        .size   main, .-main
        .ident  "GCC: (GNU) 10.2.0"
        .section        .note.GNU-stack,"",@progbits


===========
=notes    =
===========

Starting with GCC 10 (GCC 9 works correctly), passing --param=early-inlining-insns=30 together with -O3 on the command line, combined with any "#pragma GCC optimize" in the source code, even one that does not change the effective optimization attributes, causes an "optimization level attribute mismatch" in the inliner.

In the example, -fno-lifetime-dse appears both on the command line and in #pragma GCC optimize ("no-lifetime-dse"), so the pragma has no effect on the effective optimization attributes.

The issue is not specific to #pragma GCC optimize ("no-lifetime-dse"): any #pragma GCC optimize line has this effect, even an "unrecognized" one, e.g.
#pragma GCC optimize ("fake_attribute")

Any value OTHER THAN --param=early-inlining-insns=14 on the command line, when used with -O3 and a pragma optimize, triggers this. For example:
======================
=optimize correctly  =
======================
gcc-10 test.cpp -S --param=early-inlining-insns=14 -O3 -fno-lifetime-dse -Wall -Wextra

gcc-10 test.cpp -S --param=early-inlining-insns=30 -O2 -fno-lifetime-dse -Wall -Wextra

gcc-9 test.cpp -S --param=early-inlining-insns=30 -O3 -fno-lifetime-dse -Wall -Wextra

======================
=missed optimize     =
======================
gcc-10 test.cpp -S --param=early-inlining-insns=12 -O3 -fno-lifetime-dse -Wall -Wextra

gcc-10 test.cpp -S --param=early-inlining-insns=17 -O3 -fno-lifetime-dse -Wall -Wextra
etc.

gcc-11 test.cpp -S --param=early-inlining-insns=30 -O3 -fno-lifetime-dse -Wall -Wextra
gcc-12 test.cpp -S --param=early-inlining-insns=30 -O3 -fno-lifetime-dse -Wall -Wextra
gcc-trunk test.cpp -S --param=early-inlining-insns=30 -O3 -fno-lifetime-dse -Wall -Wextra


Code path where CIF_OPTIMIZATION_MISMATCH is being set.

gcc/ipa-inline.c:
    568 can_early_inline_edge_p (struct cgraph_edge *e)
...
    593   if (!can_inline_edge_p (e, true, true)
    594       || !can_inline_edge_by_limits_p (e, true, false, true))

-------->

    428 can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
    429                              bool disregard_limits = false, bool early = false)
...
    524       /* When user added an attribute to the callee honor it.  */
    525       else if (lookup_attribute ("optimize", DECL_ATTRIBUTES (callee->decl))
    526                && opts_for_fn (caller->decl) != opts_for_fn (callee->decl))
    527         {
    528           e->inline_failed = CIF_OPTIMIZATION_MISMATCH;
    529           inlinable = false;
    530         }


I suspect the change that moved the --param= options into the cl_optimization struct is related to this missed optimization.
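
To make the suspected mechanism concrete, here is a toy model of the check quoted above. It is a sketch under stated assumptions, not GCC code: ToyOpts and every function name are hypothetical, the real check compares the option sets returned by opts_for_fn rather than comparing fields, and the "buggy" behaviour (the pragma rebuilding the callee's option set with the param at 14, the callee-side value shown in the dump) is an assumption drawn from the dump output. It models only the -O3 case and does not attempt to explain why -O2 is unaffected.

```cpp
// Toy stand-in for a per-function option set (hypothetical; the real
// structure in GCC is cl_optimization and has many more fields).
struct ToyOpts {
    int optimize_level;        // -O level
    bool lifetime_dse;         // false models -fno-lifetime-dse
    int early_inlining_insns;  // --param=early-inlining-insns
};

// Callee-side value seen in the dump (0xe); assumed to be the -O3 default.
constexpr int kAssumedDefaultAtO3 = 14;

constexpr bool same_opts(ToyOpts a, ToyOpts b) {
    return a.optimize_level == b.optimize_level
        && a.lifetime_dse == b.lifetime_dse
        && a.early_inlining_insns == b.early_inlining_insns;
}

// Command line from the report: -O3 -fno-lifetime-dse --param=...=value.
constexpr ToyOpts cmdline_o3(int param_value) {
    return ToyOpts{3, false, param_value};
}

// Assumed buggy behaviour: flags carry over, the param falls back to 14.
constexpr ToyOpts opts_after_pragma_buggy(ToyOpts cmdline) {
    return ToyOpts{cmdline.optimize_level, cmdline.lifetime_dse,
                   kAssumedDefaultAtO3};
}

// Expected behaviour: a pragma that changes nothing leaves the set as-is.
constexpr ToyOpts opts_after_pragma_expected(ToyOpts cmdline) {
    return cmdline;
}

// Same shape as the quoted condition: callee carries an "optimize"
// attribute and its option set differs from the caller's.
constexpr bool inline_blocked(bool callee_has_optimize_attr,
                              ToyOpts caller, ToyOpts callee) {
    return callee_has_optimize_attr && !same_opts(caller, callee);
}

constexpr bool blocked_with_param(int param_value, bool buggy) {
    return inline_blocked(
        true, cmdline_o3(param_value),
        buggy ? opts_after_pragma_buggy(cmdline_o3(param_value))
              : opts_after_pragma_expected(cmdline_o3(param_value)));
}

// Compile-time checks mirroring the observations in the report.
static_assert(blocked_with_param(30, true), "30 vs 14: mismatch");
static_assert(!blocked_with_param(14, true), "14 vs 14: no mismatch");
static_assert(!blocked_with_param(30, false), "param preserved: no mismatch");
```

Under these assumptions the model reproduces the pattern reported for -O3: every command-line value except 14 differs from the callee's value, so the check blocks inlining, while a pragma that preserved the param would leave the two option sets equal.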
Comment 1 Richard Biener 2021-10-04 06:53:05 UTC
I suspect that the optimize() attribute resets the param value to its default.

Martin - can you investigate / bisect?
Comment 2 Martin Liška 2021-10-04 07:09:08 UTC
(In reply to Richard Biener from comment #1)
> I suspect that the optimize() attribute resets the param value to its
> default.

Yes, it's fixed on master with g:r12-4038-g6de9f0c13b27c343.

> 
> Martin - can you investigate / bisect?

Sure, it started with r10-4944-g1e83bd7003e03160.

I'm inclined to close this as fixed; what do you think, Richi?
Comment 3 John S 2021-10-04 14:05:26 UTC
(In reply to Martin Liška from comment #2)
> (In reply to Richard Biener from comment #1)
> > I suspect that the optimize() attribute resets the param value to its
> > default.
> 
> Yes, it's fixed on master with g:r12-4038-g6de9f0c13b27c343.
> 
> > 
> > Martin - can you investigate / bisect?
> 
> Sure, it started with r10-4944-g1e83bd7003e03160.
> 
> I'm inclined to close this as fixed; what do you think, Richi?

I can confirm I am seeing g:r12-4038-g6de9f0c13b27c343 resolve the issue.

Is it possible to get this applied to the upcoming 10.4 and 11.3 releases? It's making upgrades to 10.x / 11.x challenging in certain latency-sensitive production environments.
Comment 4 Martin Liška 2021-10-04 15:09:44 UTC
> I can confirm I am seeing g:r12-4038-g6de9f0c13b27c343 resolve the issue.
> 
> Is it possible to get this applied into the upcoming 10.4, 11.3 releases? 

Sorry, but it won't be possible. It's a pretty significant change that could potentially break some software builds. That's why it will target the 12.1 release only.

> It's making upgrading to 10.x / 11.x versions challenging in certain latency
> sensitive production environments.
Comment 5 Jakub Jelinek 2022-06-28 10:46:31 UTC
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
Comment 6 Richard Biener 2023-07-07 10:41:05 UTC
GCC 10 branch is being closed.