[PATCH] keep scope blocks for all inlined functions (PR 98664)

Richard Biener richard.guenther@gmail.com
Mon Jan 18 13:25:07 GMT 2021


On Sun, Jan 17, 2021 at 1:46 AM Martin Sebor <msebor@gmail.com> wrote:
>
> On 1/15/21 12:44 AM, Richard Biener wrote:
> > On Thu, Jan 14, 2021 at 8:13 PM Martin Sebor via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> One aspect of PR 98465 - Bogus warning stringop-overread for std::string
> >> is the inconsistency between -g and -g0 which turns out to be due to
> >> GCC eliminating apparently unused scope blocks from inlined functions
> >> that aren't explicitly declared inline and artificial.  PR 98664 tracks
> >> just this part of PR 98465.
> >>
> >> To resolve just the PR 98664 subset the attached change has
> >> the tree-ssa-live.c pass preserve these blocks for all inlined
> >> functions, not just artificial ones.  Besides avoiding the interaction
> >> between -g and warnings it also seems to improve the inlining context
> >> by including more inlined call sites.  This can be seen in the adjusted
> >> tests.  (Its effect on PR 98465 is that the false positive is issued
> >> consistently, regardless of -g.  Avoiding the false positive is my
> >> next step.)
> >>
> >> Jakub, you raised a concern yesterday in PR 98465 c#13 about the memory
> >> footprint of this change.  Can you please comment on whether it's in
> >> line with what you were suggesting?
> >
> >       {
> >         tree ao = BLOCK_ABSTRACT_ORIGIN (block);
> > -      if (TREE_CODE (ao) == FUNCTION_DECL)
> > -       loc = BLOCK_SOURCE_LOCATION (block);
> > -      else if (TREE_CODE (ao) != BLOCK)
> > -       break;
> > +       if (TREE_CODE (ao) == FUNCTION_DECL)
> > +        loc = BLOCK_SOURCE_LOCATION (block);
> > +       else if (TREE_CODE (ao) != BLOCK)
> > +        break;
> >
> > you are replacing tabs with spaces?
> >
> > @@ -558,16 +558,13 @@ remove_unused_scope_block_p (tree scope, bool
> > in_ctor_dtor_block)
> >      else if (!flag_auto_profile && debug_info_level == DINFO_LEVEL_NONE
> >              && !optinfo_wants_inlining_info_p ())
> >        {
> > -       /* Even for -g0 don't prune outer scopes from artificial
> > -         functions, otherwise diagnostics using tree_nonartificial_location
> > -         will not be emitted properly.  */
> > +       /* Even for -g0 don't prune outer scopes from inlined functions,
> > +         otherwise late diagnostics from such functions will not be
> > +         emitted or suppressed properly.  */
> >          if (inlined_function_outer_scope_p (scope))
> >           {
> >             tree ao = BLOCK_ORIGIN (scope);
> > -          if (ao
> > -              && TREE_CODE (ao) == FUNCTION_DECL
> > -              && DECL_DECLARED_INLINE_P (ao)
> > -              && lookup_attribute ("artificial", DECL_ATTRIBUTES (ao)))
> > +          if (ao && TREE_CODE (ao) == FUNCTION_DECL)
> >               unused = false;
> >           }
> >        }
> >
> > so which inlined_function_outer_scope_p are you _not_ marking now?
> > BLOCK_ORIGIN is never NULL and all inlined scopes should have
> > an abstract origin - I believe always a FUNCTIN_DECL.  Which means
> > you could have simplified it further?
>
> Quite possibly.  I could find no documentation for these macros so
> I tried to keep my changes conservative.  I did put together some
> notes to document what I saw the macros evaluate to in my testing
> (below).  If/when it's close to accurate I'd like to add them to
> tree.h and to the internals manual.
>
> > And yes, the main reason for the code above is memory use for
> > C++ with lots of inlining.  I suggest to try the patch on tramp3d
> > for example (there's about 10 inline instances per emitted
> > assembly op).
>
> This one:
> https://github.com/llvm-mirror/test-suite/tree/master/MultiSource/Benchmarks/tramp3d-v4
> ?

yeah

> With the patch, 69,022 more blocks with distinct numbers are kept
> than without it.  I see some small differences in -fmem-report
> and -ftime-report output:
>
>    Total: 286 -> 288M  210 -> 211M  3993 -> 4019k
>
> I'm not really sure what to look at so I attach the two reports
> for you to judge for yourself.

A build with --enable-gather-detailed-mem-stats would have given
statistics on BLOCK trees I think, otherwise -fmem-report is
not too useful but I guess the above overall stat tell us the
overhead is manageable.

> I also attach an updated patch with the slight simplification you
> suggested.

So I was even suggesting to do

  if (inlined_function_outer_scope_p (scope))
    unused = false;

and maybe gcc_assert (TREE_CODE (orig) == FUNCTION_DECL)
but I think the patch is OK as updated.

> Martin
>
> PS Here are my notes on the macros and the two related functions:
>
> BLOCK: Denotes a lexical scope.  Contains BLOCK_VARS of variables
> declared in it, BLOCK_SUBBLOCKS of scopes nested in it, and
> BLOCK_CHAIN pointing to the next BLOCK.  Its BLOCK_SUPERCONTEXT
> point to the BLOCK of the enclosing scope.  May have
> a BLOCK_ABSTRACT_ORIGIN and a BLOCK_SOURCE_LOCATION.
>
> BLOCK_SUPERCONTEXT: The scope of the enclosing block, or FUNCTION_DECL
> for the "outermost" function scope.  Inlined functions are chained by
> this so that given expression E and its TREE_BLOCK(E) B,
> BLOCK_SUPERCONTEXT(B) is the scope (BLOCK) in which E has been made
> or into which E has been inlined.  In the latter case,
>
> BLOCK_ORIGIN(B) evaluates either to the enclosing BLOCK or to
> the enclosing function DECL.  It's never null.
>
> BLOCK_ABSTRACT_ORIGIN(B) is the FUNCTION_DECL of the function into
> which it has been inlined, or null if B is not inlined.

It's the BLOCK or FUNCTION it was inlined _from_, not were it was inlined to.
It's the "ultimate" source, thus the abstract copy of the block or function decl
(for the outermost scope, aka inlined_function_outer_scope_p).  It corresponds
to what you'd expect for the DWARF abstract origin.

BLOCK_ABSTRACT_ORIGIN can be NULL (in case it isn't an inline instance).

> BLOCK_ABSTRACT_ORIGIN: A BLOCK, or FUNCTION_DECL of the function
> into which a block has been inlined.  In a BLOCK immediately enclosing
> an inlined leaf expression points to the outermost BLOCK into which it
> has been inlined (thus bypassing all intermediate BLOCK_SUPERCONTEXTs).
>
> BLOCK_FRAGMENT_ORIGIN: ???
> BLOCK_FRAGMENT_CHAIN: ???

that's for scope blocks split by hot/cold partitioning and only temporarily
populated.

> bool inlined_function_outer_scope_p(BLOCK)   [tree.h]
>    Returns true if a BLOCK has a source location.
>    True for all but the innermost (no SUBBLOCKs?) and outermost blocks
>    into which an expression has been inlined. (Is this always true?)
>
> tree block_ultimate_origin(BLOCK)   [tree.c]
>    Returns BLOCK_ABSTRACT_ORIGIN(BLOCK), AO, after asserting that
>    (DECL_P(AO) && DECL_ORIGIN(AO) == AO) || BLOCK_ORIGIN(AO) == AO).


More information about the Gcc-patches mailing list