[WIP] Re-introduce 'TREE_USED' in tree streaming

Thomas Schwinge thomas@codesourcery.com
Fri Sep 15 13:01:05 GMT 2023


Hi!

On 2023-09-15T12:11:44+0200, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On Fri, Sep 15, 2023 at 11:20 AM Thomas Schwinge
> <thomas@codesourcery.com> wrote:
>> Now, that was another quirky debug session: in
>> 'gcc/omp-low.cc:create_omp_child_function' we clearly do set
>> 'TREE_USED (t) = 1;' for '.omp_data_i', which ends up as formal parameter
>> for outlined '[...]._omp_fn.[...]' functions, pointing to the "OMP blob".
>> Yet, in offloading compilation, I only ever got '!TREE_USED' for the
>> formal parameter '.omp_data_i'.  This greatly disturbs a nvptx back end
>> expand-time transformation that I have implemented, that's active
>> 'if (!TREE_USED ([formal parameter]))'.
>>
>> After checking along all the host-side OMP handling, eventually (in
>> hindsight: "obvious"...) I found that, "simply", we're not streaming
>> 'TREE_USED'!  With that changed (see attached
>> "Re-introduce 'TREE_USED' in tree streaming"; no visible changes in
>> x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu 'make check'), my
>> issue was quickly addressed -- if not for the question *why* 'TREE_USED'
>> isn't streamed (..., and apparently, that's a problem only for my
>> case..?), and then I found that it's *intentionally been removed*
>> in one-decade-old commit ee03e71d472a3f73cbc1a132a284309f36565972
>> (Subversion r200151) "Re-write LTO type merging again, do tree merging".
>>
>> At this point, I need help: is this OK to re-introduce unconditionally,
>> or in some conditionalized form (but, "ugh..."), or be done differently
>> altogether in the nvptx back end (is 'TREE_USED' considered "stale" at
>> some point in the compilation pipeline?), or do we need some logic in
>> tree stream read-in (?) to achieve the same thing that removing
>> 'TREE_USED' streaming apparently did achieve, or yet something else?
>> Indeed, from a quick look, most use of 'TREE_USED' seems to be "early",
>> but I saw no reason that it couldn't be used "late", either?
>
> TREE_USED is considered stale, it doesn't reflect reality and is used with
> different semantics throughout the pass pipeline

Aha, thanks.  Any suggestion on how to update 'gcc/tree.h:TREE_USED',
for next time, so that it details at which stages the properties
indicated there are meaningful?  (We should also add a similar comment in
the two tree streamer functions.)

> so it doesn't make much sense
> to stream it also because it will needlessly cause divergence between TUs
> during tree merging.

Right, that's what I'd assumed from quickly skimming the 2013 discussion.
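
To spell out the divergence concern for myself, here's a toy model.  It
is plain C++, not GCC code: 'ToyDecl', 'same_streamed', and
'merged_count' are made-up names, and real tree merging is considerably
more involved than this pairwise comparison.  The only point is that two
otherwise identical entities from different TUs stop being mergeable as
soon as a flag that differs between the TUs takes part in the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a streamed declaration; this is NOT GCC's 'tree'.
struct ToyDecl {
  std::string name;
  std::string type;
  bool used;  // plays the role of 'TREE_USED'
};

// Compare two toy declarations on the fields that are "streamed".
// If 'stream_used' is false, the 'used' flag is ignored.
bool same_streamed(const ToyDecl &a, const ToyDecl &b, bool stream_used) {
  if (a.name != b.name || a.type != b.type)
    return false;
  return !stream_used || a.used == b.used;
}

// Number of distinct entries left after merging equal declarations.
std::size_t merged_count(const std::vector<ToyDecl> &decls,
                         bool stream_used) {
  std::vector<ToyDecl> kept;
  for (const ToyDecl &d : decls) {
    bool duplicate = false;
    for (const ToyDecl &k : kept)
      if (same_streamed(d, k, stream_used)) {
        duplicate = true;
        break;
      }
    if (!duplicate)
      kept.push_back(d);
  }
  assert(kept.size() <= decls.size());
  return kept.size();
}
```

With one 'x' marked used and one not, ignoring the flag merges the two
into one entry; comparing the flag keeps both.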

> So we definitely do not want to stream TREE_USED for
> every tree.
>
> Why would you guard anything late on TREE_USED?  If you want to know
> whether a formal parameter is "used" (used in code generation?  used in the
> source?) you have to compute this property.  As you can see using TREE_USED
> is fragile.

The issue is: for function call outgoing/incoming arguments, the nvptx
back end has to use a mechanism different from that of the usual targets.
For those, the incoming arguments are readily available in registers or
on the stack, without requiring emission of any setup instructions.  For
nvptx, we have to generate boilerplate code for every incoming function
argument, to load the argument value into a local register.  (These
registers are then, at least for '-O0', spilled to and restored from the
stack frame, before the first actual use -- if there's any use at all.)

This generates some bulky PTX code, to the point that we run into
timeouts or an OOM-killed 'ptxas' for
'gcc.c-torture/compile/limits-fndefn.c' at '-O0', for example, where
we've got half a million lines of boilerplate PTX code.  That one
certainly is a rogue test case, but I then found that if I conditionalize
emission of that incoming argument setup code on 'TREE_USED' of the
respective element of the chain of 'DECL_ARGUMENTS', then I do get the
desired behavior: a zero-instructions 'limits-fndefn.S'.  So this "late"
use of 'TREE_USED' does work -- just that, as discussed, 'TREE_USED'
isn't available in the offloading setting.  ;-)
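
Schematically, the conditionalization looks like the following.  This is
a self-contained toy in plain C++, not the actual nvptx back end change:
'ToyParm' and 'emit_incoming_setup' are hypothetical names, and the
emitted strings are placeholders, not PTX.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for one element of the 'DECL_ARGUMENTS' chain.
struct ToyParm {
  std::string name;
  bool used;  // plays the role of a "parameter is used" property
};

// Emit one placeholder setup line per incoming parameter.  With
// 'skip_unused' set, parameters that are not used get no setup code.
std::vector<std::string>
emit_incoming_setup(const std::vector<ToyParm> &parms, bool skip_unused) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < parms.size(); ++i) {
    if (skip_unused && !parms[i].used)
      continue;
    out.push_back("setup " + parms[i].name + " (incoming slot "
                  + std::to_string(i) + ")");
  }
  assert(out.size() <= parms.size());
  return out;
}
```

For a function whose parameters are all unused, the conditional variant
emits nothing at all, which is the effect described above.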

I'll look into computing "unused" locally, before/for nvptx expand time.
(To make the '-O0' case work, I figure this has to happen early, instead
of later DCEing the mess that we generated earlier.)  Any quick
suggestions?  My naïve first idea would be to simply scan, in
'TARGET_FUNCTION_INCOMING_ARG', whether the corresponding element of
'DECL_ARGUMENTS' is used in the function, or maybe to do that once for
all 'DECL_ARGUMENTS' in 'INIT_CUMULATIVE_INCOMING_ARGS'.
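
The "once for all 'DECL_ARGUMENTS'" variant could be shaped roughly like
this.  Again a toy in plain C++ under loud assumptions: 'ToyStmt',
'ToyFunction', and 'compute_parm_used' are hypothetical, operands are
identified by name only, and nothing here uses GCC's actual IR or hooks.
It makes a single pass over the function body and records, per
parameter, whether any statement mentions it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy statement: just the names of the operands it mentions.
struct ToyStmt {
  std::vector<std::string> operands;
};

// Toy function: parameter names plus a flat list of statements.
struct ToyFunction {
  std::vector<std::string> parms;
  std::vector<ToyStmt> body;
};

// One walk over the body; result[i] is true if parameter i is mentioned
// by at least one statement.
std::vector<bool> compute_parm_used(const ToyFunction &fn) {
  // Map parameter name to its position, so each operand is one lookup.
  std::unordered_map<std::string, std::size_t> index;
  for (std::size_t i = 0; i < fn.parms.size(); ++i)
    index.emplace(fn.parms[i], i);

  std::vector<bool> used(fn.parms.size(), false);
  for (const ToyStmt &stmt : fn.body)
    for (const std::string &op : stmt.operands) {
      auto it = index.find(op);
      if (it != index.end())
        used[it->second] = true;
    }
  assert(used.size() == fn.parms.size());
  return used;
}
```

Compared to scanning the body separately for each parameter, this costs
one pass over the statements regardless of the number of parameters,
which matters for cases like 'limits-fndefn.c' with a very long
parameter list.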


Grüße
 Thomas


>> Original discussion "not streaming and comparing TREE_USED":
>> <https://inbox.sourceware.org/alpine.LNX.2.00.1306131614000.26078@zhemvz.fhfr.qr>
>> "[RFC] Re-write LTO type merging again, do tree merging", continued
>> <https://inbox.sourceware.org/alpine.LNX.2.00.1306141240340.6998@zhemvz.fhfr.qr>
>> "Re-write LTO type merging again, do tree merging".
>>
>>
>> In 2013, offloading compilation was just around the corner --
>> <https://inbox.sourceware.org/1375103926.7129.7694.camel@triegel.csb>
>> "Summary of the Accelerator BOF at Cauldron" -- and you easily could've
>> foreseen this issue, no?  ;-P
>>
>>
>> Grüße
>>  Thomas
>>
>>
>> -----------------
>> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

