[PATCH] c++, dyninit: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]
Jason Merrill
jason@redhat.com
Fri Nov 5 19:46:35 GMT 2021
On 11/5/21 13:06, Martin Sebor wrote:
> On 11/4/21 3:42 AM, Jakub Jelinek via Gcc-patches wrote:
>> Hi!
>>
>> When users don't use constexpr everywhere in the initialization of
>> namespace scope non-comdat vars and the initializers aren't constant when
>> the FE is looking at them, the FE performs dynamic initialization of those
>> variables.
>> But after inlining and some constant propagation, we often end up with
>> just storing constants into those variables in the _GLOBAL__sub_I_*
>> constructor.
>> C++ gives us permission to change some of that dynamic initialization
>> back into static initialization -
>> https://eel.is/c++draft/basic.start.static#3
>> For classes that need (dynamic) construction, I believe access to some var
>> from other dynamic construction before that var is constructed is UB, but
>> as the example in the above mentioned spot of the C++ standard shows:
>> inline double fd() { return 1.0; }
>> extern double d1;
>> double d2 = d1; // unspecified:
>> // either statically initialized to 0.0 or
>> // dynamically initialized to 0.0 if d1 is
>> // dynamically initialized, or 1.0 otherwise
>> double d1 = fd(); // either initialized statically or dynamically to 1.0
>> some vars can be used before they are dynamically initialized and the
>> implementation can still optimize those into static initialization.
>>
>> The following patch attempts to optimize some such cases back into
>> DECL_INITIAL initializers and where possible (originally const vars
>> without mutable members) put those vars back to .rodata etc.
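>> For example, something like the following would be a candidate (just an
>> illustrative sketch, not from the patch or its testcases; the names are
>> made up):
>>   // Not constexpr, but trivially foldable after inlining.
>>   inline int init_val () { return 6 * 7; }
>>   int a = init_val ();        // dynamic init that folds to storing 42
>>   const int b = init_val ();  // ditto, and can then go back to .rodata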
>>
>> Because we put all dynamic initialization from a single TU into one single
>> function (well, originally one function per priority, but those typically
>> get inlined back into one function), we can either take a simpler approach
>> (from the PR it seems that is what LLVM uses), where we optimize either
>> all dynamic initializers in the TU into constants or none at all,
>> or, by adding some markup - in the form of a pair of internal functions in
>> this patch - around each dynamic initialization that can be optimized,
>> we can optimize each dynamic initialization separately.
>>
>> The patch adds a new pass that is invoked (through gate check) only on
>> DECL_ARTIFICIAL DECL_STATIC_CONSTRUCTOR functions, and looks there for
>> sequences like:
>> .DYNAMIC_INIT_START (&b, 0);
>> b = 1;
>> .DYNAMIC_INIT_END (&b);
>> or
>> .DYNAMIC_INIT_START (&e, 1);
>> # DEBUG this => &e.f
>> MEM[(struct S *)&e + 4B] ={v} {CLOBBER};
>> MEM[(struct S *)&e + 4B].a = 1;
>> MEM[(struct S *)&e + 4B].b = 2;
>> MEM[(struct S *)&e + 4B].c = 3;
>> # DEBUG BEGIN_STMT
>> MEM[(struct S *)&e + 4B].d = 6;
>> # DEBUG this => NULL
>> .DYNAMIC_INIT_END (&e);
>> (where between the pair of markers everything is either debug stmts or
>> stores of constants into the variables or their parts).
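>> The second sequence would correspond to source roughly along these lines
>> (only a guess reconstructed from the dump, purely for illustration):
>>   struct S { int a, b, c, d;
>>              S () : a (1), b (2), c (3) { d = 6; } };
>>   struct E { int x; S f; };   // f lives at offset 4
>>   E e;                        // e.f is dynamically constructed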
>> The pass needs to run late enough that all the needed constant propagation
>> and perhaps loop unrolling has been done after IPA, but on the other hand
>> early enough that, if we can't optimize an initialization, we can remove
>> those .DYNAMIC_INIT* internal calls, which could otherwise prevent some
>> further optimizations (they have an fnspec such that they pretend to read
>> the corresponding variable).
>
> In my work-in-progress patch to diagnose stores into constant
> objects (and subobjects) I deal with the same problem.  I had
> considered a pair of markers like those above (David Malcolm
> suggested a similar approach as well), but decided to go
> a different route, not trusting that they could be kept together
> or that they wouldn't be viewed as overly intrusive.  With that
> approach, I have been able to distinguish dynamic initialization
> from overwriting stores even at the end of compilation, but
> I'd be fine changing that and running the detection earlier.
>
> So if the markers are added for the purpose of optimizing
> the dynamic initialization of variables at file scope, could they
> also be added for locals?  That way I wouldn't need to add
> a separate solution.
>
>>
>> Currently the optimization is only able to handle cases where the whole
>> variable is stored in a single store (typically scalar variables), or
>> where it can use the native_{encode,interpret}* infrastructure to create
>> or update the CONSTRUCTOR.  This means that, except for the first
>> category, we can't right now handle unions or anything that needs
>> relocations (vars containing pointers or references to other vars).
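>> E.g. (an illustrative sketch of what currently stays dynamic; the names
>> are made up):
>>   extern int g;
>>   struct P { int *p; int n; };
>>   inline P make_p () { return { &g, 1 }; }
>>   P x = make_p ();   // the store of &g needs a relocation, which the
>>                      // native_* round trip can't represent
>>   union U { int i; float f; };
>>   inline U make_u () { U u; u.i = 1; return u; }
>>   U u = make_u ();   // union members aren't handled either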
>> I think it would be nice to incrementally add, before the native_*
>> fallback, some attempt to just create or update a CONSTRUCTOR if possible.
>> If we only see var.a.b.c.d[10].e = const; style of stores, this shouldn't
>> be that hard, as the whole access path is recorded there and we'd just
>> need to decide what to do with unions if two or more union members are
>> accessed.  And we'd do a deep copy of the CONSTRUCTOR and try to
>> efficiently update the copy afterwards (the CONSTRUCTORs should be sorted
>> on increasing offsets of the members/elements, so doing an ordered vec
>> insertion might not be the best idea).  But MEM_REFs complicate this:
>> parts or all of the access path are lost.  For non-unions in most cases
>> we could try to guess which field it is (do we have some existing
>> function to do that?
>
> The sprintf pass (only, for now) uses field_at_offset() for this.
>
>> I vaguely remember
>> we used to do that in some cases in some folding in the past but stopped
>> doing so) but with unions it will be harder or impossible.
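>> (To illustrate the union problem with a made-up sketch: given
>>    union U { int i; short s[2]; };
>>    static U u;
>>  the constructor may only contain something like
>>    MEM[(short *)&u] = 1;
>>    MEM[(short *)&u + 2B] = 2;
>>  which could be stores to u.s[0]/u.s[1] or to parts of u.i - the access
>>  path that would tell us which union member was meant is gone.)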
>>
>> As the middle-end can't easily differentiate between const variables
>> without and with mutable members,
>
> In the work-in-progress patch I mentioned above I've added two
> hooks: one to return whether a type has a mutable member:
>
> extern bool lhd_has_mutable (const_tree);
>
> and another to return whether a decl is mutable:
>
> extern bool lhd_is_mutable (const_tree);
>
> They let the middle end tell (with some effort) whether a subobject
> is mutable (they're called from field_at_offset()).  The effort
> is in recreating the full path to the member for every MEM_REF.
> Since mutable is rare, this is a lot of work for usually no gain.
> Rather than doing this for every MEM_REF, I've been thinking of
> caching this info for every struct type in some persistent
> structure.
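> Roughly along these lines (purely a hypothetical sketch of the caching
> idea, not code from either patch, and the hook call is a placeholder):
>
>   /* Cache of "does this type contain a mutable member", keyed on
>      the type's main variant.  */
>   static hash_map<tree, bool> *has_mutable_cache;
>
>   static bool
>   cached_type_has_mutable (tree type)
>   {
>     if (!has_mutable_cache)
>       has_mutable_cache = new hash_map<tree, bool>;
>     type = TYPE_MAIN_VARIANT (type);
>     bool existed;
>     bool &entry = has_mutable_cache->get_or_insert (type, &existed);
>     if (!existed)
>       /* However the new hook ends up being named and placed.  */
>       entry = lhd_has_mutable (type);
>     return entry;
>   }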
In the front end, this is CLASSTYPE_HAS_MUTABLE, or cp_has_mutable_p.
Jason