[PATCH] c++, dyninit: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

Jason Merrill jason@redhat.com
Fri Nov 5 19:46:35 GMT 2021

On 11/5/21 13:06, Martin Sebor wrote:
> On 11/4/21 3:42 AM, Jakub Jelinek via Gcc-patches wrote:
>> Hi!
>> When users don't use constexpr everywhere in the initialization of namespace
>> scope non-comdat vars and the initializers aren't constant when the FE is
>> looking at them, the FE performs dynamic initialization of those variables.
>> But after inlining and some constant propagation, we often end up with
>> just storing constants into those variables in the _GLOBAL__sub_I_*
>> constructor.
>> C++ gives us permission to change some of that dynamic initialization
>> back into static initialization - 
>> https://eel.is/c++draft/basic.start.static#3
>> For classes that need (dynamic) construction, I believe access to some var
>> from other dynamic construction before that var is constructed is UB, but
>> as the example in the above-mentioned spot of C++ shows:
>> inline double fd() { return 1.0; }
>> extern double d1;
>> double d2 = d1;     // unspecified:
>>                      // either statically initialized to 0.0 or
>>                      // dynamically initialized to 0.0 if d1 is
>>                      // dynamically initialized, or 1.0 otherwise
>> double d1 = fd();   // either initialized statically or dynamically to 1.0
>> some vars can be used before they are dynamically initialized and the
>> implementation can still optimize those into static initialization.
>> The following patch attempts to optimize some such cases back into
>> DECL_INITIAL initializers and, where possible (originally const vars without
>> mutable members), put those vars back into .rodata etc.
>> Because we put all dynamic initialization from a single TU into one single
>> function (well, originally one function per priority, but those are typically
>> inlined back into one function), we can either have a simpler approach
>> (from the PR it seems that is what LLVM uses), where either we manage to
>> optimize all dynamic initializers into constants in the TU or we do nothing,
>> or, by adding some markup - in the form of a pair of internal functions in
>> this patch - around each dynamic initialization that can be optimized,
>> we can optimize each dynamic initialization separately.
>> The patch adds a new pass that is invoked (through gate check) only on
>> DECL_ARTIFICIAL DECL_STATIC_CONSTRUCTOR functions, and looks there for
>> sequences like:
>>    .DYNAMIC_INIT_START (&b, 0);
>>    b = 1;
>>    .DYNAMIC_INIT_END (&b);
>> or
>>    .DYNAMIC_INIT_START (&e, 1);
>>    # DEBUG this => &e.f
>>    MEM[(struct S *)&e + 4B] ={v} {CLOBBER};
>>    MEM[(struct S *)&e + 4B].a = 1;
>>    MEM[(struct S *)&e + 4B].b = 2;
>>    MEM[(struct S *)&e + 4B].c = 3;
>>    MEM[(struct S *)&e + 4B].d = 6;
>>    # DEBUG this => NULL
>>    .DYNAMIC_INIT_END (&e);
>> (where between the pair of markers everything is either debug stmts or
>> stores of constants into the variables or their parts).
>> The pass needs to be done late enough so that after IPA all the needed
>> constant propagation and perhaps loop unrolling is done; on the other
>> hand, it should be early enough so that, if we can't optimize it, we can
>> remove those .DYNAMIC_INIT* internal calls that could prevent some
>> further optimizations (they have a fnspec such that they pretend to read
>> the corresponding variable).
> In my work-in-progress patch to diagnose stores into constant
> objects (and subobjects) I deal with the same problem.  I had
> considered a pair of markers like those above (David Malcolm
> suggested a similar approach as well), but decided to go
> a different route, not trusting they could be kept together,
> or that they wouldn't be viewed as overly intrusive.  With
> it, I have been able to distinguish dynamic initialization
> from overwriting stores even at the end of compilation, but
> I'd be fine changing that and running the detection earlier.
> So if the markers are added for the purpose of optimizing
> the dynamic initialization at file scope, could they be added
> for those of locals as well?  That way I wouldn't need to add
> a separate solution.
>> Currently the optimization is only able to optimize cases where the whole
>> variable is stored in a single store (typically scalar variables), or
>> uses the native_{encode,interpret}* infrastructure to create or update
>> the CONSTRUCTOR.  This means that, except for the first category, we can't
>> right now handle unions or anything that needs relocations (vars containing
>> pointers to other vars or references).
>> I think it would be nice to incrementally add, before the native_* fallback,
>> some attempt to just create or update a CONSTRUCTOR if possible.  If we only
>> see var.a.b.c.d[10].e = const; style of stores, this shouldn't be that hard,
>> as the whole access path is recorded there and we'd just need to decide what
>> to do with unions if two or more union members are accessed.  And do a deep
>> copy of the CONSTRUCTOR and try to efficiently update the copy afterwards
>> (the CONSTRUCTORs should be sorted on increasing offsets of the
>> members/elements, so doing an ordered vec insertion might not be the best
>> idea).  But MEM_REFs complicate this, as parts or all of the access path
>> are lost.  For non-unions, in most cases we could try to guess which field
>> it is (do we have some existing function to do that?
> The sprintf pass (only, for now) uses field_at_offset() for this.
>> I vaguely remember we've been doing that in some cases in the past in
>> some folding but stopped doing so) but with unions it will be harder
>> or impossible.
>> As the middle-end can't easily differentiate between const variables
>> without and with mutable members,
> In the work-in-progress patch I mentioned above I've added two
> hooks: one to return whether a type has a mutable member:
>    extern bool lhd_has_mutable (const_tree);
> and another whether a decl is mutable:
>    extern bool lhd_is_mutable (const_tree);
> They let the middle end tell (with some effort) if a subobject
> is mutable (they're called from field_at_offset()).  The effort
> is in recreating the full path to the member for every MEM_REF.
> Since mutable is rare, this is a lot of work for usually no gain.
> Rather than doing this for every MEM_REF I've been thinking of
> caching this info for every struct type in some persistent
> structure.

In the front end, this is CLASSTYPE_HAS_MUTABLE, or cp_has_mutable_p.

