[PATCH] add -Wmismatched-tags (PR 61339)
Jason Merrill
jason@redhat.com
Wed Dec 4 23:37:00 GMT 2019
On 12/3/19 4:49 PM, Martin Sebor wrote:
> On 8/5/19 4:30 PM, Jason Merrill wrote:
>> On Mon, Aug 5, 2019 at 5:50 PM Martin Sebor <msebor@gmail.com> wrote:
>>>
>>> On 8/5/19 1:25 PM, Jason Merrill wrote:
>>>> On 8/1/19 7:35 PM, Martin Sebor wrote:
>>>>> On 8/1/19 12:09 PM, Jason Merrill wrote:
>>>>>> On 7/22/19 12:34 PM, Martin Sebor wrote:
>>>>>>> Ping: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg00622.html
> ...
>>>>>> -Wmismatched-tags is useful to have, given the MSVC/clang situation,
>>>>>> but I wonder about memory consumption from all the record keeping.
>>>>>> Do you have any overhead measurements?
>>>>>
>>>>> I did think about the overhead but not knowing if the patch would
>>>>> even be considered I didn't spend time optimizing it.
>>>>>
>>>>> In an instrumented build of GCC with the patch I just did I have
>>>>> collected the following stats for the number of elements in
>>>>> the rec2loc hash table (for 998 files):
>>>>>
>>>>> rec2loc elements locvec elements
>>>>> min: 0 0
>>>>> max: 2,785 3,303
>>>>> mean: 537 1,043
>>>>> median: 526 1,080
>>>>>
>>>>> The locvec data are based on totals for the whole hash table, so
>>>>> in the worst case the hash_map has 2,785 elements, and the sum of
>>>>> all locvec lengths in all those elements is 3,303. Each rec2loc
>>>>> element takes up 16 bytes, plus the size of the locvec data.
>>>>>
>>>>> If my math is right (which doesn't happen too often) in the worst
>>>>> case the bookkeeping overhead is 43KB for the hash_map plus on
>>>>> the order of 140KB for the vectors (just doubling the number of
>>>>> elements to account for capacity = 2 X size, times 24 for
>>>>> the flag_func_loc_t element type). So 183K in the worst case
>>>>> in GCC.
>>>>
>>>> 183k cumulative over all the GCC sources doesn't sound excessive,
>>>> but...
>>>>
>>>>> There are a few ways to reduce that if it seems excessive.
>>>>>
>>>>> One is by avoiding some waste in flag_func_loc_t which is
>>>>>
>>>>> pair<tag_types, pair<bool, pair<tree, location_t>>>
>>>>>
>>>>> tree could come first and tag_types and the bool could share
>>>>> space. That should bring it down to 16 in LP64, for about
>>>>> 30% off the 183K.
>>>>>
>>>>> Beyond that, we could change the algorithm to discard records
>>>>> for prior declarations after the first definition has been seen
>>>>> (and any mismatches diagnosed).
>>>>>
>>>>> We could also only collect one record for each definition
>>>>> in system headers rather than one for every declaration and
>>>>> reference.
>>>>
>>>> ...these all sound worthwhile.
>>>
>>> Okay, I'll look into it.
>>>
>>> To be clear: it was 183K maximum for a single compilation, not
>>> aggregate for all of them.
>>
>> Aha. Thanks.
>
> Attached is a revised patch that implements just -Wmismatched-tags
> without the other two warnings (-Wclass-not-pod and -Wstruct-is-pod).
> It also implements the optimizations mentioned above. To make it
> easier to do the tag cleanup by simply dropping the class-key when
> it isn't necessary (suggested during the review of the cleanup patch)
> I added another warning: -Wredundant-tags to point out instances
> where the class-key or enum-key can safely be dropped. Both warnings
> are off by default.
>
> With the optimizations in place, the biggest space overhead of
> using the option I measured in a GCC build was 990 elements of
> the record_to_locs_t hash table, plus 2756 elements of the locvec
> vector. In LP64, each record_to_locs_t element type is 16 bytes
> and each element of the locvec vector is 24 bytes, so the maximum
> space overhead is on the order of 80K. The average overhead per
> GCC translation unit was about 30K.
>
> The patch depends on fixes for a few bugs in GCC hash_tables (PR
> 92761 and 92762). I will post those separately.
>
> Martin
>
> PS Independently of this patch I will propose updating the GCC
> Coding Conventions to remove the guideline to use the struct
> class-key for PODs and class for non-PODs:
> https://gcc.gnu.org/codingconventions.html#Class_Use
> +class rec_decl_loc_t
Let's use "class" instead of "record" generally in this patch.
> +/* A mapping between a TYPE_DECL for a class and the rec_decl_loc_t
> + description above. */
> +typedef hash_map<tree, rec_decl_loc_t> record_to_locs_t;
You might want to use hash_map<tree_decl_hash, rec_decl_loc_t> so you
get the decl-specific hash function rather than the generic pointer hash.
It's hard to distinguish between this type and the previous one by name;
this one should probably have "map" in its name.
> +static GTY (()) record_to_locs_t *rec2loc;
...
> + rec2loc = new record_to_locs_t ();
If this isn't GC-allocated, marking it with GTY(()) seems wrong. How do
you imagine this warning interacting with PCH?
> + /* A prior declaration of TYPE_DECL has been seen. */
The rest of this function from this point on seems like it would be
better as a member function of rec_decl_loc_t.
> +static void
> +diag_mismatched_tags (tree type_decl, rec_decl_loc_t &recloc)
This could also be a member function.
> + if (rdl->idxdef < UINT_MAX && rdl->def_class_key == class_key)
Maybe != rather than < ?
> +@item -Wmismatched-tags @r{(C++ and Objective-C++ only)}
The documentation of this flag should mention that the main reason
someone might want to use it is for compatibility with MSVC++, where
struct and class are improperly mangled differently.
Jason
More information about the Gcc-patches
mailing list