[PATCH] gcov: make profile merging smarter
Richard Biener
richard.guenther@gmail.com
Mon Oct 4 11:16:15 GMT 2021
On Fri, Oct 1, 2021 at 12:53 PM Martin Liška <mliska@suse.cz> wrote:
>
> On 10/1/21 12:17, Richard Biener wrote:
> > On Fri, Oct 1, 2021 at 11:55 AM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> Support merging of profiles that are built from a different .o files
> >> but belong to the same source file. Moreover, a checksum is verified
> >> during profile merging and so we can safely combine such profile.
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >> I'm going to install the patch if there are no comments about it.
> >
> > Is the ->stamp field now unused?
>
> Yes, it's still used for pairing of .gcda and .gcno files when --coverage is used.
>
> >
> > I wonder whether crc32 is good enough or whether we need to enable
>
> Dunno. We can alternatively use a stronger hashing function, maybe inchash?
>
> > -fprofile-correction by default for example?
>
> Probably not.
>
> > But -fprofile-correction
> > is about -fprofile-use, not profile merging, right? What does the latter
> > do upon mismatch?
>
> It prints the 'libgcov profiling error'.
>
> >
> > Alternatively would it be possible to keep multiple sets of data in the file,
> > one for each 'stamp'?
>
> Yes, we can theoretically do it, but I'm not planning working on that now.
>
> > How does the profile-use step figure a mismatch
> > here, or does it not care whether it mixes input with 'different stamp'?
>
> Based on function id, it does verification for cfg_checksum and lineno_checksum.
>
> >
> > The current behavior is also somewhat odd:
> >
> >> ./a.out
> > libgcov profiling error:/tmp/t.gcda:overwriting an existing profile
> > data with a different timestamp
> >
> > but it should probably say 'warning' since it indeed simply overwrites data
> > instead of merging it.
>
> Yes, I can prepare an incremental patch for it.
>
> >
> > I wonder if we can simply drop the stamp check alltogether and make
> > the checks that match up the data against the number of counters behave
> > as to warning and overwriting the data instead of failing fatally? The
> > actual merging of data would need to be delayed, but at least until
> > the first actual merge was done we could simply proceed?
>
> Do you mean doing only merging of functions that have correct checksum, and bailing
> out for the functions that don't?
I meant in merge_one_data do not check ->stamp or ->checksum but instead rely
on the counter merging code to detect mismatches (there's read_mismatch and
read_error). There's multiple things we can do when we run into those:
- when we did not actually merged any counter yet we could issue the
warning as before and drop the old data on the floor
- when we _did_ merge some counters already we could hard-error
(I suppose we can't roll-back merging that took place already)
- we could do the merging two-stage, first see whether the data matches
and only if it did perform the merging
Note that all of the changes (including yours) have user-visible effects and
the behavior is somewhat unobvious. Not merging when the object was
re-built is indeed the most obvious behavior so I'm not sure it's a good
idea. A new env variable to say whether to simply keep the _old_ data
when merging in new data isn't possible would be another "fix" I guess?
Richard.
>
> Thanks for the ideas.
>
> Martin
>
> >
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> PR gcov-profile/90364
> >>
> >> gcc/ChangeLog:
> >>
> >> * coverage.c (build_info): Emit checksum to the global variable.
> >> (build_info_type): Add new field for checksum.
> >> (coverage_obj_finish): Pass object_checksum.
> >> (coverage_init): Use 0 as checksum for .gcno files.
> >> * gcov-dump.c (dump_gcov_file): Dump also new checksum field.
> >> * gcov.c (read_graph_file): Read also checksum.
> >>
> >> libgcc/ChangeLog:
> >>
> >> * libgcov-driver.c (merge_one_data): Skip timestamp and verify
> >> checksums.
> >> (write_one_data): Write also checksum.
> >> * libgcov-util.c (read_gcda_file): Read also checksum field.
> >> * libgcov.h (struct gcov_info): Add new field.
> >> ---
> >> gcc/coverage.c | 50 ++++++++++++++++++++++++++++-------------
> >> gcc/gcov-dump.c | 9 ++++----
> >> gcc/gcov.c | 5 +++++
> >> libgcc/libgcov-driver.c | 8 +++++--
> >> libgcc/libgcov-util.c | 3 +++
> >> libgcc/libgcov.h | 1 +
> >> 6 files changed, 54 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/gcc/coverage.c b/gcc/coverage.c
> >> index 10d7f8366cb..4467f1eaa5c 100644
> >> --- a/gcc/coverage.c
> >> +++ b/gcc/coverage.c
> >> @@ -129,16 +129,7 @@ static const char *const ctr_names[GCOV_COUNTERS] = {
> >> #undef DEF_GCOV_COUNTER
> >>
> >> /* Forward declarations. */
> >> -static void read_counts_file (void);
> >> static tree build_var (tree, tree, int);
> >> -static void build_fn_info_type (tree, unsigned, tree);
> >> -static void build_info_type (tree, tree);
> >> -static tree build_fn_info (const struct coverage_data *, tree, tree);
> >> -static tree build_info (tree, tree);
> >> -static bool coverage_obj_init (void);
> >> -static vec<constructor_elt, va_gc> *coverage_obj_fn
> >> -(vec<constructor_elt, va_gc> *, tree, struct coverage_data const *);
> >> -static void coverage_obj_finish (vec<constructor_elt, va_gc> *);
> >>
> >> /* Return the type node for gcov_type. */
> >>
> >> @@ -218,6 +209,9 @@ read_counts_file (void)
> >> tag = gcov_read_unsigned ();
> >> bbg_file_stamp = crc32_unsigned (bbg_file_stamp, tag);
> >>
> >> + /* Read checksum. */
> >> + gcov_read_unsigned ();
> >> +
> >> counts_hash = new hash_table<counts_entry> (10);
> >> while ((tag = gcov_read_unsigned ()))
> >> {
> >> @@ -935,6 +929,12 @@ build_info_type (tree type, tree fn_info_ptr_type)
> >> DECL_CHAIN (field) = fields;
> >> fields = field;
> >>
> >> + /* Checksum. */
> >> + field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
> >> + get_gcov_unsigned_t ());
> >> + DECL_CHAIN (field) = fields;
> >> + fields = field;
> >> +
> >> /* Filename */
> >> field = build_decl (BUILTINS_LOCATION, FIELD_DECL, NULL_TREE,
> >> build_pointer_type (build_qualified_type
> >> @@ -977,7 +977,7 @@ build_info_type (tree type, tree fn_info_ptr_type)
> >> function info objects. */
> >>
> >> static tree
> >> -build_info (tree info_type, tree fn_ary)
> >> +build_info (tree info_type, tree fn_ary, unsigned object_checksum)
> >> {
> >> tree info_fields = TYPE_FIELDS (info_type);
> >> tree merge_fn_type, n_funcs;
> >> @@ -996,13 +996,19 @@ build_info (tree info_type, tree fn_ary)
> >> /* next -- NULL */
> >> CONSTRUCTOR_APPEND_ELT (v1, info_fields, null_pointer_node);
> >> info_fields = DECL_CHAIN (info_fields);
> >> -
> >> +
> >> /* stamp */
> >> CONSTRUCTOR_APPEND_ELT (v1, info_fields,
> >> build_int_cstu (TREE_TYPE (info_fields),
> >> bbg_file_stamp));
> >> info_fields = DECL_CHAIN (info_fields);
> >>
> >> + /* Checksum. */
> >> + CONSTRUCTOR_APPEND_ELT (v1, info_fields,
> >> + build_int_cstu (TREE_TYPE (info_fields),
> >> + object_checksum));
> >> + info_fields = DECL_CHAIN (info_fields);
> >> +
> >> /* Filename */
> >> da_file_name_len = strlen (da_file_name);
> >> filename_string = build_string (da_file_name_len + 1, da_file_name);
> >> @@ -1214,7 +1220,8 @@ coverage_obj_fn (vec<constructor_elt, va_gc> *ctor, tree fn,
> >> function objects from CTOR. Generate the gcov_info initializer. */
> >>
> >> static void
> >> -coverage_obj_finish (vec<constructor_elt, va_gc> *ctor)
> >> +coverage_obj_finish (vec<constructor_elt, va_gc> *ctor,
> >> + unsigned object_checksum)
> >> {
> >> unsigned n_functions = vec_safe_length (ctor);
> >> tree fn_info_ary_type = build_array_type
> >> @@ -1231,7 +1238,7 @@ coverage_obj_finish (vec<constructor_elt, va_gc> *ctor)
> >> varpool_node::finalize_decl (fn_info_ary);
> >>
> >> DECL_INITIAL (gcov_info_var)
> >> - = build_info (TREE_TYPE (gcov_info_var), fn_info_ary);
> >> + = build_info (TREE_TYPE (gcov_info_var), fn_info_ary, object_checksum);
> >> varpool_node::finalize_decl (gcov_info_var);
> >> }
> >>
> >> @@ -1300,7 +1307,6 @@ coverage_init (const char *filename)
> >> strcpy (da_file_name + prefix_len + len, GCOV_DATA_SUFFIX);
> >>
> >> bbg_file_stamp = local_tick;
> >> -
> >> if (flag_auto_profile)
> >> read_autofdo_file ();
> >> else if (flag_branch_probabilities)
> >> @@ -1328,6 +1334,8 @@ coverage_init (const char *filename)
> >> gcov_write_unsigned (GCOV_NOTE_MAGIC);
> >> gcov_write_unsigned (GCOV_VERSION);
> >> gcov_write_unsigned (bbg_file_stamp);
> >> + /* Use an arbitrary checksum */
> >> + gcov_write_unsigned (0);
> >> gcov_write_string (getpwd ());
> >>
> >> /* Do not support has_unexecuted_blocks for Ada. */
> >> @@ -1353,14 +1361,24 @@ coverage_finish (void)
> >> cannot uniquely stamp it. If we can stamp it, libgcov will DTRT. */
> >> unlink (da_file_name);
> >>
> >> + /* Global GCDA checksum that aggregates all functions. */
> >> + unsigned object_checksum = 0;
> >> +
> >> if (coverage_obj_init ())
> >> {
> >> vec<constructor_elt, va_gc> *fn_ctor = NULL;
> >> struct coverage_data *fn;
> >>
> >> for (fn = functions_head; fn; fn = fn->next)
> >> - fn_ctor = coverage_obj_fn (fn_ctor, fn->fn_decl, fn);
> >> - coverage_obj_finish (fn_ctor);
> >> + {
> >> + fn_ctor = coverage_obj_fn (fn_ctor, fn->fn_decl, fn);
> >> +
> >> + object_checksum = crc32_unsigned (object_checksum, fn->ident);
> >> + object_checksum = crc32_unsigned (object_checksum,
> >> + fn->lineno_checksum);
> >> + object_checksum = crc32_unsigned (object_checksum, fn->cfg_checksum);
> >> + }
> >> + coverage_obj_finish (fn_ctor, object_checksum);
> >> }
> >>
> >> XDELETEVEC (da_file_name);
> >> diff --git a/gcc/gcov-dump.c b/gcc/gcov-dump.c
> >> index f1489fe0214..bfaf735d2ff 100644
> >> --- a/gcc/gcov-dump.c
> >> +++ b/gcc/gcov-dump.c
> >> @@ -206,11 +206,12 @@ dump_gcov_file (const char *filename)
> >> }
> >>
> >> /* stamp */
> >> - {
> >> - unsigned stamp = gcov_read_unsigned ();
> >> + unsigned stamp = gcov_read_unsigned ();
> >> + printf ("%s:stamp %lu\n", filename, (unsigned long)stamp);
> >>
> >> - printf ("%s:stamp %lu\n", filename, (unsigned long)stamp);
> >> - }
> >> + /* Checksum */
> >> + unsigned checksum = gcov_read_unsigned ();
> >> + printf ("%s:checksum %lu\n", filename, (unsigned long)checksum);
> >>
> >> if (!is_data_type)
> >> {
> >> diff --git a/gcc/gcov.c b/gcc/gcov.c
> >> index cf0a49d8c30..829e955a63b 100644
> >> --- a/gcc/gcov.c
> >> +++ b/gcc/gcov.c
> >> @@ -1814,6 +1814,8 @@ read_graph_file (void)
> >> bbg_file_name, v, e);
> >> }
> >> bbg_stamp = gcov_read_unsigned ();
> >> + /* Read checksum. */
> >> + gcov_read_unsigned ();
> >> bbg_cwd = xstrdup (gcov_read_string ());
> >> bbg_supports_has_unexecuted_blocks = gcov_read_unsigned ();
> >>
> >> @@ -2031,6 +2033,9 @@ read_count_file (void)
> >> goto cleanup;
> >> }
> >>
> >> + /* Read checksum. */
> >> + gcov_read_unsigned ();
> >> +
> >> while ((tag = gcov_read_unsigned ()))
> >> {
> >> unsigned length = gcov_read_unsigned ();
> >> diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
> >> index 087f71e0107..7aa97bbb06a 100644
> >> --- a/libgcc/libgcov-driver.c
> >> +++ b/libgcc/libgcov-driver.c
> >> @@ -260,12 +260,15 @@ merge_one_data (const char *filename,
> >> if (!gcov_version (gi_ptr, length, filename))
> >> return -1;
> >>
> >> + /* Skip timestamp. */
> >> + gcov_read_unsigned ();
> >> +
> >> length = gcov_read_unsigned ();
> >> - if (length != gi_ptr->stamp)
> >> + if (length != gi_ptr->checksum)
> >> {
> >> /* Read from a different compilation. Overwrite the file. */
> >> gcov_error (GCOV_PROF_PREFIX "overwriting an existing profile data "
> >> - "with a different timestamp\n", filename);
> >> + "with a different checksum\n", filename);
> >> return 0;
> >> }
> >>
> >> @@ -495,6 +498,7 @@ write_one_data (const struct gcov_info *gi_ptr,
> >> dump_unsigned (GCOV_DATA_MAGIC, dump_fn, arg);
> >> dump_unsigned (GCOV_VERSION, dump_fn, arg);
> >> dump_unsigned (gi_ptr->stamp, dump_fn, arg);
> >> + dump_unsigned (gi_ptr->checksum, dump_fn, arg);
> >>
> >> #ifdef NEED_L_GCOV
> >> /* Generate whole program statistics. */
> >> diff --git a/libgcc/libgcov-util.c b/libgcc/libgcov-util.c
> >> index bc7416d99bf..766ca3559c4 100644
> >> --- a/libgcc/libgcov-util.c
> >> +++ b/libgcc/libgcov-util.c
> >> @@ -310,6 +310,9 @@ read_gcda_file (const char *filename)
> >> /* Read stamp. */
> >> obj_info->stamp = gcov_read_unsigned ();
> >>
> >> + /* Read checksum. */
> >> + obj_info->checksum = gcov_read_unsigned ();
> >> +
> >> while (1)
> >> {
> >> gcov_position_t base;
> >> diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
> >> index f6354a7a070..2a365c95759 100644
> >> --- a/libgcc/libgcov.h
> >> +++ b/libgcc/libgcov.h
> >> @@ -230,6 +230,7 @@ struct gcov_info
> >> struct gcov_info *next; /* link to next, used by libgcov */
> >>
> >> gcov_unsigned_t stamp; /* uniquifying time stamp */
> >> + gcov_unsigned_t checksum; /* unique object checksum */
> >> const char *filename; /* output file name */
> >>
> >> gcov_merge_fn merge[GCOV_COUNTERS]; /* merge functions (null for
> >> --
> >> 2.33.0
> >>
>
More information about the Gcc-patches
mailing list