[PATCH] minor FDO profile related fixes

Martin Liška mliska@suse.cz
Mon Nov 12 09:49:00 GMT 2018


On 11/8/18 1:49 AM, Indu Bhagat wrote:
> I have been looking at -fdump-ipa-profile dump with an intention to sanitize
> bits of information so that one may use it to judge the "quality of a profile"
> in FDO.
> 
> The overall question I want to address is - are there ways to know which
> functions were not run in the training run, i.e. have ZERO profile ?
> (This patch corrects some dumped info; in a subsequent patch I would like to add
> some more explicit information/diagnostics.)
> 
> Towards that end, I noticed that there are a couple of misleading bits of
> information (so I think) in the symbol table dump listing all functions in the
> compilation unit :
>    --- "globally 0" appears even when profile data has not been fed by feedback
>         profile (not the intent as the documentation of profile_guessed_global0
>          in profile-count.h suggests).
>    --- "unlikely_executed" should appear only when there is profile feedback or
>        a function attribute is specified (as per documentation of
>        node_frequency in coretypes.h). "unlikely_executed" in case of STALE or
>        NO profile is misleading in my opinion.
> 
> Summary of changes :
> 
> 1. This patch makes some adjustments around how x_profile_status of a function
> is set - x_profile_status should be set to PROFILE_READ only when there is a
> profile for a function read from the .gcda file. So, instead of relying on
> profile_info (set whenever the gcda feedback file is present, even if the
> function does not have a profile available in the file), use exec_counts
> (non null when function has a profile (the latter may or may not be zero)). In
> essence, x_profile_status and profile_count::m_quality
> are set consistent to the stated intent (in code comments.)
> 
> 2. A minor change in coverage.c is for more precise location of the message
> 
> Following -fdump-ipa-profile dump excerpts show the effect :
> 
> ------------------------------------------------
>      -O1, -O2, -O3
> ------------------------------------------------
> 
> 0. APPLICABLE PROFILE
> Trunk : Function flags: count:224114269 body hot
> After Patch : Function flags: count:224114269 (precise) body hot
> 
> 1. STALE PROFILE
> (i.e., those cases covered by Wcoverage-mismatch; when control flow changes
>  between profile-generate and profile-use)
> Trunk : Function flags: count:224114269 body hot
> After Patch : Function flags: count:224114269 (precise) body hot
> 
> 2. NO PROFILE
> (i.e., those cases covered by Wmissing-profile; when function has no profile
>  available in the .gcda file)
> Trunk (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
> Trunk (missing function) : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
> After Patch (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
> After Patch (missing function) : Function flags: count:1073741824 (estimated locally) body
> 
> 3. ZERO PROFILE (functions not run in training run)
> Trunk : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
> After Patch (remains the same) : count: 1073741824 (estimated locally, globally 0) body unlikely_executed
> 
> --------------------------------------------------
>     O0
> --------------------------------------------------
> In O0, flag_guess_branch_prob is not set. This makes the profile_quality set to
> (precise) for most of the above cases.
> 
> 0. APPLICABLE PROFILE
> Trunk : Function flags: count:224114269 body hot
> After Patch : Function flags: count:224114269 (precise) body hot
> 
> 1. STALE PROFILE
> (i.e., those cases covered by Wcoverage-mismatch; when control flow changes
>  between profile-generate and profile-use)
> Trunk : Function flags: count:224114269 body hot
> After Patch : Function flags: count:224114269 (precise) body hot
> 
> 2. NO PROFILE
> (i.e., those cases covered by Wmissing-profile; when function has no profile
>  available in the .gcda file)
> Trunk (missing file) : Function flags: body
> Trunk (missing function) : Function flags: count:0 body unlikely_executed
> After Patch (missing file) :  Function flags: body
> *** After Patch (missing function) : Function flags: count:0 (precise) body
> (*** This remains misleading, and I do not have a solution for this; as use of heuristics
>  to guess branch probability is not allowed in O0)
> 
> 3. ZERO PROFILE (functions not run in training run)
> Trunk : Function flags: count:0 body unlikely_executed
> After Patch : Function flags: count:0 (precise) body
> 
> --------------------------------------------------
> 
> make check-gcc on x86_64 shows no new failures.
> 
> (A related PR was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957 where we added diagnostics for the NO PROFILE case.)

Hi.

Thanks for the patch. I'm not a maintainer, but the idea of the patch looks correct to me.
One question about adding "(precise)", have you verified that the patch can survive regression
tests?

Thanks,
Martin

> 
> 
> precise-ipa-dump-optinfo.patch.ver1
> 
> diff --git a/gcc/coverage.c b/gcc/coverage.c
> index 599a3bb..7595e6c 100644
> --- a/gcc/coverage.c
> +++ b/gcc/coverage.c
> @@ -358,7 +358,7 @@ get_coverage_counts (unsigned counter, unsigned cfg_checksum,
>        if (warning_printed && dump_enabled_p ())
>  	{
>  	  dump_user_location_t loc
> -	    = dump_user_location_t::from_location_t (input_location);
> +	    = dump_user_location_t::from_function_decl (current_function_decl);
>            dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
>                             "use -Wno-error=coverage-mismatch to tolerate "
>                             "the mismatch but performance may drop if the "
> diff --git a/gcc/profile-count.c b/gcc/profile-count.c
> index f4ab244..90f4feb 100644
> --- a/gcc/profile-count.c
> +++ b/gcc/profile-count.c
> @@ -83,6 +83,8 @@ profile_count::dump (FILE *f) const
>  	fprintf (f, " (auto FDO)");
>        else if (m_quality == profile_guessed)
>  	fprintf (f, " (guessed)");
> +      else if (m_quality == profile_precise)
> +	fprintf (f, " (precise)");
>      }
>  }
>  
> diff --git a/gcc/profile.c b/gcc/profile.c
> index 2130319..57e3f3c 100644
> --- a/gcc/profile.c
> +++ b/gcc/profile.c
> @@ -698,6 +698,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
>  	}
>      }
>  
> +  if (exec_counts)
> +    {
> +      profile_status_for_fn (cfun) = PROFILE_READ;
> +    }
> +
>    /* If we have real data, use them!  */
>    if (bb_gcov_count (ENTRY_BLOCK_PTR_FOR_FN (cfun))
>        || !flag_guess_branch_prob)
> @@ -705,7 +710,7 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
>        bb->count = profile_count::from_gcov_type (bb_gcov_count (bb));
>    /* If function was not trained, preserve local estimates including statically
>       determined zero counts.  */
> -  else
> +  else if (profile_status_for_fn (cfun) == PROFILE_READ)
>      FOR_ALL_BB_FN (bb, cfun)
>        if (!(bb->count == profile_count::zero ()))
>          bb->count = bb->count.global0 ();
> @@ -718,6 +723,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
>  
>    if (dump_file)
>      {
> +      fprintf (dump_file, " Profile feedback for function");
> +      fprintf (dump_file, ((profile_status_for_fn (cfun) == PROFILE_READ)
> +			   ? " is available \n"
> +			   : " is not available \n"));
> +
>        fprintf (dump_file, "%d branches\n", num_branches);
>        if (num_branches)
>  	for (i = 0; i < 10; i++)
> @@ -1317,12 +1327,12 @@ branch_prob (void)
>    values.release ();
>    free_edge_list (el);
>    coverage_end_function (lineno_checksum, cfg_checksum);
> -  if (flag_branch_probabilities && profile_info)
> +  if (flag_branch_probabilities
> +      && (profile_status_for_fn (cfun) == PROFILE_READ))
>      {
>        struct loop *loop;
>        if (dump_file && (dump_flags & TDF_DETAILS))
>  	report_predictor_hitrates ();
> -      profile_status_for_fn (cfun) = PROFILE_READ;
>  
>        /* At this moment we have precise loop iteration count estimates.
>  	 Record them to loop structure before the profile gets out of date. */
> 



More information about the Gcc-patches mailing list