[PATCH] minor FDO profile related fixes
Indu Bhagat
indu.bhagat@oracle.com
Thu Nov 8 00:41:00 GMT 2018
I have been looking at -fdump-ipa-profile dump with an intention to sanitize
bits of information so that one may use it to judge the "quality of a profile"
in FDO.
The overall question I want to address is - are there ways to know which
functions were not run in the training run, i.e. have ZERO profile ?
(This patch corrects some dumped info; in a subsequent patch I would like to add
some more explicit information/diagnostics.)
Towards that end, I noticed that there are a couple of misleading bits of
information (so I think) in the symbol table dump listing all functions in the
compilation unit :
--- "globally 0" appears even when profile data has not been fed by feedback
profile (not the intent as the documentation of profile_guessed_global0
in profile-count.h suggests).
--- "unlikely_executed" should appear only when there is profile feedback or
a function attribute is specified (as per documentation of
node_frequency in coretypes.h). "unlikely_executed" in case of STALE or
NO profile is misleading in my opinion.
Summary of changes :
1. This patch makes some adjustments around how x_profile_status of a function
is set - x_profile_status should be set to PROFILE_READ only when there is a
profile for a function read from the .gcda file. So, instead of relying on
profile_info (set whenever the gcda feedback file is present, even if the
function does not have a profile available in the file), use exec_counts
(non null when function has a profile (the latter may or may not be zero)). In
essence, x_profile_status and profile_count::m_quality
are set consistent to the stated intent (in code comments.)
2. A minor change in coverage.c is for more precise location of the message
Following -fdump-ipa-profile dump excerpts show the effect :
------------------------------------------------
-O1, -O2, -O3
------------------------------------------------
0. APPLICABLE PROFILE
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot
1. STALE PROFILE
(i.e., those cases covered by Wcoverage-mismatch; when control flow changes
between profile-generate and profile-use)
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot
2. NO PROFILE
(i.e., those cases covered by Wmissing-profile; when function has no profile
available in the .gcda file)
Trunk (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
Trunk (missing function) : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
After Patch (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
After Patch (missing function) : Function flags: count:1073741824 (estimated locally) body
3. ZERO PROFILE (functions not run in training run)
Trunk : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
After Patch (remains the same) : count: 1073741824 (estimated locally, globally 0) body unlikely_executed
--------------------------------------------------
O0
--------------------------------------------------
In O0, flag_guess_branch_prob is not set. This makes the profile_quality set to
(precise) for most of the above cases.
0. APPLICABLE PROFILE
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot
1. STALE PROFILE
(i.e., those cases covered by Wcoverage-mismatch; when control flow changes
between profile-generate and profile-use)
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot
2. NO PROFILE
(i.e., those cases covered by Wmissing-profile; when function has no profile
available in the .gcda file)
Trunk (missing file) : Function flags: body
Trunk (missing function) : Function flags: count:0 body unlikely_executed
After Patch (missing file) : Function flags: body
*** After Patch (missing function) : Function flags: count:0 (precise) body
(*** This remains misleading, and I do not have a solution for this; as use of heuristics
to guess branch probability is not allowed in O0)
3. ZERO PROFILE (functions not run in training run)
Trunk : Function flags: count:0 body unlikely_executed
After Patch : Function flags: count:0 (precise) body
--------------------------------------------------
make check-gcc on x86_64 shows no new failures.
(A related PR was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957 where we added diagnostics for the NO PROFILE case.)
-------------- next part --------------
diff --git a/gcc/coverage.c b/gcc/coverage.c
index 599a3bb..7595e6c 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -358,7 +358,7 @@ get_coverage_counts (unsigned counter, unsigned cfg_checksum,
if (warning_printed && dump_enabled_p ())
{
dump_user_location_t loc
- = dump_user_location_t::from_location_t (input_location);
+ = dump_user_location_t::from_function_decl (current_function_decl);
dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
"use -Wno-error=coverage-mismatch to tolerate "
"the mismatch but performance may drop if the "
diff --git a/gcc/profile-count.c b/gcc/profile-count.c
index f4ab244..90f4feb 100644
--- a/gcc/profile-count.c
+++ b/gcc/profile-count.c
@@ -83,6 +83,8 @@ profile_count::dump (FILE *f) const
fprintf (f, " (auto FDO)");
else if (m_quality == profile_guessed)
fprintf (f, " (guessed)");
+ else if (m_quality == profile_precise)
+ fprintf (f, " (precise)");
}
}
diff --git a/gcc/profile.c b/gcc/profile.c
index 2130319..57e3f3c 100644
--- a/gcc/profile.c
+++ b/gcc/profile.c
@@ -698,6 +698,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
}
}
+ if (exec_counts)
+ {
+ profile_status_for_fn (cfun) = PROFILE_READ;
+ }
+
/* If we have real data, use them! */
if (bb_gcov_count (ENTRY_BLOCK_PTR_FOR_FN (cfun))
|| !flag_guess_branch_prob)
@@ -705,7 +710,7 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
bb->count = profile_count::from_gcov_type (bb_gcov_count (bb));
/* If function was not trained, preserve local estimates including statically
determined zero counts. */
- else
+ else if (profile_status_for_fn (cfun) == PROFILE_READ)
FOR_ALL_BB_FN (bb, cfun)
if (!(bb->count == profile_count::zero ()))
bb->count = bb->count.global0 ();
@@ -718,6 +723,11 @@ compute_branch_probabilities (unsigned cfg_checksum, unsigned lineno_checksum)
if (dump_file)
{
+ fprintf (dump_file, " Profile feedback for function");
+ fprintf (dump_file, ((profile_status_for_fn (cfun) == PROFILE_READ)
+ ? " is available \n"
+ : " is not available \n"));
+
fprintf (dump_file, "%d branches\n", num_branches);
if (num_branches)
for (i = 0; i < 10; i++)
@@ -1317,12 +1327,12 @@ branch_prob (void)
values.release ();
free_edge_list (el);
coverage_end_function (lineno_checksum, cfg_checksum);
- if (flag_branch_probabilities && profile_info)
+ if (flag_branch_probabilities
+ && (profile_status_for_fn (cfun) == PROFILE_READ))
{
struct loop *loop;
if (dump_file && (dump_flags & TDF_DETAILS))
report_predictor_hitrates ();
- profile_status_for_fn (cfun) = PROFILE_READ;
/* At this moment we have precise loop iteration count estimates.
Record them to loop structure before the profile gets out of date. */
More information about the Gcc-patches
mailing list