This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [google] Patch to add compiler flag to dump callgraph edge profiles in special .note sections (issue4591045)


ok for google/main.

David

On Wed, Jun 8, 2011 at 9:13 AM, Sriraman Tallam <tmsriram@google.com> wrote:
> +davidxl
>
> On Tue, Jun 7, 2011 at 7:05 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Patch Description:
>> =================
>>
>> I am working on a project to do global function layout in the linker. The linker reads the callgraph edge profile information generated by FDO and uses it to find an ordering of functions that places functions calling each other frequently closer together, like the Pettis-Hansen code ordering algorithm described in the paper "Profile Guided Code Positioning" (PLDI 1990).
>>
>> This patch adds a flag that allows the callgraph edge profile information to be stored in .note sections named ".note.callgraph.text". The new compiler flag -fcallgraph-profiles-sections generates these sections and must be used along with -fprofile-use. I have added a PARAM to only output callgraph edges whose count is greater than a specified threshold. Once this is available, the linker can read these sections and generate a global callgraph which can be used to determine a global function ordering.
>>
>> I am adding plugin support in the gold linker to let linker plugins read the contents of sections, and also adding plugin hooks to specify a desired ordering of functions to the linker. The linker patch is available here: http://sourceware.org/ml/binutils/2011-03/msg00043.html. Once this is available, linker plugins can determine the function layout of the final binary, for example with the Pettis-Hansen algorithm.
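
The layout step a plugin might perform can be sketched roughly as follows. This is not the linker plugin from the patch above, and it is not the full Pettis-Hansen algorithm (which also chooses how to orient chains when joining them); it is only a simplified greedy chain merge under assumed data structures. The names `struct edge`, `layout_functions`, and `MAX_FUNCS` are hypothetical, and functions are assumed to be identified by small integer indices.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_FUNCS 64            /* hypothetical limit for this sketch */

/* Hypothetical edge record: caller/callee are function indices.  */
struct edge { int caller, callee; long long count; };

/* Sort edges by decreasing call count.  */
static int
cmp_desc (const void *a, const void *b)
{
  const struct edge *x = a, *y = b;
  return (x->count < y->count) - (x->count > y->count);
}

/* Greedy chain merge: visit edges from hottest to coldest and append
   the callee's chain to the caller's chain when they differ.  A chain
   is identified by the index of its first function.  Writes the final
   order to ORDER and returns the number of functions placed.  */
static int
layout_functions (struct edge *edges, int n_edges, int n_funcs, int *order)
{
  int chain[MAX_FUNCS], next[MAX_FUNCS], tail[MAX_FUNCS];
  int i, f, k = 0;

  assert (n_funcs <= MAX_FUNCS);
  for (f = 0; f < n_funcs; f++)
    {
      chain[f] = f;             /* every function starts alone */
      next[f] = -1;
      tail[f] = f;
    }

  qsort (edges, n_edges, sizeof *edges, cmp_desc);
  for (i = 0; i < n_edges; i++)
    {
      int a = chain[edges[i].caller];
      int b = chain[edges[i].callee];
      if (a == b)
        continue;               /* already in the same chain */
      next[tail[a]] = b;        /* link chain b after chain a */
      tail[a] = tail[b];
      for (f = b; f != -1; f = next[f])
        chain[f] = a;           /* relabel members of chain b */
    }

  /* Emit each chain, walking from its head.  */
  for (f = 0; f < n_funcs; f++)
    if (chain[f] == f)
      for (i = f; i != -1; i = next[i])
        order[k++] = i;
  return k;
}
```

With four functions and edges 0->3 (100 calls), 3->1 (50 calls) and 0->2 (5 calls), this sketch places the functions in the order 0, 3, 1, 2, so the hottest caller/callee pairs end up adjacent.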
>>
>> Example: The new .note.callgraph.text section looks like this for a function foo that calls bar 100 times and zap 50 times:
>> ****************************
>> .section        .note.callgraph.text._Z3foov,"",@progbits
>>        .string "Function _Z3foov"
>>        .string "_Z3barv"
>>        .string "100"
>>        .string "_Z3zapv"
>>        .string "50"
>> ***************************
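
A consumer of such a section could decode it along these lines. This is a minimal sketch, assuming the section contents are exactly the NUL-terminated strings emitted by the `.string` directives above: one "Function <name>" header followed by (callee, count) pairs. The names `struct cg_edge` and `parse_callgraph_note` are hypothetical and are not part of the patch or of any linker API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one outgoing call graph edge.  */
struct cg_edge
{
  const char *callee;   /* points into the section buffer */
  long long count;      /* profiled call count */
};

/* Parse one section payload of LEN bytes starting at BUF.  On success,
   store the caller's name in *CALLER, fill EDGES (up to MAX_EDGES
   entries) and return the number of edges read.  Return -1 if the
   payload does not have the expected shape.  */
static int
parse_callgraph_note (const char *buf, size_t len, const char **caller,
                      struct cg_edge *edges, int max_edges)
{
  static const char prefix[] = "Function ";
  const char *p = buf;
  const char *end = buf + len;
  int n = 0;

  assert (buf != NULL && caller != NULL && edges != NULL);

  /* The last string must be terminated, so strlen never overruns.  */
  if (len == 0 || buf[len - 1] != '\0')
    return -1;
  if (strncmp (p, prefix, sizeof prefix - 1) != 0)
    return -1;
  *caller = p + sizeof prefix - 1;
  p += strlen (p) + 1;

  while (p < end && n < max_edges)
    {
      const char *callee = p;
      p += strlen (p) + 1;
      if (p >= end)
        return -1;      /* callee name without a count */
      edges[n].callee = callee;
      edges[n].count = strtoll (p, NULL, 10);
      p += strlen (p) + 1;
      n++;
    }
  return n;
}
```

For the foo example above, the parser would report `_Z3foov` as the caller with two edges, `_Z3barv` (100) and `_Z3zapv` (50). When writing a test payload as a C string literal, split the literal after each `\0` (for example `"_Z3barv\0" "100"`), because `"\0100"` would otherwise be read as an octal escape.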
>>
>> For now, this is for google/main. I will re-submit for review to trunk along with data layout.
>>
>> Google ref 41940
>>
>> 2011-06-07  Sriraman Tallam  <tmsriram@google.com>
>>
>>        * doc/invoke.texi: document option -fcallgraph-profiles-sections.
>>        * final.c (dump_cgraph_profiles): New function.
>>        (rest_of_handle_final): Create new section '.note.callgraph.text'
>>        with compiler flag -fcallgraph-profiles-sections
>>        * common.opt: New option -fcallgraph-profiles-sections.
>>        * params.def (DEFPARAM): New param
>>        PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD.
>>
>> Index: doc/invoke.texi
>> ===================================================================
>> --- doc/invoke.texi     (revision 174789)
>> +++ doc/invoke.texi     (working copy)
>> @@ -351,7 +351,7 @@ Objective-C and Objective-C++ Dialects}.
>>  -falign-labels[=@var{n}] -falign-loops[=@var{n}] -fassociative-math @gol
>>  -fauto-inc-dec -fbranch-probabilities -fbranch-target-load-optimize @gol
>>  -fbranch-target-load-optimize2 -fbtr-bb-exclusive -fcaller-saves @gol
>> --fcheck-data-deps -fclone-hot-version-paths @gol
>> +-fcallgraph-profiles-sections -fcheck-data-deps -fclone-hot-version-paths @gol
>>  -fcombine-stack-adjustments -fconserve-stack @gol
>>  -fcompare-elim -fcprop-registers -fcrossjumping @gol
>>  -fcse-follow-jumps -fcse-skip-blocks -fcx-fortran-rules @gol
>> @@ -8114,6 +8114,15 @@ Do not promote static functions with always inline
>>  @opindex fripa-verbose
>>  Enable printing of verbose information about dynamic inter-procedural optimizations.
>>  This is used in conjunction with the @option{-fripa}.
>> +
>> +@item -fcallgraph-profiles-sections
>> +@opindex fcallgraph-profiles-sections
>> +Emit call graph edge profile counts in .note.callgraph.text sections. This is
>> +used in conjunction with @option{-fprofile-use}. A new .note.callgraph.text
>> +section is created for each function. This section lists every callee and the
>> +number of times it is called. The params variable
>> +"note-cgraph-section-edge-threshold" can be used to only list edges above a
>> +certain threshold.
>>  @end table
>>
>>  The following options control compiler behavior regarding floating
>> Index: final.c
>> ===================================================================
>> --- final.c     (revision 174789)
>> +++ final.c     (working copy)
>> @@ -4321,13 +4321,37 @@ debug_free_queue (void)
>>       symbol_queue_size = 0;
>>     }
>>  }
>> -
>> +
>> +/* List the call graph profiled edges whose value is greater than
>> +   PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD in the
>> +   ".note.callgraph.text" section. */
>> +static void
>> +dump_cgraph_profiles (void)
>> +{
>> +  struct cgraph_node *node = cgraph_node (current_function_decl);
>> +  struct cgraph_edge *e;
>> +  struct cgraph_node *callee;
>> +
>> +  for (e = node->callees; e != NULL; e = e->next_callee)
>> +    {
>> +      if (e->count <= PARAM_VALUE (PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD))
>> +        continue;
>> +      callee = e->callee;
>> +      fprintf (asm_out_file, "\t.string \"%s\"\n",
>> +               IDENTIFIER_POINTER (decl_assembler_name (callee->decl)));
>> +      fprintf (asm_out_file, "\t.string \"" HOST_WIDEST_INT_PRINT_DEC "\"\n",
>> +               e->count);
>> +    }
>> +}
>> +
>>  /* Turn the RTL into assembly.  */
>>  static unsigned int
>>  rest_of_handle_final (void)
>>  {
>>   rtx x;
>>   const char *fnname;
>> +  char *profile_fnname;
>> +  unsigned int flags;
>>
>>   /* Get the function's name, as described by its RTL.  This may be
>>      different from the DECL_NAME name used in the source file.  */
>> @@ -4387,6 +4411,21 @@ rest_of_handle_final (void)
>>     targetm.asm_out.destructor (XEXP (DECL_RTL (current_function_decl), 0),
>>                                decl_fini_priority_lookup
>>                                  (current_function_decl));
>> +
>> +  /* With -fcallgraph-profiles-sections, add ".note.callgraph.text" section
>> +     for storing profiling information. */
>> +  if (flag_callgraph_profiles_sections
>> +      && flag_profile_use
>> +      && cgraph_node (current_function_decl) != NULL)
>> +    {
>> +      flags = SECTION_DEBUG;
>> +      asprintf (&profile_fnname, ".note.callgraph.text.%s", fnname);
>> +      switch_to_section (get_section (profile_fnname, flags, NULL));
>> +      fprintf (asm_out_file, "\t.string \"Function %s\"\n", fnname);
>> +      dump_cgraph_profiles ();
>> +      free (profile_fnname);
>> +    }
>> +
>>   return 0;
>>  }
>>
>> Index: common.opt
>> ===================================================================
>> --- common.opt  (revision 174789)
>> +++ common.opt  (working copy)
>> @@ -907,6 +907,10 @@ fcaller-saves
>>  Common Report Var(flag_caller_saves) Optimization
>>  Save registers around function calls
>>
>> +fcallgraph-profiles-sections
>> +Common Report Var(flag_callgraph_profiles_sections) Init(0)
>> +Generate .note.callgraph.text sections listing callees and edge counts.
>> +
>>  fcheck-data-deps
>>  Common Report Var(flag_check_data_deps)
>>  Compare the results of several data dependence analyzers.
>> Index: params.def
>> ===================================================================
>> --- params.def  (revision 174789)
>> +++ params.def  (working copy)
>> @@ -1002,6 +1002,15 @@ DEFPARAM (PARAM_MVERSN_CLONE_CGRAPH_DEPTH,
>>          "maximum length of the call graph path to be cloned "
>>           "while doing multiversioning",
>>          2, 0, 5)
>> +
>> +/* Only output those call graph edges in .note.callgraph.text sections
>> +   whose count is greater than this value. */
>> +DEFPARAM (PARAM_NOTE_CGRAPH_SECTION_EDGE_THRESHOLD,
>> +         "note-cgraph-section-edge-threshold",
>> +         "minimum call graph edge count for inclusion in "
>> +          ".note.callgraph.text section",
>> +         0, 0, 0)
>> +
>>  /*
>>  Local variables:
>>  mode:c
>>
>> --
>> This patch is available for review at http://codereview.appspot.com/4591045
>>
>