This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: Bring function profiles to callgraph to make them WHOPR ready


On Mon, Jul 12, 2010 at 3:14 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Apr 26, 2010 at 3:36 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Mon, Apr 26, 2010 at 6:10 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> Hi,
>>> this patch moves cfun->function_frequency into cgraph_node->frequency. This is
>>> necessary for WHOPR to use it, and it is where it really belongs anyway, since
>>> frequencies are not the same across all clones.
>>>
>>> The patch raises a need for a current_cgraph_node that is similar to
>>> cfun/crtl/current_function_decl; I will propose it in an incremental patch
>>> (I intend to clean up the function switching API anyway).
>>>
>>> The patch also adds a new function frequency called EXECUTED_ONCE. Currently it is
>>> set for main(), for functions marked noreturn and for static
>>> constructors/destructors. Such functions are optimized for size everywhere
>>> except for code inside loops. So the patch has a minor effect on the code size of
>>> programs per se.
>>>
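To make the EXECUTED_ONCE rule above concrete, here is a minimal sketch in C. It is not code from the patch: the enumerator names, struct func_info, classify_function and optimize_code_for_size_p are all hypothetical, chosen to mirror the wording of the mail and the ChangeLog, and the real logic in predict.c works on GCC's own data structures.

```c
#include <stdbool.h>

/* Hypothetical enumerators, modeled on the node_frequency enum that the
   ChangeLog says is added to cgraph.h; the names are assumptions.  */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Hypothetical summary of the properties the mail mentions.  */
struct func_info {
  bool is_main;
  bool is_noreturn;
  bool is_static_ctor_or_dtor;
};

/* Default classification as described in the mail: main(), noreturn
   functions and static constructors/destructors run once.  */
static enum node_frequency
classify_function (const struct func_info *f)
{
  if (f->is_main || f->is_noreturn || f->is_static_ctor_or_dtor)
    return NODE_FREQUENCY_EXECUTED_ONCE;
  return NODE_FREQUENCY_NORMAL;
}

/* Executed-once functions are optimized for size, except for code
   inside loops (loop_depth > 0), which may still run many times.  */
static bool
optimize_code_for_size_p (enum node_frequency freq, int loop_depth)
{
  if (freq == NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return true;
  if (freq == NODE_FREQUENCY_EXECUTED_ONCE)
    return loop_depth == 0;
  return false;
}
```

With this sketch, straight-line code in main() would be treated as a size-optimization candidate, while the body of a loop in main() would not.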
>>> On pretty-ipa I have an ipa-profile pass propagating this knowledge across the
>>> callgraph, which helps to shave a couple of percent off the resulting binaries.
>>> This unfortunately affects mostly simple programs where this is not that
>>> important, but with -flto (-fwhopr) and -fwhole-program we have a chance to
>>> propagate into a more significant portion of the program. On SPEC GCC we mark a couple
>>> hundred functions this way; the resulting code size savings are not that important
>>> anyway, usually just slightly over 1% on SPEC. But I still guess it is worth the very
>>> simple and cheap pass.
>>>
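The callgraph propagation described above could look roughly like the following C sketch. The mail does not spell out the rule the ipa-profile pass uses, so the rule here is an assumption made for illustration: a function becomes "executed once" when all of its callers are known, every caller is itself executed once, and no call to it sits inside a loop. struct node, struct call_edge and both functions are hypothetical and are not GCC's cgraph API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enumerators; the names are assumptions.  */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

#define MAX_CALLERS 4

/* One incoming call: who calls us, and whether the call is in a loop.  */
struct call_edge {
  size_t caller;
  bool in_loop;
};

/* Toy callgraph node; a stand-in for cgraph_node.  */
struct node {
  enum node_frequency frequency;
  bool has_unknown_callers;   /* e.g. visible outside the unit */
  size_t n_callers;
  struct call_edge callers[MAX_CALLERS];
};

/* True if every caller of node I is known, is executed once, and
   calls node I from outside any loop.  */
static bool
all_callers_executed_once (const struct node *nodes, size_t i)
{
  const struct node *n = &nodes[i];
  if (n->has_unknown_callers || n->n_callers == 0)
    return false;
  for (size_t k = 0; k < n->n_callers; k++)
    {
      const struct call_edge *e = &n->callers[k];
      if (e->in_loop
          || nodes[e->caller].frequency != NODE_FREQUENCY_EXECUTED_ONCE)
        return false;
    }
  return true;
}

/* Iterate to a fixed point; returns how many nodes were promoted from
   NORMAL to EXECUTED_ONCE.  Each pass that changes something removes
   at least one NORMAL node, so the loop terminates.  */
static size_t
propagate_executed_once (struct node *nodes, size_t n_nodes)
{
  size_t promoted = 0;
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (size_t i = 0; i < n_nodes; i++)
        if (nodes[i].frequency == NODE_FREQUENCY_NORMAL
            && all_callers_executed_once (nodes, i))
          {
            nodes[i].frequency = NODE_FREQUENCY_EXECUTED_ONCE;
            changed = true;
            promoted++;
          }
    }
  return promoted;
}
```

The has_unknown_callers flag is why whole-program modes matter in this sketch: without -fwhole-program or LTO, most non-static functions would have to be treated as having callers the pass cannot see.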
>>> The main advantage of this code is that it can actually prove the coldness of
>>> instructions instead of just guessing. Currently we guess based on a fixed
>>> threshold, which sometimes makes us misjudge code as unlikely when it is
>>> not. With some improvements (i.e. marking basic blocks that have no path to
>>> exit as executed once and propagating this to calls) we can get somewhat better
>>> results.
>>>
>>> It is possible to do more guesswork per the Wu/Larus paper (i.e. promote function-local
>>> profile estimates to the global level), as is done by some compilers (Open64), but
>>> I am afraid of increasing the probability of making a mistake and misjudging hot
>>> code to be cold.
>>>
>>> A simple improvement that also crossed my mind while implementing this is that at
>>> -O0 we could probably default to the size optimization defaults. While mostly we
>>> do not care, it still affects some code expansion and might lead to faster
>>> compile times?
>>>
>>> Bootstrapped/regtested x86_64-linux, will commit it shortly.
>>> Honza
>>>
>>>         * cgraph.c (cgraph_create_node): Set node frequency to normal.
>>>         (cgraph_clone_node): Copy function frequency.
>>>         * cgraph.h (node_frequency): New enum
>>>         (struct cgraph_node): Add.
>>>         * final.c (rest_of_clean_state): Update.
>>>         * lto-cgraph.c (lto_output_node): Output node frequency.
>>>         (input_overwrite_node): Input node frequency.
>>>         * tree-ssa-loop-ivopts.c (computation_cost): Update.
>>>         * lto-streamer-out.c (output_function): Do not output function frequency.
>>>         * predict.c (maybe_hot_frequency_p): Update and handle functions executed once.
>>>         (cgraph_maybe_hot_edge_p): Likewise; use cgraph frequency instead of
>>>         attribute lookup.
>>>         (probably_never_executed_bb_p, optimize_function_for_size_p): Update.
>>>         (compute_function_frequency): Set noreturn functions to be executed once.
>>>         (choose_function_section): Update.
>>>         * lto-streamer-in.c (input_function): Do not input function frequency.
>>>         * function.c (allocate_struct_function): Do not initialize function frequency.
>>>         * function.h (function_frequency): Remove.
>>>         (struct function): Remove function frequency.
>>>         * ipa-profile.c (CGRAPH_NODE_FREQUENCY): Remove.
>>>         (try_update): Update.
>>>         * tree-inline.c (initialize_cfun): Do not update function frequency.
>>>         * passes.c (pass_init_dump_file): Update.
>>>         * i386.c (ix86_compute_frame_layout): Update.
>>>         (ix86_pad_returns): Update.
>>
>> This caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43903
>>
>
> This patch fixes:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44900
>
> on trunk. Is this the real fix or does it just make it latent?

It makes it latent.

Richard.

>
> --
> H.J.
>

