Bug 89306

Summary: [8 Regression] Hash based IPA summaries are too slow and consume up to 80% of IPA optimization time.
Product: gcc Reporter: Jan Hubicka <hubicka>
Component: ipaAssignee: Martin Liška <marxin>
Status: RESOLVED FIXED    
Severity: normal    
Priority: P3    
Version: 9.0   
Target Milestone: 9.0   
Host: Target:
Build: Known to work: 8.2.0
Known to fail: 9.0 Last reconfirmed: 2019-02-12 00:00:00
Attachments: Tentative patch
libxul.so WPA (--enable-checking=release) before patch
libxul.so WPA (--enable-checking=release) after patch

Description Jan Hubicka 2019-02-12 14:21:57 UTC
IPA summaries was switches to new nice C++ API but while doing so the original array based scheme was replaced by hash tables.  This is too slow for those summaries which are dense (i.e. defined for almost all functions or variables with a definition) and intensively used.  These are:

ipa-inline-analysis.c:call_summary<edge_growth_cache_entry *> *edge_growth_cache = NULL;
ipa-fnsummary.c:call_summary <ipa_call_summary *> *ipa_call_summaries;
ipa-fnsummary.c:function_summary <ipa_fn_summary *> *ipa_fn_summaries;
ipa-pure-const.c:class funct_state_summary_t: public function_summary <funct_state_d *>
ipa-pure-const.c:    function_summary <funct_state_d *> (symtab) {}
ipa-reference.c:    function_summary <ipa_reference_vars_info_d *> (symtab) {}

This one may not get too dense in all cases and at ltrans it is populated for non-definitions, but needs to be fast for alias oracle walks:
ipa-reference.c:    function_summary <ipa_reference_optimization_summary_d *> (symtab) {}
Comment 1 Martin Liška 2019-02-12 14:29:00 UTC
Mine then.
Comment 2 Martin Liška 2019-02-14 12:32:27 UTC
Created attachment 45718 [details]
Tentative patch
Comment 3 Martin Liška 2019-02-14 12:33:15 UTC
Created attachment 45719 [details]
libxul.so WPA (--enable-checking=release) before patch
Comment 4 Martin Liška 2019-02-14 12:33:42 UTC
Created attachment 45720 [details]
libxul.so WPA (--enable-checking=release) after patch
Comment 5 Martin Liška 2019-02-14 12:36:34 UTC
So considering only fns with more that 0.05% I see before:

     0.83%  lto1-wpa         lto1              [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_fn_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_fn_summary*> >::hash_entry, xcallocator>::find_with_hash
     0.37%  lto1-wpa         lto1              [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_call_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_call_summary*> >::hash_entry, xcallocator>::find_slot_with_hash
     0.36%  lto1-wpa         lto1              [.] hash_table<hash_map<int_hash<int, 0, -1>, ipa_call_summary*, simple_hashmap_traits<default_hash_traits<int_hash<int, 0, -1> >, ipa_call_summary*> >::hash_entry, xcallocator>::find_with_hash

After the patch:

     0.49%  lto1-wpa         lto1              [.] fast_call_summary<ipa_call_summary*, va_heap>::get_create

So there's an improvement, but it's quite small.
Can you Honza re-measure that for me please?
Comment 6 Martin Liška 2019-02-14 12:37:50 UTC
Having all *.o files in RAM, I see time change for WPA from 93.0 to 90.0s.
So relatively small improvement.
Comment 7 Martin Liška 2019-02-18 08:21:54 UTC
Author: marxin
Date: Mon Feb 18 08:21:23 2019
New Revision: 268979

URL: https://gcc.gnu.org/viewcvs?rev=268979&root=gcc&view=rev
Log:
Come up with fast {function,call}_summary classes (PR ipa/89306).

2019-02-18  Martin Liska  <mliska@suse.cz>

	PR ipa/89306
	* cgraph.c (symbol_table::create_edge): Set m_summary_id to -1
	by default.
	(symbol_table::free_edge): Recycle m_summary_id.
	* cgraph.h (get_summary_id): New.
	(symbol_table::release_symbol): Set m_summary_id to -1
	by default.
	(symbol_table::allocate_cgraph_symbol): Recycle m_summary_id.
	* ipa-fnsummary.c (ipa_fn_summary_t): Switch from
	function_summary to fast_function_summary.
	* ipa-fnsummary.h (ipa_fn_summary_t): Likewise.
	* ipa-pure-const.c (class funct_state_summary_t):
	Switch from function_summary to fast_function_summary.
	* ipa-reference.c (class ipa_ref_var_info_summary_t): Likewise.
	(class ipa_ref_opt_summary_t): Switch from function_summary
	to fast_function_summary.
	* symbol-summary.h (class function_summary_base): New class
	that is created from base of former function_summary.
	(function_summary_base::unregister_hooks): New.
	(class function_summary): Inherit from function_summary_base.
	(class call_summary_base): New class
	that is created from base of former call_summary.
	(class call_summary): Inherit from call_summary_base.
	(struct is_same): New.
	(class fast_function_summary): New summary class.
	(class fast_call_summary): New summary class.
	* vec.h (vec_safe_grow_cleared): New function.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/cgraph.h
    trunk/gcc/ipa-fnsummary.c
    trunk/gcc/ipa-fnsummary.h
    trunk/gcc/ipa-pure-const.c
    trunk/gcc/ipa-reference.c
    trunk/gcc/symbol-summary.h
    trunk/gcc/vec.h
Comment 8 Martin Liška 2019-02-18 08:23:54 UTC
Fixed on trunk. It's an infrastructure change, thus I'm not planning to backport that to GCC 8.