Materialize clones on demand

Richard Biener rguenther@suse.de
Mon Oct 26 10:32:04 GMT 2020


On Mon, 26 Oct 2020, Jan Hubicka wrote:

> > > We seem to leak some hashtables:
> > > dwarf2out.c:28850 (dwarf2out_init)                      31M: 23.8%       47M       19 :  0.0%       ggc
> > 
> > that one likely keeps quite some memory live...
> 
> Yep, having in-memory dwarf2out for the whole cc1plus eats a lot of memory
> quite naturally.

OTOH the late debug shouldn't be so big ...

> > 
> > > cselib.c:3137 (cselib_init)                             34M: 25.9%       34M     1514k: 17.3%      heap
> > > tree-scalar-evolution.c:2984 (scev_initialize)          37M: 27.6%       50M      228k:  2.6%       ggc
> > 
> > Hmm, so we do
> > 
> >   scalar_evolution_info = hash_table<scev_info_hasher>::create_ggc (100);
> > 
> > and
> > 
> >   scalar_evolution_info->empty ();
> >   scalar_evolution_info = NULL;
> > 
> > to reclaim.  ->empty () will IIRC at least allocate 7 elements which we
> > then eventually should reclaim during a GC walk - I guess the hashtable
> > statistics do not really handle GC reclaimed portions?
> > 
> > If there's a friendlier way of releasing a GC allocated hash-tab
> > we can switch to that.  Note that in principle the hash-table doesn't
> > need to be GC allocated but it needs to be walked since it refers to
> > trees that might not be referenced in other ways.
> 
> The hashtable has a destructor that does ggc_free, so I think ggc_delete
> is the right way to free it.

Can you check whether that helps?  As said, in the end it's probably
miscounting in the stats.
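
To make the suspected miscounting concrete, here is a toy model in
standalone C++.  It is only a sketch of the hypothesis, not GCC's
statistics code: SiteStats, ToyAccounting and all of their member
functions are invented for illustration.  It assumes the per-site
counters hear about explicit frees but not about memory the collector
reclaims, which is exactly the open question above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical per-site counters; not GCC's mem-stats structures.
struct SiteStats {
  std::size_t allocated = 0;  // bytes handed out at this site
  std::size_t freed = 0;      // bytes whose release was reported to the stats
  std::size_t leak() const { return allocated - freed; }
};

class ToyAccounting {
public:
  void record_alloc(const std::string &site, std::size_t bytes) {
    sites_[site].allocated += bytes;
  }

  // Explicit release (the ggc_delete-like path): the statistics see it.
  void record_explicit_free(const std::string &site, std::size_t bytes) {
    SiteStats &s = sites_[site];
    assert(s.freed + bytes <= s.allocated);
    s.freed += bytes;
  }

  // Collector sweep: the memory is gone, but in this toy model (an
  // assumption, see above) nothing is reported back to the statistics.
  void collector_reclaims(const std::string &, std::size_t) {}

  // Bytes the statistics still believe are live for a site.
  std::size_t leak(const std::string &site) const {
    auto it = sites_.find(site);
    return it == sites_.end() ? 0 : it->second.leak();
  }

private:
  std::map<std::string, SiteStats> sites_;
};
```

Under that assumption a site whose table was dropped and later swept
still shows its full size as leaked, while a site released explicitly
drops to zero.  If the real statistics behave like this, switching to
an explicit free would change the report without changing actual
memory use.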

> > 
> > > and hashmaps:
> > > ipa-reference.c:1133 (ipa_reference_read_optimiz      2047k:  3.0%     3071k        9 :  0.0%      heap
> > > tree-ssa.c:60 (redirect_edge_var_map_add)             4125k:  6.1%     4126k     8190 :  0.1%      heap
> > 
> > Similar to SCEV, probably mis-accounting?
> > 
> > > alias.c:1200 (record_alias_subset)                    4510k:  6.6%     4510k     4546 :  0.0%       ggc
> > > ipa-prop.h:986 (ipcp_transformation_t)                8191k: 12.0%       11M       16 :  0.0%       ggc
> > > dwarf2out.c:5957 (dwarf2out_register_external_di        47M: 72.2%       71M       12 :  0.0%       ggc
> > > 
> > > and hashsets:
> > > ipa-devirt.c:3093 (possible_polymorphic_call_tar        15k:  0.9%       23k        8 :  0.0%      heap
> > > ipa-devirt.c:1599 (add_type_duplicate)                 412k: 22.2%      412k     4065 :  0.0%      heap
> > > tree-ssa-threadbackward.c:40 (thread_jumps)           1432k: 77.0%     1433k      119k:  0.8%      heap
> > > 
> > > and vectors:
> > > tree-ssa-structalias.c:5783 (push_fields_onto_fi          8       847k: 0.3%      976k    475621: 0.8%        17k        24k
> > 
> > Huh.  It's an auto_vec<>
> 
> Hmm, those may get miscounted, I will check.
> > 
> > > tree-ssa-pre.c:334 (alloc_expression_id)                 48      1125k: 0.4%     1187k    198336: 0.3%        23k        34k
> > > tree-into-ssa.c:1787 (register_new_update_single          8      1196k: 0.5%     1264k    380385: 0.6%        24k        36k
> > > ggc-page.c:1264 (add_finalizer)                           8      1232k: 0.5%     1848k        43: 0.0%        77k        81k
> > > tree-ssa-structalias.c:1609 (topo_visit)                  8      1302k: 0.5%     1328k    892964: 1.4%        27k        33k
> > > graphds.c:254 (graphds_dfs)                               4      1469k: 0.6%     1675k   2101780: 3.4%        30k        34k
> > > dominance.c:955 (get_dominated_to_depth)                  8      2251k: 0.9%     2266k    685140: 1.1%        46k        50k
> > > tree-ssa-structalias.c:410 (new_var_info)                32      2264k: 0.9%     2341k    330758: 0.5%        47k        63k
> > > tree-ssa-structalias.c:3104 (process_constraint)         48      2376k: 0.9%     2606k    405451: 0.7%        49k        83k
> > > symtab.c:612 (create_reference)                           8      3314k: 1.3%     4897k     75213: 0.1%       414k       612k
> > > vec.h:1734 (copy)                                        48       233M:90.5%      234M   6243163:10.1%      4982k      5003k
> 
> Also I should annotate copy.

Yeah, some missing annotations might cause issues.

> > 
> > Those all look OK to me, not sure why we even think there's a leak?
> 
> I think we do not need to hold references anymore (perhaps for aliases -
> I will check).  Also all function bodies should be freed by now.
> > 
> > > However main problem is
> > > cfg.c:202 (connect_src)                               5745k:  0.2%      271M:  1.9%     1754k:  0.0%     1132k:  0.2%     7026k
> > > cfg.c:212 (connect_dest)                              6307k:  0.2%      281M:  2.0%    10129k:  0.2%     2490k:  0.5%     7172k
> > > varasm.c:3359 (build_constant_desc)                   7387k:  0.2%        0 :  0.0%        0 :  0.0%        0 :  0.0%       51k
> > > emit-rtl.c:486 (gen_raw_REG)                          7799k:  0.2%      215M:  1.5%       96 :  0.0%        0 :  0.0%     9502k
> > > dwarf2cfi.c:2341 (add_cfis_to_fde)                    8027k:  0.2%        0 :  0.0%     4906k:  0.1%     1405k:  0.3%       78k
> > > emit-rtl.c:4074 (make_jump_insn_raw)                  8239k:  0.2%       93M:  0.7%        0 :  0.0%        0 :  0.0%     1442k
> > > tree-ssanames.c:308 (make_ssa_name_fn)                9130k:  0.2%      456M:  3.3%        0 :  0.0%        0 :  0.0%     6622k
> > > gimple.c:1808 (gimple_copy)                           9508k:  0.3%      524M:  3.7%     8609k:  0.2%     2972k:  0.6%     7135k
> > > tree-inline.c:4879 (expand_call_inline)               9590k:  0.3%       21M:  0.2%        0 :  0.0%        0 :  0.0%      328k
> > > dwarf2cfi.c:418 (new_cfi)                               10M:  0.3%        0 :  0.0%        0 :  0.0%        0 :  0.0%      444k
> > > cfg.c:266 (unchecked_make_edge)                         10M:  0.3%       60M:  0.4%      355M:  6.8%        0 :  0.0%     9083k
> I think it is a bug to have function bodies at the end of compilation - will
> try to work out the reason for that.
> > > tree.c:1642 (wide_int_to_tree_1)                        10M:  0.3%     2313k:  0.0%        0 :  0.0%        0 :  0.0%      548k
> > > stringpool.c:41 (stringpool_ggc_alloc)                  10M:  0.3%     7055k:  0.0%        0 :  0.0%     2270k:  0.5%      588k
> > > stringpool.c:63 (alloc_node)                            10M:  0.3%       12M:  0.1%        0 :  0.0%        0 :  0.0%      588k
> > > tree-phinodes.c:119 (allocate_phi_node)                 11M:  0.3%      153M:  1.1%        0 :  0.0%     3539k:  0.7%      340k
> > > cgraph.c:289 (create_empty)                             12M:  0.3%        0 :  0.0%      109M:  2.1%        0 :  0.0%      371k
> > > cfg.c:127 (alloc_block)                                 14M:  0.4%      705M:  5.0%        0 :  0.0%        0 :  0.0%     7086k
> > > tree-streamer-in.c:558 (streamer_read_tree_bitfi        22M:  0.6%       13k:  0.0%        0 :  0.0%       22k:  0.0%       64k
> > > tree-inline.c:834 (remap_block)                         28M:  0.8%      159M:  1.1%        0 :  0.0%        0 :  0.0%     2009k
> > > stringpool.c:79 (ggc_alloc_string)                      28M:  0.8%     5619k:  0.0%        0 :  0.0%     6658k:  1.4%     1785k
> > > dwarf2out.c:11727 (add_ranges_num)                      32M:  0.9%        0 :  0.0%       32M:  0.6%      144 :  0.0%       20 
> > > tree-inline.c:5942 (copy_decl_to_var)                   39M:  1.1%       51M:  0.4%        0 :  0.0%        0 :  0.0%      646k
> > > tree-inline.c:5994 (copy_decl_no_change)                78M:  2.1%      270M:  1.9%        0 :  0.0%        0 :  0.0%     2497k
> > > function.c:4438 (reorder_blocks_1)                      96M:  2.6%      101M:  0.7%        0 :  0.0%        0 :  0.0%     2109k
> > > hash-table.h:802 (expand)                              142M:  3.9%       18M:  0.1%      198M:  3.8%       32M:  6.9%       38k
> > > dwarf2out.c:10086 (new_loc_list)                       219M:  6.0%       11M:  0.1%        0 :  0.0%        0 :  0.0%     2955k
> > > tree-streamer-in.c:637 (streamer_alloc_tree)           379M: 10.3%      426M:  3.0%        0 :  0.0%     4201k:  0.9%     9828k
> > > dwarf2out.c:5702 (new_die_raw)                         434M: 11.8%        0 :  0.0%        0 :  0.0%        0 :  0.0%     5556k
> > > dwarf2out.c:1383 (new_loc_descr)                       519M: 14.1%       12M:  0.1%     2880 :  0.0%        0 :  0.0%     6812k
> > > dwarf2out.c:4420 (add_dwarf_attr)                      640M: 17.4%        0 :  0.0%       94M:  1.8%     4584k:  1.0%     3877k
> > > toplev.c:906 (realloc_for_line_map)                    768M: 20.8%        0 :  0.0%      767M: 14.6%      255M: 54.4%       33 
> > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > GGC memory                                              Leak          Garbage            Freed        Overhead            Times
> > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > Total                                                 3689M:100.0%    14039M:100.0%     5254M:100.0%      470M:100.0%      391M
> > > --------------------------------------------------------------------------------------------------------------------------------------------
> > > 
> > > Clearly some function bodies leak - I will try to figure out what. But
> > > main problem is debug info.
> > > I guess debug info for whole cc1plus is large, but it would be nice if
> > > it was not in the garbage collector, for example :)
> > 
> > Well, we're building a DIE tree for the whole unit here so I'm not sure
> > what parts we can optimize.  The structures may keep quite some stuff
> > on the tree side live through the decl -> DIE and block -> DIE maps
> > and the external_die_map used for LTO streaming (but if we lazily stream
> > bodies we do need to keep this map ... unless we add some
> > start/end-stream-body hooks and do the map per function.  But then
> > we build the DIEs lazily as well so the query of the map is lazy :/)
> 
> Yep, not sure how much we could do here.  Of course ggc_collect when
> invoked will do quite a lot of walking to discover relatively few tree
> references, but not sure if that can be solved by custom marking or so.

In principle the late DIE creation code can remove entries from the
external_die_map, but I'm not sure how much that helps (it might also
cause re-allocation of the map if we shrink it).  It might help quite a bit
for references to BLOCKs.  Maybe you can try the following simple
patch ...

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ba93a6c3d81..350cc5d443c 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -5974,6 +5974,7 @@ maybe_create_die_with_external_ref (tree decl)
 
   const char *sym = desc->sym;
   unsigned HOST_WIDE_INT off = desc->off;
+  external_die_map->remove (decl);
 
   in_lto_p = false;
   dw_die_ref die = (TREE_CODE (decl) == BLOCK
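
The idea behind the one-liner, sketched outside of GCC with standard
containers: once an entry has been consumed to build its DIE, drop it
so the map no longer keeps the key alive.  ExternalRef, RefMap and
take_external_ref are invented names, and the int key merely stands in
for the tree; this is an illustration, not the dwarf2out.c code.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the symbol/offset pair stored per decl.
struct ExternalRef {
  std::string sym;
  unsigned long long off;
};

// Hypothetical stand-in for the decl -> external reference map.
using RefMap = std::unordered_map<int, ExternalRef>;

// Look up the entry for decl_id, copy out what the caller needs and
// erase it so the map stops referencing the key.
std::optional<ExternalRef> take_external_ref(RefMap &map, int decl_id) {
  auto it = map.find(decl_id);
  if (it == map.end())
    return std::nullopt;
  // Copy first: the slot is invalid once the entry has been erased.
  ExternalRef ref = it->second;
  map.erase(it);
  return ref;
}
```

As in the patch, the fields are copied out before the entry goes away,
and a second query for the same key finds nothing.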



> Honza
> > 
> > Richard.
> > 
> > -- 
> > Richard Biener <rguenther@suse.de>
> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> > Germany; GF: Felix Imend
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend

