A recent patch increased GCC's memory consumption in some cases!

Jan Hubicka jh@suse.cz
Thu May 8 17:00:00 GMT 2008


Hi,
this seems really nice ;)
> 
> comparing PR rtl-optimization/28071 testcase compilation at -O0 level:
>   Overall memory allocated via mmap and sbrk decreased from 380351k to 285391k, overall -33.27%
>   Peak amount of GGC memory allocated before garbage collecting run decreased from 100958k to 62096k, overall -62.58%
>   Peak amount of GGC memory still allocated after garbage collecting decreased from 56611k to 40417k, overall -40.07%
>   Amount of produced GGC garbage decreased from 178452k to 118819k, overall -50.19%
>   Amount of memory still referenced at the end of compilation decreased from 6103k to 5336k, overall -14.37%
>     Overall memory needed: 380351k -> 285391k
>     Peak memory use before GGC: 100958k -> 62096k
>     Peak memory use after GGC: 56611k -> 40417k
>     Maximum of released memory in single GGC run: 50583k -> 31619k
>     Garbage: 178452k -> 118819k
>     Leak: 6103k -> 5336k
>     Overhead: 30540k -> 18233k
>     GGC runs: 107 -> 106
> Testing has produced no results
> 
> comparing PR rtl-optimization/28071 testcase compilation at -O0 -g level:
>   Overall memory allocated via mmap and sbrk decreased from 381191k to 286243k, overall -33.17%
>   Peak amount of GGC memory allocated before garbage collecting run decreased from 101651k to 62789k, overall -61.89%
>   Peak amount of GGC memory still allocated after garbage collecting decreased from 57304k to 41110k, overall -39.39%
>   Amount of produced GGC garbage decreased from 178616k to 118983k, overall -50.12%
>   Amount of memory still referenced at the end of compilation decreased from 8132k to 7365k, overall -10.41%
>     Overall memory needed: 381191k -> 286243k
>     Peak memory use before GGC: 101651k -> 62789k
>     Peak memory use after GGC: 57304k -> 41110k
>     Maximum of released memory in single GGC run: 50583k -> 31695k
>     Garbage: 178616k -> 118983k
>     Leak: 8132k -> 7365k
>     Overhead: 31123k -> 18816k
>     GGC runs: 110 -> 108
> Testing has produced no results
> 
> comparing PR rtl-optimization/28071 testcase compilation at -O1 level:
>   Peak amount of GGC memory allocated before garbage collecting run decreased from 76380k to 70580k, overall -8.22%
>   Peak amount of GGC memory still allocated after garbage collecting decreased from 70370k to 61354k, overall -14.70%
>   Amount of produced GGC garbage decreased from 238003k to 192311k, overall -23.76%
>   Amount of memory still referenced at the end of compilation decreased from 13677k to 12432k, overall -10.01%
>     Overall memory needed: 393123k -> 382403k
>     Peak memory use before GGC: 76380k -> 70580k
>     Peak memory use after GGC: 70370k -> 61354k
>     Maximum of released memory in single GGC run: 35019k -> 29401k
>     Garbage: 238003k -> 192311k
>     Leak: 13677k -> 12432k
>     Overhead: 32125k -> 24677k
>     GGC runs: 105 -> 107
>   Amount of produced pre-ipa-GGC garbage decreased from 47276k to 39611k, overall -19.35%
>   Amount of memory referenced pre-ipa decreased from 67562k to 59927k, overall -12.74%
>     Pre-IPA-Garbage: 47276k -> 39611k
>     Pre-IPA-Leak: 67562k -> 59927k
>     Pre-IPA-Overhead: 7504k -> 5628k
>   Amount of produced post-ipa-GGC garbage decreased from 47276k to 39611k, overall -19.35%
>   Amount of memory referenced post-ipa decreased from 67562k to 59927k, overall -12.74%
>     Post-IPA-Garbage: 47276k -> 39611k
>     Post-IPA-Leak: 67562k -> 59927k
>     Post-IPA-Overhead: 7504k -> 5628k
> 
> comparing PR rtl-optimization/28071 testcase compilation at -O2 level:
>   Overall memory allocated via mmap and sbrk decreased from 309479k to 238731k, overall -29.64%
>   Peak amount of GGC memory allocated before garbage collecting run decreased from 76380k to 70331k, overall -8.60%
>   Peak amount of GGC memory still allocated after garbage collecting decreased from 70370k to 61355k, overall -14.69%
>   Amount of produced GGC garbage decreased from 252446k to 208026k, overall -21.35%
>   Amount of memory still referenced at the end of compilation decreased from 13851k to 12605k, overall -9.88%
>     Overall memory needed: 309479k -> 238731k
>     Peak memory use before GGC: 76380k -> 70331k
>     Peak memory use after GGC: 70370k -> 61355k
>     Maximum of released memory in single GGC run: 31602k -> 25655k
>     Garbage: 252446k -> 208026k
>     Leak: 13851k -> 12605k
>     Overhead: 35239k -> 28695k
>     GGC runs: 118 -> 117
>   Amount of produced pre-ipa-GGC garbage decreased from 99865k to 80833k, overall -23.55%
>   Amount of memory referenced pre-ipa decreased from 77323k to 72346k, overall -6.88%
>     Pre-IPA-Garbage: 99865k -> 80833k
>     Pre-IPA-Leak: 77323k -> 72346k
>     Pre-IPA-Overhead: 12142k -> 8403k
>   Amount of produced post-ipa-GGC garbage decreased from 99865k to 80833k, overall -23.55%
>   Amount of memory referenced post-ipa decreased from 77323k to 72346k, overall -6.88%
>     Post-IPA-Garbage: 99865k -> 80833k
>     Post-IPA-Leak: 77323k -> 72346k
>     Post-IPA-Overhead: 12142k -> 8403k
> 
> comparing PR rtl-optimization/28071 testcase compilation at -O3 -fno-tree-pre -fno-tree-fre level:
>   Peak amount of GGC memory allocated before garbage collecting run decreased from 138642k to 116511k, overall -18.99%
>   Peak amount of GGC memory still allocated after garbage collecting decreased from 127952k to 109476k, overall -16.88%
>   Amount of produced GGC garbage decreased from 374831k to 353887k, overall -5.92%
>   Amount of memory still referenced at the end of compilation decreased from 24124k to 21397k, overall -12.75%
>     Overall memory needed: 1200099k -> 1192295k
>     Peak memory use before GGC: 138642k -> 116511k
>     Peak memory use after GGC: 127952k -> 109476k
>     Maximum of released memory in single GGC run: 59910k -> 43506k
>     Garbage: 374831k -> 353887k
>     Leak: 24124k -> 21397k
>     Overhead: 49858k -> 46186k
>     GGC runs: 104 -> 110
>   Amount of produced pre-ipa-GGC garbage decreased from 99865k to 80833k, overall -23.55%
>   Amount of memory referenced pre-ipa decreased from 77323k to 72346k, overall -6.88%
>     Pre-IPA-Garbage: 99865k -> 80833k
>     Pre-IPA-Leak: 77323k -> 72346k
>     Pre-IPA-Overhead: 12142k -> 8403k
>   Amount of produced post-ipa-GGC garbage decreased from 99865k to 80833k, overall -23.55%
>   Amount of memory referenced post-ipa decreased from 77323k to 72346k, overall -6.88%
>     Post-IPA-Garbage: 99865k -> 80833k
>     Post-IPA-Leak: 77323k -> 72346k
>     Post-IPA-Overhead: 12142k -> 8403k
> 
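A note on reading the percentages above: judging from the figures alone, they appear to be computed relative to the new value rather than the old one (380351k -> 285391k is reported as -33.27%, which matches (new - old) / new). This is inferred from the numbers in the report, not from the tester's source. A minimal Python sketch of that arithmetic, where the function name percent_change is made up for illustration:

```python
def percent_change(old_k, new_k):
    """Relative change in percent, using the NEW value as the base.

    Assumption: this base is inferred from the figures in the report,
    not taken from the testing script itself.
    """
    return round((new_k - old_k) / new_k * 100, 2)


# Figures taken from the -O0 block of the report.
print(percent_change(380351, 285391))  # overall memory
print(percent_change(100958, 62096))   # peak GGC memory before collection
```

With the old value as the base, the first figure would come out near -25% instead, so the base matters when comparing these reports against other measurements.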
> Head of the ChangeLog is:
> 
> --- /usr/src/SpecTests/sandbox-britten-memory/x86_64/mem-result/ChangeLog	2008-05-08 02:12:21.000000000 +0000
> +++ /usr/src/SpecTests/sandbox-britten-memory/gcc/gcc/ChangeLog	2008-05-08 09:42:11.000000000 +0000
> @@ -1,3 +1,151 @@
> +2008-05-08  Richard Guenther  <rguenther@suse.de>
> +
> +	* tree-data-ref.c (dr_analyze_alias): Do not set DR_SUBVARS.
> +	* tree-data-ref.h (struct dr_alias): Remove subvars field.
> +	(DR_SUBVARS): Remove.
> +	* tree-dfa.c (dump_subvars_for): Remove.
> +	(debug_subvars_for): Likewise.
> +	(dump_variable): Do not dump subvars.
> +	(remove_referenced_var): Do not remove subvars.
> +	* tree-flow-inline.h (clear_call_clobbered): SFTs no longer exist.
> +	(lookup_subvars_for_var): Remove.
> +	(get_subvars_for_var): Likewise.
> +	(get_subvars_at): Likewise.
> +	(get_first_overlapping_subvar): Likewise.
> +	(overlap_subvar): Likewise.
> +	* tree-flow.h (subvar_t): Remove.
> +	(struct var_ann_d): Remove subvars field.
> +	* tree-ssa-alias.c (mark_aliases_call_clobbered): Remove queued
> +	argument.  Remove special handling of SFTs.
> +	(compute_tag_properties): Likewise.
> +	(set_initial_properties): Likewise.
> +	(compute_call_clobbered): Likewise.
> +	(count_mem_refs): Likewise.
> +	(compute_memory_partitions): Likewise.
> +	(compute_flow_insensitive_aliasing): Likewise.
> +	(setup_pointers_and_addressables): Likewise.
> +	(new_type_alias): Likewise.
> +	(struct used_part): Remove.
> +	(used_portions): Likewise.
> +	(struct used_part_map): Likewise.
> +	(used_part_map_eq): Likewise.
> +	(used_part_map_hash): Likewise.
> +	(free_used_part_map): Likewise.
> +	(up_lookup): Likewise.
> +	(up_insert): Likewise.
> +	(get_or_create_used_part_for): Likewise.
> +	(create_sft): Likewise.
> +	(create_overlap_variables_for): Likewise.
> +	(find_used_portions): Likewise.
> +	(create_structure_vars): Likewise.
> +	* tree.def (STRUCT_FIELD_TAG): Remove.
> +	* tree.h (MTAG_P): Adjust.
> +	(struct tree_memory_tag): Remove base_for_components and
> +	unpartitionable flags.
> +	(struct tree_struct_field_tag): Remove.
> +	(SFT_PARENT_VAR): Likewise.
> +	(SFT_OFFSET): Likewise.
> +	(SFT_SIZE): Likewise.
> +	(SFT_NONADDRESSABLE_P): Likewise.
> +	(SFT_ALIAS_SET): Likewise.
> +	(SFT_UNPARTITIONABLE_P): Likewise.
> +	(SFT_BASE_FOR_COMPONENTS_P): Likewise.
> +	(union tree_node): Remove sft field.
> +	* alias.c (get_alias_set): Remove special handling of SFTs.
> +	* print-tree.c (print_node): Remove handling of SFTs.
> +	* tree-dump.c (dequeue_and_dump): Likewise.
> +	* tree-into-ssa.c (mark_sym_for_renaming): Likewise.
> +	* tree-nrv.c (dest_safe_for_nrv_p): Remove special handling of SFTs.
> +	* tree-predcom.c (set_alias_info): Do not set subvars.
> +	* tree-pretty-print.c (dump_generic_node): Do not handle SFTs.
> +	* tree-ssa-loop-ivopts.c (get_ref_tag): Likewise.
> +	* tree-ssa-operands.c (access_can_touch_variable): Likewise.
> +	(add_vars_for_offset): Remove.
> +	(add_virtual_operand): Remove special handling of SFTs.
> +	(add_call_clobber_ops): Likewise.
> +	(add_call_read_ops): Likewise.
> +	(get_asm_expr_operands): Likewise.
> +	(get_modify_stmt_operands): Likewise.
> +	(get_expr_operands): Likewise.
> +	(add_to_addressable_set): Likewise.
> +	* tree-ssa.c (verify_ssa_name): Do not handle SFTs.
> +	* tree-tailcall.c (suitable_for_tail_opt_p): Likewise.
> +	* tree-vect-transform.c (vect_create_data_ref_ptr): Do not
> +	set subvars.
> +	* tree.c (init_ttree): Remove STRUCT_FIELD_TAG initialization.
> +	(tree_code_size): Remove STRUCT_FIELD_TAG handling.
> +	(tree_node_structure): Likewise.
> +	* tree-ssa-structalias.c (set_uids_in_ptset): Remove special
> +	handling of SFTs.
> +	(find_what_p_points_to): Likewise.
> +
> +2008-05-08  Sa Liu  <saliu@de.ibm.com>
> +
> +	* config/spu/spu.md: Fixed subti3 pattern.
> +	* testsuite/gcc.target/spu/subti3.c: New.
> +
> +2008-05-08  Richard Guenther  <rguenther@suse.de>
> +
> +	PR middle-end/36154
> +	* tree-ssa-structalias.c (push_fields_onto_fieldstack): Make
> +	sure to create a representative for trailing arrays for PTA.
> +
> +2008-05-08  Richard Guenther  <rguenther@suse.de>
> +
> +	PR middle-end/36172
> +	* fold-const.c (operand_equal_p): Two objects which types
> +	differ in pointerness are not equal.
> +
> +2008-05-08  Kai Tietz  <kai,tietz@onevision.com>
> +
> +	* calls.c (compute_argument_block_size): Add argument tree fndecl.
> +	(OUTGOING_REG_PARM_STACK_SPACE): Add function type argument.
> +	(emit_library_call_value_1): Add new variable fndecl initialized by
> +	NULL_TREE. It should be the decl type of orgfun, but this information
> +	seems not to be available here, so it uses the default calling abi.
> +	* config/arm/arm.c (arm_return_in_memory): Add fntype argumen.
> +	* config/arm/arm.h (RETURN_IN_MEMORY): Replace RETURN_IN_MEMORY
> +	by TARGET_RETURN_IN_MEMORY.
> +	* config/i386/i386-interix.h: Likewise.
> +	* config/i386/i386.h: Likewise.
> +	* config/i386/i386elf.h: Likewise.
> +	* config/i386/ptx4-i.h: Likewise.
> +	* config/i386/sol2-10.h: Likewise.
> +	* config/i386/sysv4.h: Likewise.
> +	* config/i386/vx-common.h: Likewise.
> +	* config/cris/cris.h: Removed #if 0 clause.
> +	* config/arm/arm-protos.h (arm_return_in_memory): Add fntype
> +	argument.
> +	* config/i386/i386-protos.h (ix86_return_in_memory): Add fntype
> +	argument.
> +	(ix86_sol10_return_in_memory): Likewise.
> +	(ix86_i386elf_return_in_memory): New.
> +	(ix86_i386interix_return_in_memory): New.
> +	* config/mt/mt-protos.h (mt_return_in_memory): New.
> +	* config/mt/mt.c: Likewise.
> +	* config/mt/mt.h (OUTGOING_REG_PARM_STACK_SPACE): Add FNTYPE argument.
> +	(RETURN_IN_MEMORY):  Replace by TARGET_RETURN_IN_MEMORY.
> +	* config/bfin/bfin.h: Likewise.
> +	* config/bfin/bfin-protos.h (bfin_return_in_memory): Add fntype
> +	argument.
> +	* config/bfin/bfin.c: Likewise.
> +	* config/pa/pa.h (OUTGOING_REG_PARM_STACK_SPACE): Add FNTYPE argument.
> +	* config/alpha/unicosmk.h: Likewise.
> +	* config/i386/cygming.h: Likewise.
> +	* config/iq2000/iq2000.h: Likewise.
> +	* config/mips/mips.h: Likewise.
> +	* config/mn10300/mn10300.h: Likewise.
> +	* config/rs6000/rs6000.h: Likewise.
> +	* config/score/score.h: Likewise.
> +	* config/spu/spu.h: Likewise.
> +	* config/v850/v850.h: Likewise.
> +	* defaults.h: Likewise.
> +	* doc/tm.texi (OUTGOING_REG_PARM_STACK_SPACE): Adjust documentation.
> +	* expr.c (emit_block_move): Adjust use of OUTGOING_REG_PARM_STACK_SPACE.
> +	* function.c (STACK_DYNAMIC_OFFSET): Adjust use of
> +	OUTGOING_REG_PARM_STACK_SPACE.
> +	* targhooks.c (default_return_in_memory): Remove RETURN_IN_MEMORY.
> +
>  2008-05-08  Jakub Jelinek  <jakub@redhat.com>
>  
>  	* tree-parloops.c (create_parallel_loop): Set OMP_RETURN_NOWAIT
> 
> 
> The results can be reproduced by building a compiler with
> 
> --enable-gather-detailed-mem-stats targeting x86-64
> 
> and compiling the preprocessed combine.c or the testcase from PR8632 with:
> 
> -fmem-report --param=ggc-min-heapsize=1024 --param=ggc-min-expand=1 -Ox -Q
> 
> The memory consumption summary appears in the dump after the detailed listing
> of the places where memory is allocated.  Peak memory consumption is
> computed by taking the maximal value found in the {GC XXXX -> YYYY} reports.
> 
> Your testing script.
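
The peak computation described above can be sketched as follows. This is a rough Python illustration, not the actual testing script. It assumes the compiler output has been captured as text, that each collection is reported as {GC XXXX -> YYYY} with an optional "k" suffix on the numbers, and that XXXX is the amount before collection and YYYY the amount after. The function name peak_ggc and the sample string are invented for the example.

```python
import re

# Assumed report shape: "{GC 1234k -> 567k}"; the "k" suffix is optional here.
GC_RE = re.compile(r"\{GC\s+(\d+)k?\s*->\s*(\d+)k?\}")


def peak_ggc(log_text):
    """Return (peak_before, peak_after) over all GC reports, or None if none found."""
    pairs = [(int(before), int(after)) for before, after in GC_RE.findall(log_text)]
    if not pairs:
        return None
    peak_before = max(before for before, _ in pairs)
    peak_after = max(after for _, after in pairs)
    return peak_before, peak_after


# Invented sample, only to show the expected shape of the input.
sample = "foo {GC 1200k -> 800k} bar {GC 3100k -> 1500k} baz {GC 2900k -> 1700k}"
print(peak_ggc(sample))  # (3100, 1700)
```

The two maxima are taken independently, so the peak before collection and the peak after collection may come from different GGC runs.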


