This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] Change the badness computation to ensure no integer-underflow

From: Jan Hubicka <hubicka at ucw dot cz>
To: Richard Biener <richard dot guenther at gmail dot com>
Cc: Jan Hubicka <hubicka at ucw dot cz>, Dehao Chen <dehao at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Wed, 28 Aug 2013 19:11:08 +0200
Subject: Re: [PATCH] Change the badness computation to ensure no integer-underflow

> > I am giving the patch brief benchmarking on profiledbootstrap and if it won't
> > cause major regression, I think we should go ahead with the patch.

Uhm, I profiledbootstrapped and we are a bit too fast to get a reasonable oprofile.  What I get is:
7443      9.4372  lto1                     lto1                     lto_end_uncompression(lto_compression_stream*)
4438      5.6271  lto1                     lto1                     _ZL14DFS_write_treeP12output_blockP4sccsP9tree_nodebb.lto_priv.4993
2351      2.9809  lto1                     lto1                     lto_output_tree(output_block*, tree_node*, bool, bool)
2179      2.7628  lto1                     lto1                     _ZL30linemap_macro_loc_to_exp_pointP9line_mapsjPPK8line_map.lto_priv.7860
1910      2.4217  lto1                     lto1                     _ZL19unpack_value_fieldsP7data_inP9bitpack_dP9tree_node.lto_priv.7292
1855      2.3520  libc-2.11.1.so           libc-2.11.1.so           msort_with_tmp
1531      1.9412  lto1                     lto1                     streamer_string_index(output_block*, char const*, unsigned int, bool)
1530      1.9399  libc-2.11.1.so           libc-2.11.1.so           _int_malloc
1471      1.8651  lto1                     lto1                     do_estimate_growth(cgraph_node*)
1306      1.6559  lto1                     lto1                     pointer_map_insert(pointer_map_t*, void const*)
1238      1.5697  lto1                     lto1                     _Z28streamer_pack_tree_bitfieldsP12output_blockP9bitpack_dP9tree_node.constprop.1086
1138      1.4429  lto1                     lto1                     compare_tree_sccs_1(tree_node*, tree_node*, tree_node***)
1082      1.3719  lto1                     lto1                     streamer_write_tree_body(output_block*, tree_node*, bool)
1044      1.3237  lto1                     lto1                     _ZL28estimate_calls_size_and_timeP11cgraph_nodePiS1_S1_j3vecIP9tree_node7va_heap6vl_ptrES7_S2_IP21ipa_agg_jump_function

We take 12 seconds of WPA on GCC (with my fork patch):
Execution times (seconds)
 phase setup             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1412 kB ( 0%) ggc
 phase opt and generate  :   4.48 (37%) usr   0.05 ( 6%) sys   4.57 (34%) wall   42983 kB ( 7%) ggc
 phase stream in         :   7.21 (60%) usr   0.26 (32%) sys   7.47 (56%) wall  565102 kB (93%) ggc
 phase stream out        :   0.38 ( 3%) usr   0.50 (62%) sys   1.37 (10%) wall     623 kB ( 0%) ggc
 callgraph optimization  :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       6 kB ( 0%) ggc
 ipa dead code removal   :   0.46 ( 4%) usr   0.00 ( 0%) sys   0.46 ( 3%) wall       0 kB ( 0%) ggc
 ipa cp                  :   0.36 ( 3%) usr   0.01 ( 1%) sys   0.41 ( 3%) wall   38261 kB ( 6%) ggc
 ipa inlining heuristics :   2.84 (24%) usr   0.05 ( 6%) sys   2.87 (21%) wall   60263 kB (10%) ggc
 ipa lto gimple in       :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :   0.04 ( 0%) usr   0.02 ( 2%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :   6.23 (52%) usr   0.18 (22%) sys   6.40 (48%) wall  425731 kB (70%) ggc
 ipa lto decl out        :   0.09 ( 1%) usr   0.01 ( 1%) sys   0.10 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   0.22 ( 2%) usr   0.02 ( 2%) sys   0.25 ( 2%) wall   60840 kB (10%) ggc
 ipa lto decl merge      :   0.20 ( 2%) usr   0.00 ( 0%) sys   0.20 ( 1%) wall    1051 kB ( 0%) ggc
 ipa lto cgraph merge    :   0.22 ( 2%) usr   0.01 ( 1%) sys   0.25 ( 2%) wall   17676 kB ( 3%) ggc
 whopr wpa               :   0.38 ( 3%) usr   0.00 ( 0%) sys   0.35 ( 3%) wall     626 kB ( 0%) ggc
 whopr wpa I/O           :   0.01 ( 0%) usr   0.47 (58%) sys   0.98 ( 7%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   0.18 ( 1%) usr   0.00 ( 0%) sys   0.19 ( 1%) wall       0 kB ( 0%) ggc
 ipa reference           :   0.31 ( 3%) usr   0.01 ( 1%) sys   0.33 ( 2%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.09 ( 1%) usr   0.01 ( 1%) sys   0.10 ( 1%) wall     150 kB ( 0%) ggc
 ipa pure const          :   0.29 ( 2%) usr   0.00 ( 0%) sys   0.30 ( 2%) wall       0 kB ( 0%) ggc
 tree SSA incremental    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     203 kB ( 0%) ggc
 tree operand scan       :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall    3512 kB ( 1%) ggc
 dominance computation   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.06 ( 0%) usr   0.01 ( 1%) sys   0.09 ( 1%) wall       0 kB ( 0%) ggc
 TOTAL                 :  12.08             0.81            13.43             610123 kB

Inlining heuristics was also around 25% w/o your change.  Timing matches my
experience with firefox - growth estimation tends to be the hot function; with
caching, badness is off the radar.  As such I think the patch is safe to go.
Thank you!
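
To illustrate why a cached value drops out of the profile, here is a minimal
sketch of memoizing a per-node estimate.  It is not GCC's implementation:
GrowthCache, its integer node ids and the callback are hypothetical names
chosen for the example, and the real inliner keeps its own data structures.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical memoizing wrapper; names are illustrative, not GCC's.
class GrowthCache
{
public:
  explicit GrowthCache (std::function<int (int)> estimate)
    : estimate_ (std::move (estimate)) {}

  // Return the estimate for node_id, running the expensive callback
  // only on the first request (or after invalidation).
  int get (int node_id)
  {
    auto it = cache_.find (node_id);
    if (it != cache_.end ())
      return it->second;
    ++misses_;
    int value = estimate_ (node_id);
    cache_.emplace (node_id, value);
    return value;
  }

  // Drop one entry, e.g. after a decision changes that node.
  void invalidate (int node_id) { cache_.erase (node_id); }

  // Number of times the expensive callback actually ran.
  std::size_t misses () const { return misses_; }

private:
  std::function<int (int)> estimate_;
  std::unordered_map<int, int> cache_;
  std::size_t misses_ = 0;
};
```

Repeated lookups for the same node cost one hash probe, so only the first
computation (and recomputation after invalidate) shows up as real work.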


> >
> > I was never really happy about the double use there and in fact the whole fixed
> > point arithmetic in badness computation is a mess.  If we had template based
> > Fibonacci heap and sreal fast enough, turning it all to reals would save quite
> > some maintenance burden.
> 
> Yeah, well.
> 
> Richard.
> 
> > Honza
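
For readers unfamiliar with the underflow problem in the subject line, a small
sketch of a fixed-point key that saturates instead of wrapping may help.  This
is not the patch under discussion: saturating_badness, the scale of 256 and the
benefit/growth parameters are assumptions made up for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, not GCC's code: turn a benefit/growth ratio into a
// fixed-point key where a smaller (more negative) value means "better".
// The result is clamped to the 32-bit range instead of wrapping around.
std::int32_t saturating_badness (std::int32_t benefit, std::int32_t growth)
{
  assert (growth > 0);
  const std::int64_t scale = 256;  // assumed fixed-point scale
  // Do the arithmetic in 64 bits so the intermediate product cannot
  // overflow for any 32-bit inputs.
  const std::int64_t wide
    = -(static_cast<std::int64_t> (benefit) * scale) / growth;
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min ();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max ();
  if (wide < lo)
    return static_cast<std::int32_t> (lo);
  if (wide > hi)
    return static_cast<std::int32_t> (hi);
  return static_cast<std::int32_t> (wide);
}
```

With plain 32-bit arithmetic a very large benefit could wrap the key to the
wrong sign and reorder the heap; clamping keeps the ordering monotone at the
cost of ties at the extremes.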
