This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Change the badness computation to ensure no integer-underflow
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, Dehao Chen <dehao at google dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 28 Aug 2013 19:11:08 +0200
- Subject: Re: [PATCH] Change the badness computation to ensure no integer-underflow
> > I am giving the patch brief benchmarking on profiledbootstrap and if it won't
> > cause a major regression, I think we should go ahead with the patch.
Uhm, I profiledbootstrapped and we are a bit too fast to get a reasonable oprofile. What I get is:
7443 9.4372 lto1 lto1 lto_end_uncompression(lto_compression_stream*)
4438 5.6271 lto1 lto1 _ZL14DFS_write_treeP12output_blockP4sccsP9tree_nodebb.lto_priv.4993
2351 2.9809 lto1 lto1 lto_output_tree(output_block*, tree_node*, bool, bool)
2179 2.7628 lto1 lto1 _ZL30linemap_macro_loc_to_exp_pointP9line_mapsjPPK8line_map.lto_priv.7860
1910 2.4217 lto1 lto1 _ZL19unpack_value_fieldsP7data_inP9bitpack_dP9tree_node.lto_priv.7292
1855 2.3520 libc-2.11.1.so libc-2.11.1.so msort_with_tmp
1531 1.9412 lto1 lto1 streamer_string_index(output_block*, char const*, unsigned int, bool)
1530 1.9399 libc-2.11.1.so libc-2.11.1.so _int_malloc
1471 1.8651 lto1 lto1 do_estimate_growth(cgraph_node*)
1306 1.6559 lto1 lto1 pointer_map_insert(pointer_map_t*, void const*)
1238 1.5697 lto1 lto1 _Z28streamer_pack_tree_bitfieldsP12output_blockP9bitpack_dP9tree_node.constprop.1086
1138 1.4429 lto1 lto1 compare_tree_sccs_1(tree_node*, tree_node*, tree_node***)
1082 1.3719 lto1 lto1 streamer_write_tree_body(output_block*, tree_node*, bool)
1044 1.3237 lto1 lto1 _ZL28estimate_calls_size_and_timeP11cgraph_nodePiS1_S1_j3vecIP9tree_node7va_heap6vl_ptrES7_S2_IP21ipa_agg_jump_function
WPA on GCC takes about 12 seconds (with my fork patch):
Execution times (seconds)
phase setup : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 1412 kB ( 0%) ggc
phase opt and generate : 4.48 (37%) usr 0.05 ( 6%) sys 4.57 (34%) wall 42983 kB ( 7%) ggc
phase stream in : 7.21 (60%) usr 0.26 (32%) sys 7.47 (56%) wall 565102 kB (93%) ggc
phase stream out : 0.38 ( 3%) usr 0.50 (62%) sys 1.37 (10%) wall 623 kB ( 0%) ggc
callgraph optimization : 0.04 ( 0%) usr 0.00 ( 0%) sys 0.03 ( 0%) wall 6 kB ( 0%) ggc
ipa dead code removal : 0.46 ( 4%) usr 0.00 ( 0%) sys 0.46 ( 3%) wall 0 kB ( 0%) ggc
ipa cp : 0.36 ( 3%) usr 0.01 ( 1%) sys 0.41 ( 3%) wall 38261 kB ( 6%) ggc
ipa inlining heuristics : 2.84 (24%) usr 0.05 ( 6%) sys 2.87 (21%) wall 60263 kB (10%) ggc
ipa lto gimple in : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
ipa lto gimple out : 0.04 ( 0%) usr 0.02 ( 2%) sys 0.06 ( 0%) wall 0 kB ( 0%) ggc
ipa lto decl in : 6.23 (52%) usr 0.18 (22%) sys 6.40 (48%) wall 425731 kB (70%) ggc
ipa lto decl out : 0.09 ( 1%) usr 0.01 ( 1%) sys 0.10 ( 1%) wall 0 kB ( 0%) ggc
ipa lto cgraph I/O : 0.22 ( 2%) usr 0.02 ( 2%) sys 0.25 ( 2%) wall 60840 kB (10%) ggc
ipa lto decl merge : 0.20 ( 2%) usr 0.00 ( 0%) sys 0.20 ( 1%) wall 1051 kB ( 0%) ggc
ipa lto cgraph merge : 0.22 ( 2%) usr 0.01 ( 1%) sys 0.25 ( 2%) wall 17676 kB ( 3%) ggc
whopr wpa : 0.38 ( 3%) usr 0.00 ( 0%) sys 0.35 ( 3%) wall 626 kB ( 0%) ggc
whopr wpa I/O : 0.01 ( 0%) usr 0.47 (58%) sys 0.98 ( 7%) wall 0 kB ( 0%) ggc
whopr partitioning : 0.18 ( 1%) usr 0.00 ( 0%) sys 0.19 ( 1%) wall 0 kB ( 0%) ggc
ipa reference : 0.31 ( 3%) usr 0.01 ( 1%) sys 0.33 ( 2%) wall 0 kB ( 0%) ggc
ipa profile : 0.09 ( 1%) usr 0.01 ( 1%) sys 0.10 ( 1%) wall 150 kB ( 0%) ggc
ipa pure const : 0.29 ( 2%) usr 0.00 ( 0%) sys 0.30 ( 2%) wall 0 kB ( 0%) ggc
tree SSA incremental : 0.02 ( 0%) usr 0.00 ( 0%) sys 0.00 ( 0%) wall 203 kB ( 0%) ggc
tree operand scan : 0.00 ( 0%) usr 0.01 ( 1%) sys 0.00 ( 0%) wall 3512 kB ( 1%) ggc
dominance computation : 0.01 ( 0%) usr 0.00 ( 0%) sys 0.02 ( 0%) wall 0 kB ( 0%) ggc
varconst : 0.00 ( 0%) usr 0.00 ( 0%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc
unaccounted todo : 0.06 ( 0%) usr 0.01 ( 1%) sys 0.09 ( 1%) wall 0 kB ( 0%) ggc
TOTAL : 12.08 0.81 13.43 610123 kB
Inlining heuristics was also around 25% w/o your change. The timing matches my
experience with Firefox: growth estimation tends to be the hot function, and with
caching, badness is off the radar. As such I think the patch is safe to go.
Thank you!
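
As a rough illustration of why caching takes an estimate off the profile, here is a minimal sketch of memoizing an expensive per-node computation. Everything in it (Node, estimate_growth_slow, GrowthCache) is a hypothetical name made up for this example; it is not GCC's cgraph_node or its actual cache.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for a call graph node; not GCC's cgraph_node.
struct Node {
  int id;         // unique key used by the cache
  int size;       // estimated body size
  int n_callers;  // number of call sites
};

// Counts how often the expensive path runs (for illustration only).
static int slow_calls = 0;

// Hypothetical expensive estimate: code growth if the node is inlined
// into every caller and the offline copy is then removed.
static int estimate_growth_slow(const Node &n) {
  assert(n.n_callers >= 1);
  ++slow_calls;
  return n.size * (n.n_callers - 1);
}

// Memoizes the estimate per node id.
class GrowthCache {
public:
  int get(const Node &n) {
    auto it = cache_.find(n.id);
    if (it != cache_.end())
      return it->second;  // cache hit: no recomputation
    int growth = estimate_growth_slow(n);
    cache_.emplace(n.id, growth);
    return growth;
  }

  // Must be called whenever the node's size or caller count changes,
  // otherwise get() keeps returning the stale value.
  void invalidate(int id) { cache_.erase(id); }

private:
  std::unordered_map<int, int> cache_;
};
```

Repeated queries for the same node then cost one hash lookup. The price is that every transformation that changes a node has to remember to invalidate its entry.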
> >
> > I was never really happy about the double use there and in fact the whole fixed
> > point arithmetic in badness computation is a mess. If we had a template-based
> > Fibonacci heap and sreal fast enough, turning it all to reals would save quite
> > some maintenance burden.
>
> Yeah, well.
>
> Richard.
>
> > Honza
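
For readers unfamiliar with the fixed point concern in the quoted text, the following is a minimal sketch of a scaled integer ratio that is computed in 64 bits and clamped before narrowing, so that the result cannot wrap. The formula, the SCALE constant and the function names are assumptions made for illustration only; this is not the badness formula from the patch or from GCC.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical fixed-point scale; not the constant GCC uses.
constexpr std::int64_t SCALE = 1 << 16;

// Clamp a 64-bit value into the 32-bit range so narrowing cannot wrap.
static std::int32_t saturate_to_int32(std::int64_t v) {
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  if (v < lo) return std::numeric_limits<std::int32_t>::min();
  if (v > hi) return std::numeric_limits<std::int32_t>::max();
  return static_cast<std::int32_t>(v);
}

// Illustrative badness: more negative means a more attractive candidate.
// Assumes benefit >= 0 and growth > 0.
static std::int32_t badness_fixed_point(std::int32_t benefit,
                                        std::int32_t growth) {
  assert(benefit >= 0 && growth > 0);
  // Widen before scaling: benefit < 2^31 and SCALE = 2^16, so the
  // product stays below 2^47 and cannot overflow 64 bits.
  std::int64_t scaled = static_cast<std::int64_t>(benefit) * SCALE;
  return saturate_to_int32(-(scaled / growth));
}
```

With 32-bit arithmetic alone, the multiplication by the scale could overflow for large benefits. Widening and saturating keeps the ordering of candidates monotone at the cost of losing resolution at the extremes.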