The change is visible at https://gcc.opensuse.org/gcc-old/SPEC/CINT/sb-czerny-head-64/recent.html https://gcc.opensuse.org/gcc-old/SPEC/CINT/sb-czerny-head-64/size.html The affected run is Jan 2 2018 https://gcc.opensuse.org/gcc-old/SPEC/CINT/sb-czerny-head-64/list.html
Also visible here: https://vmakarov.fedorapeople.org/spec/spec2000.topka/gcc/home.html
One potential candidate is the ipa-inline big_speedup_p fix by Richard. It does not seem to explain all the difference though. I have quickly checked that it does affect gzip LTO binary, but less than reported - by about 2.5KB.
403.gcc with -flto is now ~6% slower.
I see change in r256072 that increased .text in gromacs from: 83.6% 743Ki .text 743Ki 14.4% to 83.8% 760Ki .text 760Ki 14.4%
Bisection of one or another example with the big_speedup_p fix manually patched in would be appreciated I guess.
Ok, for the gromacs r255102: .text size: 737B r255103 + fix for sreal (r256072): .text size: 752B (+2.03%). Honza can you please take a look?
Another experiment: r255024: 741B r255024 + patch r255103 + patch r256072: 752B
On Mon, 8 Jan 2018, marxin at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83665 > > --- Comment #6 from Martin Liška <marxin at gcc dot gnu.org> --- > Ok, for the gromacs > > r255102: .text size: 737B > r255103 + fix for sreal (r256072): .text size: 752B (+2.03%). > > Honza can you please take a look? suspicous chage: @@ -1151,7 +1118,7 @@ edge_badness (struct cgraph_edge *edge, overall_growth += 256 * 256 - 256; denominator *= overall_growth; } - denominator *= inlined_time; + /*denominator *= inlined_time;*/ badness = - numerator / denominator; does uncommenting fix that?
I have noticed that accidental change and plan to deal with it while retuning inliner now. I think overall the problem is caused by the fact that old code capped time to 10000. This means that most larger functions was estimated to have time 10000 and inlining seemed to produce no speedup. With sreals we keep the ratios more realistic and see more benefits from inlining and thus get big speedup more often. I will see what I can do with the default value wrt code size. Thanks for bisecting this!
Unsure what to do about this bug -- things have gone back a bit but as you say the cap introduced artificial limit back in times. But we do have large-function-growth to limit big_speedup for large functions, no? There was big size improvement with the big_speedup fuckup and comparing to before that we only "regressed" for a few cases, like overall size is still improving. So is this fixed or do you plan more changes?
Yes, I have started experiments with adjusting inliner limits: hopefully we can tune up speedup limit because it is now applied more agresively (due to lack of capping) and perhaps tune down size limits because we do better early opts and time/size estimations. Last week we did some work with Martin Liska verifying that profile propagation now works as expected and things seems to be in order now in ways we can test (i.e. no zero profiles on places where they should not be and IPA/BB profiles are in sync). Before those bugs was chased away inliner retuning made little sense. From experiments this week I know that tramp3d still need early inlining of 14, but other limits seems to have gained quite some border on spec2000/2006. I would like to use czerny for searching the space this week and hope that tuning some parameters down is still OK at this stage?
On Tue, 30 Jan 2018, hubicka at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83665 > > --- Comment #11 from Jan Hubicka <hubicka at gcc dot gnu.org> --- > Yes, I have started experiments with adjusting inliner limits: hopefully we can > tune up speedup limit because it is now applied more agresively (due to lack of > capping) and perhaps tune down size limits because we do better early opts and > time/size estimations. > > Last week we did some work with Martin Liska verifying that profile propagation > now works as expected and things seems to be in order now in ways we can test > (i.e. no zero profiles on places where they should not be and IPA/BB profiles > are in sync). Before those bugs was chased away inliner retuning made little > sense. > > From experiments this week I know that tramp3d still need early inlining of 14, > but other limits seems to have gained quite some border on spec2000/2006. I > would like to use czerny for searching the space this week and hope that tuning > some parameters down is still OK at this stage? I think we can tolerate one change in terms of addressing code size regressions but it's really late now already. So if you end up with profitable changes please post the patch and ask for an OK.
I have started with experiments on czerny. Set --param inline-min-speedup from 8 to 15 at 30th of January and to 30 yesterday. Most of size regression comes away with 15 and I observed to off-noise regressions. GCC of spec2006 improves 32.5->34.5 With 30 there are regressions in botan (but not wrt earlier releases, just wrt best values we got) that are bit random (improvements too) and mesa of spec2000 5200->4950. There is still noticeable code size improvements and also facerec improves 6400->8000. (may be independent as it fixes regression from 19th) So consistent improvement is possible with 15. I will also check inline-insns-auto/single and arrive to patch hopefully by end of weekend. Honza
Author: hubicka Date: Mon Feb 12 09:48:06 2018 New Revision: 257582 URL: https://gcc.gnu.org/viewcvs?rev=257582&root=gcc&view=rev Log: PR middle-end/83665 * params.def (inline-min-speedup): Increase from 8 to 15. (max-inline-insns-auto): Decrease from 40 to 30. * ipa-split.c (consider_split): Add some buffer for function to be considered inlining candidate. * invoke.texi (max-inline-insns-auto, inline-min-speedup): UPdate default values. Modified: trunk/gcc/ChangeLog trunk/gcc/doc/invoke.texi trunk/gcc/ipa-split.c trunk/gcc/params.def
Honza, may I close this as fixed?
(In reply to Jan Hubicka from comment #14) > Author: hubicka > Date: Mon Feb 12 09:48:06 2018 > New Revision: 257582 > > URL: https://gcc.gnu.org/viewcvs?rev=257582&root=gcc&view=rev > Log: > > PR middle-end/83665 > * params.def (inline-min-speedup): Increase from 8 to 15. > (max-inline-insns-auto): Decrease from 40 to 30. > * ipa-split.c (consider_split): Add some buffer for function to > be considered inlining candidate. > * invoke.texi (max-inline-insns-auto, inline-min-speedup): UPdate > default values. > > Modified: > trunk/gcc/ChangeLog > trunk/gcc/doc/invoke.texi > trunk/gcc/ipa-split.c > trunk/gcc/params.def This change is responsible for a 6% degradation in CPU2000 175.vpr and a 12% degradation in CPU2006 401.bzip2. Both run on a Power7 box.
Pat, please open a new bug for the regression caused by the fix. Closing original bug, IIRC numbers now show things are reasonable.
(In reply to Richard Biener from comment #17) > Pat, please open a new bug for the regression caused by the fix. Done, pr85103.