Bug 83665 - [8 regression] Big code size regression and some code quality improvement at Jan 2 2018
Summary: [8 regression] Big code size regression and some code quality improvement at ...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 8.0
: P3 normal
Target Milestone: 8.0
Assignee: Jan Hubicka
URL:
Keywords:
Depends on: 79224
Blocks: spec 84613
  Show dependency treegraph
 
Reported: 2018-01-03 10:39 UTC by Jan Hubicka
Modified: 2018-03-27 22:19 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-01-04 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Comment 1 Markus Trippelsdorf 2018-01-03 10:41:22 UTC
Also visible here:
https://vmakarov.fedorapeople.org/spec/spec2000.topka/gcc/home.html
Comment 2 Jan Hubicka 2018-01-03 11:33:08 UTC
One potential candidate is the ipa-inline big_speedup_p fix by Richard.  It does not seem to explain all the difference though. I have quickly checked that it does affect gzip LTO binary, but less than reported - by about 2.5KB.
Comment 3 Markus Trippelsdorf 2018-01-04 10:25:39 UTC
403.gcc with -flto is now ~6% slower.
Comment 4 Martin Liška 2018-01-04 10:59:37 UTC
I see change in r256072 that increased .text in gromacs from:
  83.6%   743Ki .text           743Ki  14.4%

to

  83.8%   760Ki .text           760Ki  14.4%
Comment 5 Richard Biener 2018-01-05 12:49:08 UTC
Bisection of one or another example with the big_speedup_p fix manually patched in would be appreciated I guess.
Comment 6 Martin Liška 2018-01-08 13:54:44 UTC
Ok, for the gromacs

r255102: .text size: 737B
r255103 + fix for sreal (r256072): .text size: 752B (+2.03%).

Honza can you please take a look?
Comment 7 Martin Liška 2018-01-08 14:33:46 UTC
Another experiment:

r255024: 741B
r255024 + patch r255103 + patch r256072: 752B
Comment 8 rguenther@suse.de 2018-01-08 15:31:40 UTC
On Mon, 8 Jan 2018, marxin at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83665
> 
> --- Comment #6 from Martin Liška <marxin at gcc dot gnu.org> ---
> Ok, for the gromacs
> 
> r255102: .text size: 737B
> r255103 + fix for sreal (r256072): .text size: 752B (+2.03%).
> 
> Honza can you please take a look?

suspicous chage:

@@ -1151,7 +1118,7 @@ edge_badness (struct cgraph_edge *edge,
            overall_growth += 256 * 256 - 256;
          denominator *= overall_growth;
         }
-      denominator *= inlined_time;
+      /*denominator *= inlined_time;*/
 
       badness = - numerator / denominator;
 

does uncommenting fix that?
Comment 9 Jan Hubicka 2018-01-08 16:51:44 UTC
I have noticed that accidental change and plan to deal with it while retuning inliner now.

I think overall the problem is caused by the fact that old code capped time to 10000. This means that most larger functions was estimated to have time 10000 and inlining seemed to produce no speedup.

With sreals we keep the ratios more realistic and see more benefits from inlining and thus get big speedup more often.  I will see what I can do with the default value wrt code size.

Thanks for bisecting this!
Comment 10 Richard Biener 2018-01-15 13:47:03 UTC
Unsure what to do about this bug -- things have gone back a bit but as you say the cap introduced artificial limit back in times.  But we do have large-function-growth to limit big_speedup for large functions, no?

There was big size improvement with the big_speedup fuckup and comparing to before that we only "regressed" for a few cases, like overall size is still improving.

So is this fixed or do you plan more changes?
Comment 11 Jan Hubicka 2018-01-30 10:05:36 UTC
Yes, I have started experiments with adjusting inliner limits: hopefully we can tune up speedup limit because it is now applied more agresively (due to lack of capping) and perhaps tune down size limits because we do better early opts and time/size estimations.

Last week we did some work with Martin Liska verifying that profile propagation now works as expected and things seems to be in order now in ways we can test (i.e. no zero profiles on places where they should not be and IPA/BB profiles are in sync). Before those bugs was chased away inliner retuning made little sense.

From experiments this week I know that tramp3d still need early inlining of 14, but other limits seems to have gained quite some border on spec2000/2006.  I would like to use czerny for searching the space this week and hope that tuning some parameters down is still OK at this stage?
Comment 12 rguenther@suse.de 2018-01-30 10:10:53 UTC
On Tue, 30 Jan 2018, hubicka at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83665
> 
> --- Comment #11 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
> Yes, I have started experiments with adjusting inliner limits: hopefully we can
> tune up speedup limit because it is now applied more agresively (due to lack of
> capping) and perhaps tune down size limits because we do better early opts and
> time/size estimations.
> 
> Last week we did some work with Martin Liska verifying that profile propagation
> now works as expected and things seems to be in order now in ways we can test
> (i.e. no zero profiles on places where they should not be and IPA/BB profiles
> are in sync). Before those bugs was chased away inliner retuning made little
> sense.
> 
> From experiments this week I know that tramp3d still need early inlining of 14,
> but other limits seems to have gained quite some border on spec2000/2006.  I
> would like to use czerny for searching the space this week and hope that tuning
> some parameters down is still OK at this stage?

I think we can tolerate one change in terms of addressing code size
regressions but it's really late now already.  So if you end up
with profitable changes please post the patch and ask for an OK.
Comment 13 Jan Hubicka 2018-02-01 14:11:44 UTC
I have started with experiments on czerny. Set --param inline-min-speedup from 8 to 15 at 30th of January and to 30 yesterday.
Most of size regression comes away with 15 and I observed to off-noise regressions.  GCC of spec2006 improves 32.5->34.5
With 30 there are regressions in botan (but not wrt earlier releases, just wrt best values we got) that are bit random (improvements too) and mesa of spec2000 5200->4950.
There is still noticeable code size improvements and also facerec improves 6400->8000. (may be independent as it fixes regression from 19th)

So consistent improvement is possible with 15. I will also check inline-insns-auto/single and arrive to patch hopefully by end of weekend.

Honza
Comment 14 Jan Hubicka 2018-02-12 09:49:04 UTC
Author: hubicka
Date: Mon Feb 12 09:48:06 2018
New Revision: 257582

URL: https://gcc.gnu.org/viewcvs?rev=257582&root=gcc&view=rev
Log:

	PR middle-end/83665
	* params.def (inline-min-speedup): Increase from 8 to 15.
	(max-inline-insns-auto): Decrease from 40 to 30.
	* ipa-split.c (consider_split): Add some buffer for function to
	be considered inlining candidate.
	* invoke.texi (max-inline-insns-auto, inline-min-speedup): UPdate
	default values.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/doc/invoke.texi
    trunk/gcc/ipa-split.c
    trunk/gcc/params.def
Comment 15 Martin Liška 2018-03-19 14:30:43 UTC
Honza, may I close this as fixed?
Comment 16 Pat Haugen 2018-03-26 21:26:11 UTC
(In reply to Jan Hubicka from comment #14)
> Author: hubicka
> Date: Mon Feb 12 09:48:06 2018
> New Revision: 257582
> 
> URL: https://gcc.gnu.org/viewcvs?rev=257582&root=gcc&view=rev
> Log:
> 
> 	PR middle-end/83665
> 	* params.def (inline-min-speedup): Increase from 8 to 15.
> 	(max-inline-insns-auto): Decrease from 40 to 30.
> 	* ipa-split.c (consider_split): Add some buffer for function to
> 	be considered inlining candidate.
> 	* invoke.texi (max-inline-insns-auto, inline-min-speedup): UPdate
> 	default values.
> 
> Modified:
>     trunk/gcc/ChangeLog
>     trunk/gcc/doc/invoke.texi
>     trunk/gcc/ipa-split.c
>     trunk/gcc/params.def

This change is responsible for a 6% degradation in CPU2000 175.vpr and a 12% degradation in CPU2006 401.bzip2. Both run on a Power7 box.
Comment 17 Richard Biener 2018-03-27 09:42:51 UTC
Pat, please open a new bug for the regression caused by the fix.

Closing original bug, IIRC numbers now show things are reasonable.
Comment 18 Pat Haugen 2018-03-27 22:19:25 UTC
(In reply to Richard Biener from comment #17)
> Pat, please open a new bug for the regression caused by the fix.

Done, pr85103.