Bug 17790

Summary: [4.0/4.1 Regression] Significant compile time increases for sixtrack with tree LICM and IV optimization
Product: gcc
Reporter: Daniel Berlin <dberlin>
Component: tree-optimization
Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED
Severity: normal
CC: gcc-bugs, pinskia, rakdver
Priority: P2
Keywords: compile-time-hog, patch
Version: 4.0.0
Target Milestone: 4.0.3
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-01-15 06:18:20
Bug Depends on:
Bug Blocks: 8361, 18693

Description Daniel Berlin 2004-10-02 02:24:58 UTC
For daten.f from sixtrack, I get

loop invariant motion :  23.03 (17%) usr   0.05 ( 7%) sys  23.82 (16%) wall
tree iv optimization  :  21.03 (15%) usr   0.05 ( 7%) sys  22.13 (16%) wall

On struct-aliasing, which has vuse bypassing enabled (and thus enables more
optimization), the situation is worse:

 loop invariant motion :  23.13 (17%) usr   0.04 (6%) sys  23.94 (15%) wall
 tree iv optimization  :  68.67 (49%) usr   0.29 (40%) sys  74.50 (50%) wall

For maincr.f, we have:

 tree iv optimization  :   9.72 (23%) usr   0.06 (13%) sys  10.10 (23%) wall
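
For comparing timing tables like the ones above across compiler builds, a small parser can help. The following is only a sketch, assuming the row layout quoted in this report (pass name, then usr/sys/wall seconds with percentages); the names `ROW`, `parse_time_report`, and `hot_passes` are hypothetical helpers and not part of GCC.

```python
import re

# One row of a timing table as quoted in this report, for example:
#  tree iv optimization  :  68.67 (49%) usr   0.29 (40%) sys  74.50 (50%) wall
# The layout is assumed from the rows shown above.
ROW = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:"
    r"\s*(?P<usr>\d+\.\d+)\s*\(\s*(?P<usr_pct>\d+)%\)\s*usr"
    r"\s+(?P<sys>\d+\.\d+)\s*\(\s*(?P<sys_pct>\d+)%\)\s*sys"
    r"\s+(?P<wall>\d+\.\d+)\s*\(\s*(?P<wall_pct>\d+)%\)\s*wall"
)


def parse_time_report(text):
    """Return one dict per timing row found in text; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        rows.append({
            "pass": m.group("name"),
            "usr": float(m.group("usr")),
            "usr_pct": int(m.group("usr_pct")),
            "sys": float(m.group("sys")),
            "sys_pct": int(m.group("sys_pct")),
            "wall": float(m.group("wall")),
            "wall_pct": int(m.group("wall_pct")),
        })
    return rows


def hot_passes(text, threshold_pct=5):
    """Passes whose user-time share is at least threshold_pct, slowest first."""
    rows = [r for r in parse_time_report(text) if r["usr_pct"] >= threshold_pct]
    return sorted(rows, key=lambda r: r["usr"], reverse=True)


if __name__ == "__main__":
    # Rows copied from the struct-aliasing measurement above.
    sample = (
        " loop invariant motion :  23.13 (17%) usr   0.04 (6%) sys  23.94 (15%) wall\n"
        " tree iv optimization  :  68.67 (49%) usr   0.29 (40%) sys  74.50 (50%) wall\n"
    )
    for row in hot_passes(sample):
        print(row["pass"], row["usr"], str(row["usr_pct"]) + "%")
```

The threshold is arbitrary; it is only meant to filter out passes that are not worth looking at.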
Comment 1 Zdenek Dvorak 2004-10-17 19:05:35 UTC
I cannot reproduce the ivopts problem on daten.f (ivopts are <2% for me, which
is not great, but also not so terrible).  IM problem reproduces.

maincr.f currently runs out of memory for me.
Comment 2 Zdenek Dvorak 2004-10-17 19:20:24 UTC
Actually, maincr.f does not run out of memory; it causes a segfault during
garbage collection (infinite recursion).
Comment 3 Daniel Berlin 2004-10-17 19:44:19 UTC
The ivopts stuff may have been fixed by your ivopts patch for important candidates.
I'll try maincr again.
Comment 4 Zdenek Dvorak 2004-10-17 20:16:36 UTC
IM problem seems to be caused by some inefficiency in store motion (I suspect
scanning loop repeatedly for various insignificant virtual operands).  Anyway,
the patch for PR 17133 (complete rewrite of store motion) fixes this (reduces
the compile time to <1%).

http://gcc.gnu.org/ml/gcc-patches/2004-09/msg01120.html
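
To illustrate the kind of inefficiency suspected above, here is a toy model only. It is not GCC code and does not reflect the actual store motion implementation. It assumes a loop body represented as a list of `(stmt_id, virtual_operands)` pairs, and compares rescanning the whole body once per virtual operand with collecting all references in a single walk. All names are hypothetical.

```python
from collections import defaultdict


def refs_by_rescanning(stmts):
    """Walk the whole loop body once per virtual operand (quadratic-style)."""
    operands = sorted({op for _, ops in stmts for op in ops})
    visits = 0  # statement visits made by the per-operand scans
    refs = {}
    for op in operands:
        found = []
        for stmt_id, ops in stmts:
            visits += 1
            if op in ops:
                found.append(stmt_id)
        refs[op] = found
    return refs, visits


def refs_by_single_pass(stmts):
    """Walk the loop body once and group statements by operand."""
    refs = defaultdict(list)
    visits = 0  # statement visits made by the single walk
    for stmt_id, ops in stmts:
        visits += 1
        for op in ops:
            refs[op].append(stmt_id)
    return dict(refs), visits


if __name__ == "__main__":
    # A made-up loop body: 1000 statements, each touching its own operand.
    body = [(i, {"v%d" % i}) for i in range(1000)]
    slow, slow_visits = refs_by_rescanning(body)
    fast, fast_visits = refs_by_single_pass(body)
    print(slow == fast, slow_visits, fast_visits)
```

Both functions compute the same mapping; only the number of statement visits differs, and it grows with the product of operands and statements in the rescanning variant. Large generated Fortran routines with many distinct memory references would be the bad case in this model.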
Comment 5 Andrew Pinski 2004-11-25 20:47:23 UTC
I see LICM on some other code high up on the radar.
Comment 6 Andrew Pinski 2004-12-06 00:02:04 UTC
Hmm, I found another testcase where we are slow at LIM:
 loop invariant motion :   2.55 ( 5%) usr   0.40 ( 3%) sys   3.36 ( 4%) wall
This is PR8361.

Zdenek, can you update your patch for the V_MUST_DEF changes and see what
compile-time improvements you get with it?
Comment 7 Zdenek Dvorak 2004-12-06 00:12:55 UTC
Subject: Re:  [4.0 Regression] Significant compile time increases for sixtrack with tree LICM and IV optimization

> ------- Additional Comments From pinskia at gcc dot gnu dot org  2004-12-06 00:02 -------
> Hmm, I found another testcase where we are slow at LIM:
>  loop invariant motion :   2.55 ( 5%) usr   0.40 ( 3%) sys   3.36 ( 4%) wall
> This is PR8361.
> 
> Zdenek, can you update your patch for the V_MUST_DEF changes and see what
> compile-time improvements you get with it?

there is an updated version of the patch

http://gcc.gnu.org/ml/gcc-patches/2004-10/msg01642.html

that should work (possibly with minor changes due to some renaming).
Comment 8 Steven Bosscher 2005-02-02 08:13:31 UTC
Any news here?  This is one of the more serious compile-time problems
in GCC 4; I've seen a number of cases where these passes are high up in
the profile.
Comment 9 Zdenek Dvorak 2005-02-02 08:38:20 UTC
Subject: Re:  [4.0 Regression] Significant compile time increases for sixtrack with tree LICM and IV optimization

> Any news here?  This is one of the more serious compile-time problems
> in GCC 4; I've seen a number of cases where these passes are high up in
> the profile.

As for ivopts, the problems reported under this PR are solved.  So if
you have a testcase where ivopts eat more than 1% of time without a good
reason, please let me know.

I will try to update and resend the patch for inefficiency in store
motion.
Comment 10 Steven Bosscher 2005-02-02 09:17:06 UTC
PR 18687 is one example where ivopts takes a significant amount of time (9%).
Comment 11 Zdenek Dvorak 2005-02-06 20:25:39 UTC
Updated version of the patch:

http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00205.html
Comment 12 Steven Bosscher 2005-02-23 09:25:56 UTC
Is this patch still 4.0 material? No reviewers have looked at it yet :-/
Comment 13 Andrew Pinski 2005-07-25 04:13:55 UTC
Does anyone have new numbers?
Comment 14 Richard Biener 2005-09-12 12:02:33 UTC
Current mainline with -O3 -funroll-loops daten.f takes 3.6s to compile.  

 loop invariant motion :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   0 kB ( 0%) ggc
 tree canonical iv     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall   8 kB ( 0%) ggc

Multiple compilations finally produce an evenly distributed profile:

samples  %        image name               symbol name
91        4.8097  no-vmlinux               (no symbols)
32        1.6913  f951                     cse_insn
29        1.5328  f951                     count_reg_usage
28        1.4799  f951                     mark_set_1
26        1.3742  f951                     constrain_operands
23        1.2156  f951                     bitmap_bit_p
22        1.1628  f951                     find_reg_note
22        1.1628  f951                     init_alias_analysis
20        1.0571  f951                     for_each_rtx_1
19        1.0042  f951                     invalidate
Comment 15 Daniel Berlin 2005-10-04 12:26:20 UTC
I think we can call this one fixed for now. I'll reopen it if it goes crazy again.