Bug 33508 - [4.3 Regression] tree struct aliasing goes into a loop marking call clobbers.
Summary: [4.3 Regression] tree struct aliasing goes into a loop marking call clobbers.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: 4.3.0
Assignee: Richard Biener
URL:
Keywords: alias, compile-time-hog
Depends on:
Blocks:
 
Reported: 2007-09-20 10:35 UTC by Ramana Radhakrishnan
Modified: 2007-09-21 09:38 UTC (History)
6 users (show)

See Also:
Host: i686-linux-gnu
Target: arm-none-eabi
Build: i686-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2007-09-20 12:06:32


Attachments
testcase. (183.35 KB, application/x-bzip)
2007-09-20 10:44 UTC, Ramana Radhakrishnan
Details
patch fixing the problem (1.13 KB, patch)
2007-09-20 12:07 UTC, Richard Biener
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ramana Radhakrishnan 2007-09-20 10:35:02 UTC
Using head from today to build a telecom stack on the ARM port.

It spends nearly 100% of time in tree call clobbering for a particularly nasty testcase which I am attaching. 

Almost every time I do a Ctrl-C in gdb I hit myself in the function set_initial_properties with a similar backtrace.  

. #0  0x0847f0da in var_ann (t=0xb6f433c8)
    at ../../gnu/gcc/tree-flow-inline.h:204
#1  0x0847eee5 in mark_call_clobbered (var=0xb6f433c8, escape_type=256)
    at ../../gnu/gcc/tree-flow-inline.h:878
#2  0x08480ce6 in set_initial_properties (ai=0x8a8b280)
    at ../../gnu/gcc/tree-ssa-alias.c:565
#3  0x08481198 in compute_call_clobbered (ai=0x8a8b280)
    at ../../gnu/gcc/tree-ssa-alias.c:619


in the hunk 

  if (TREE_CODE (alias) == STRUCT_FIELD_TAG)
    {
      subvar_t svars;
      svars = get_subvars_for_var (SFT_PARENT_VAR (alias));
      for (; svars; svars = svars->next)
if (!unmodifiable_var_p (alias))
  mark_call_clobbered (svars->var, pi->escape_mask);
    }
  else if (!unmodifiable_var_p (alias))
    mark_call_clobbered (alias, pi->escape_mask);




I tried by reverting the patch from r127629 but that didn't help straight off. Doing a binary search I got to the versions r127831-37 :

r127831 is the last revision of the tree that builds this testcase without any compile time regressions. 

r127836 is the first revision of the tree where this goes awry with a huge compile time regression . 

versions (r127832-35) do not build for undeclared references to forbidden_auto_inc_dec_classes in regclass.c. I haven't yet seen what's gone into these commits to make any informed decisions. 






This is the statistic with the 


  garbage collection    :   0.43 ( 0%) usr   0.03 ( 0%) sys   4.51 ( 0%) wall       0 kB ( 0%) ggc
callgraph construction:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      69 kB ( 0%) ggc
callgraph optimization:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      51 kB ( 0%) ggc
ipa type escape       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
cfg cleanup           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       2 kB ( 0%) ggc
CFG verifier          :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
trivially dead code   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
df live regs          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
df live&initialized regs:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
df use-def / def-use chains:   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
df reg dead/unused notes:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall      19 kB ( 0%) ggc
register information  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
alias analysis        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      88 kB ( 0%) ggc
preprocessing         :   0.13 ( 0%) usr   0.04 ( 0%) sys   0.19 ( 0%) wall    1636 kB ( 4%) ggc
lexical analysis      :   0.05 ( 0%) usr   0.10 ( 1%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
parser                :   0.22 ( 0%) usr   0.09 ( 1%) sys   0.39 ( 0%) wall    7237 kB (19%) ggc
inline heuristics     :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall     325 kB ( 1%) ggc
integration           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      91 kB ( 0%) ggc
tree CFG cleanup      :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       3 kB ( 0%) ggc
tree VRP              :   0.04 ( 0%) usr   0.00 ( 0%) sys   1.11 ( 0%) wall     338 kB ( 1%) ggc
tree copy propagation :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall      20 kB ( 0%) ggc
tree PTA              :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     123 kB ( 0%) ggc
tree alias analysis   :   0.53 ( 0%) usr   0.00 ( 0%) sys   1.27 ( 0%) wall      23 kB ( 0%) ggc
tree call clobbering  :3604.47 (100%) usr  12.38 (95%) sys4166.01 (99%) wall       0 kB ( 0%) ggc
tree flow sensitive alias:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall     150 kB ( 0%) ggc
tree flow insensitive alias:   0.54 ( 0%) usr   0.00 ( 0%) sys   0.73 ( 0%) wall       0 kB ( 0%) ggc
tree memory partitioning:   1.91 ( 0%) usr   0.04 ( 0%) sys   3.32 ( 0%) wall       2 kB ( 0%) ggc
tree SSA rewrite      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall     338 kB ( 1%) ggc
tree SSA incremental  :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.30 ( 0%) wall      17 kB ( 0%) ggc
tree operand scan     :   1.96 ( 0%) usr   0.04 ( 0%) sys   4.16 ( 0%) wall     507 kB ( 1%) ggc
dominator optimization:   0.58 ( 0%) usr   0.00 ( 0%) sys   0.78 ( 0%) wall     234 kB ( 1%) ggc
tree SRA              :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.12 ( 0%) wall       1 kB ( 0%) ggc
tree CCP              :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      17 kB ( 0%) ggc
tree reassociation    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.17 ( 0%) wall      11 kB ( 0%) ggc
tree PRE              :   0.15 ( 0%) usr   0.00 ( 0%) sys   0.23 ( 0%) wall     192 kB ( 1%) ggc
tree FRE              :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.42 ( 0%) wall     113 kB ( 0%) ggc
tree code sinking     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       5 kB ( 0%) ggc
tree forward propagate:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall      40 kB ( 0%) ggc
tree conservative DCE :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
tree loop bounds      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       4 kB ( 0%) ggc
tree iv optimization  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      38 kB ( 0%) ggc
tree loop init        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      12 kB ( 0%) ggc
tree SSA to normal    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.52 ( 0%) wall      31 kB ( 0%) ggc
tree SSA verifier     :   4.07 ( 0%) usr   0.27 ( 2%) sys   5.43 ( 0%) wall      99 kB ( 0%) ggc
tree STMT verifier    :   0.18 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
dominance computation :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
expand                :   0.96 ( 0%) usr   0.06 ( 0%) sys   9.88 ( 0%) wall   25260 kB (66%) ggc
forward prop          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      32 kB ( 0%) ggc
CSE                   :   0.06 ( 0%) usr   0.01 ( 0%) sys   0.46 ( 0%) wall      18 kB ( 0%) ggc
dead code elimination :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
dead store elim1      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall      17 kB ( 0%) ggc
dead store elim2      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      24 kB ( 0%) ggc
loop analysis         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      15 kB ( 0%) ggc
CPROP 1               :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      15 kB ( 0%) ggc
PRE                   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.14 ( 0%) wall       6 kB ( 0%) ggc
CPROP 2               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall      15 kB ( 0%) ggc
bypass jumps          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      14 kB ( 0%) ggc
auto inc dec          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       7 kB ( 0%) ggc
CSE 2                 :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       8 kB ( 0%) ggc
combiner              :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.38 ( 0%) wall      37 kB ( 0%) ggc
regmove               :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       2 kB ( 0%) ggc
scheduling            :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall      39 kB ( 0%) ggc
local alloc           :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      98 kB ( 0%) ggc
global alloc          :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.26 ( 0%) wall      35 kB ( 0%) ggc
reload CSE regs       :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.13 ( 0%) wall      52 kB ( 0%) ggc
thread pro- & epilogue:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      34 kB ( 0%) ggc
peephole 2            :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
rename registers      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
scheduling 2          :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall       7 kB ( 0%) ggc
machine dep reorg     :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       4 kB ( 0%) ggc
final                 :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.74 ( 0%) wall       0 kB ( 0%) ggc
symout                :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.22 ( 0%) wall       0 kB ( 0%) ggc
TOTAL                 :3617.35            13.09          4205.33              38504 kB
Comment 1 Ramana Radhakrishnan 2007-09-20 10:44:55 UTC
Created attachment 14229 [details]
testcase.
Comment 2 Richard Biener 2007-09-20 11:50:02 UTC
At least the unmodifiable_var_p (alias) call looks suspicious (but probably
unrelated to the problem):

Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c    (revision 128618)
+++ tree-ssa-alias.c    (working copy)
@@ -561,7 +561,7 @@ set_initial_properties (struct alias_inf
                      subvar_t svars;
                      svars = get_subvars_for_var (SFT_PARENT_VAR (alias));
                      for (; svars; svars = svars->next)
-                       if (!unmodifiable_var_p (alias))
+                       if (!unmodifiable_var_p (svars->var))
                          mark_call_clobbered (svars->var, pi->escape_mask);
                    }
                  else if (!unmodifiable_var_p (alias))

what we have is that SFT_PARENT_VAR is of type 'union Meep' which has
quite a lot of fields and the above loop is quadratic in the number of
fields.
Comment 3 Richard Biener 2007-09-20 12:06:32 UTC
(gdb) call bitmap_count_bits (pi->pt_vars)
$23 = 10307

that is, we have all SFTs in pi->pt_vars and as we process them we loop over
all of them again.
Comment 4 Richard Biener 2007-09-20 12:07:44 UTC
Created attachment 14230 [details]
patch fixing the problem

This fixes it.  The idea is to keep track of which parent vars we need to add
all subvars to the call clobbered list in a bitmap and process them after the
first walk.
Comment 5 Richard Biener 2007-09-20 13:29:58 UTC
4.2 doesn't have this extra loop.
Comment 6 Ramana Radhakrishnan 2007-09-20 13:52:10 UTC
(In reply to comment #4)
> Created an attachment (id=14230) [edit]
> patch fixing the problem
> 
> This fixes it.  The idea is to keep track of which parent vars we need to add
> all subvars to the call clobbered list in a bitmap and process them after the
> first walk.
> 

Yep it does - Thanks for the quick fix. I am testing it now and will let you know in a bit . 


(In reply to comment #5)
> 4.2 doesn't have this extra loop.
> 

Comment 7 Daniel Berlin 2007-09-20 15:12:22 UTC
Subject: Re:  [4.3 Regression] tree struct aliasing goes into a loop marking call clobbers.

On 20 Sep 2007 13:52:11 -0000, ramana dot radhakrishnan at celunite
dot com <gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #6 from ramana dot radhakrishnan at celunite dot com  2007-09-20 13:52 -------
> (In reply to comment #4)
> > Created an attachment (id=14230)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14230&action=view) [edit]
> > patch fixing the problem
> >
> > This fixes it.  The idea is to keep track of which parent vars we need to add
> > all subvars to the call clobbered list in a bitmap and process them after the
> > first walk.


I don't have enough context handy in the patch to see if you moved it
outside the transitive loop, or just the initial loop.
You can move it outside the initial loop, but not the transitive loop,
because the SFT's may themselves be pointers (obviously for
non-pointer SFT's, you can certainly ignore them).

You could also speed this up dramatically by not queuing non-pointers.
Comment 8 rguenther@suse.de 2007-09-20 15:20:37 UTC
Subject: Re:  [4.3 Regression] tree struct
 aliasing goes into a loop marking call clobbers.

On Thu, 20 Sep 2007, dberlin at dberlin dot org wrote:

> ------- Comment #7 from dberlin at gcc dot gnu dot org  2007-09-20 15:12 -------
> Subject: Re:  [4.3 Regression] tree struct aliasing goes into a loop marking
> call clobbers.
> 
> On 20 Sep 2007 13:52:11 -0000, ramana dot radhakrishnan at celunite
> dot com <gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #6 from ramana dot radhakrishnan at celunite dot com  2007-09-20 13:52 -------
> > (In reply to comment #4)
> > > Created an attachment (id=14230)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14230&action=view)
> >  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14230&action=view) [edit]
> > > patch fixing the problem
> > >
> > > This fixes it.  The idea is to keep track of which parent vars we need to add
> > > all subvars to the call clobbered list in a bitmap and process them after the
> > > first walk.
> 
> 
> I don't have enough context handy in the patch to see if you moved it
> outside the transitive loop, or just the initial loop.
> You can move it outside the initial loop, but not the transitive loop,
> because the SFT's may themselves be pointers (obviously for
> non-pointer SFT's, you can certainly ignore them).

I only moved it outside of the innermost EXECUTE_IF_SET_IN_BITMAP loop
on aliases, which should be safe.  I don't completely understand what
you mean with non queuing non-pointers - I suppose queuing non-pointers
in the worklist in mark_aliases_call_clobbered?

Richard.
Comment 9 Richard Biener 2007-09-21 09:37:04 UTC
Subject: Bug 33508

Author: rguenth
Date: Fri Sep 21 09:36:52 2007
New Revision: 128645

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128645
Log:
2007-09-21  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/33508
	* tree-ssa-alias.c (mark_aliases_call_clobbered): Avoid
	quadratic loop by keeping a bitmap of variables we have
	to clobber all subvariables for.
	(set_initial_properties): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-alias.c

Comment 10 Richard Biener 2007-09-21 09:38:53 UTC
Fixed.