This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Example program takes 2000 times as long to compile under C++ as C


On Wed, Aug 30, 2000 at 05:56:05PM -0400, Kelley Cook wrote:
> As originally brought up in message
> http://gcc.gnu.org/ml/gcc-bugs/2000-06/msg00385.html, Mr. Robert Melcher
> <robert.melchers@drives.eurotherm.co.uk> discovered that the following
> program compiles nearly instantly in C, but takes 5 minutes with C++.
> 
> Mike Stump suggested to profile the compiler, unfortunately this appeared
> hard to do, so I didn't bother.

It happens that I have a profiled cc1plus binary right here. When I
run it on your test case, I get:

Each sample counts as 0.01 seconds.
  %       cum     self              self     total           
 time    secs     secs      calls  ms/call  ms/call  name    
 67.82  58.49    58.49         50  1169.80  1723.80  walk_tree
 10.99  67.97     9.48  193710255     0.00     0.00  simplify_aggr_init_exprs_r
  9.08  75.80     7.83  258280343     0.00     0.00  calls_setjmp_r
  4.80  79.94     4.14  286978156     0.00     0.00  statement_code_p
  4.29  83.64     3.70  143489643     0.00     0.00  first_rtl_op
  2.40  85.71     2.07   21523360     0.00     0.00  special_function_p
  0.56  86.19     0.48   21523360     0.00     0.00  setjmp_call_p

Yow!  The immediate question is who called walk_tree, and the answer
is we don't know for sure, because it's being called in a cycle.

[4] 99.9   58.49   27.70      34+83      <cycle 2 as a whole> [4]
           58.49   27.70      50+495037350 walk_tree <cycle 2> [5]
            0.00    0.00      30          for_each_template_parm <cycle 2> [708]
-----------------------------------------------
                         495037350      walk_tree <cycle 2> [5]
                              30        for_each_template_parm <cycle 2> [708]
            1.72    0.81       1/34     expand_body [22]
            1.72    0.81       1/34     calls_setjmp_p [32]
           30.97   14.66      18/34     no_linkage_check [9]
[5] 99.9   58.49   27.70      50+495037350 walk_tree <cycle 2> [5]
            7.83    2.55 258280343/258280343 calls_setjmp_r [20]
            9.48    0.00 193710255/193710255 simplify_aggr_init_exprs_r [21]
            4.14    0.00 286978156/286978156 statement_code_p [25]
            3.70    0.00 143489078/143489643 first_rtl_op [26]
            0.00    0.00      23/23     no_linkage_helper [722]
                              37        for_each_template_parm_r <cycle 2> [666]
                         495037350             walk_tree <cycle 2> [5]

So I ran it under the debugger until it got stuck, hit ^C, and the
lucky winner is... calls_setjmp_p!  But it's only called once, I hear
you cry?  Yah - but it's called to walk over the saved expression tree
for the entire function, and that expression tree... oy, what a mess.

<modify_expr type <integer_type unsigned int>
    side-effects
    arg 0 <var_decl output>
    arg 1 <nop_expr type <integer_type unsigned int>
        side-effects
        arg 0 <save_expr type <integer_type int>
            side-effects
            arg 0 <compound_expr type <integer_type int>
                side-effects
                arg 0 <convert_expr type <void_type void>
                    side-effects
                    arg 0 <save_expr type <integer_type int>
                        side-effects unsigned
                            arg 0 <compound_expr>
                            arg 1 <function_decl mux>
                        rtl 2 (nil)>>
                arg 1 <cond_expr type <integer_type int>
                    side-effects arg 0 <var_decl in15>
                    arg 1 <bit_ior_expr type <integer_type int>
                        side-effects 
                            arg 0 <save_expr> 
                            arg 1 <integer_cst 32768>>
                    arg 2 <non_lvalue_expr type <integer_type int>
                        side-effects 
                            arg 0 <save_expr>>>> 
            arg 1 <function_decl mux>
            rtl 2 (nil)>>>

That's an overview.  It goes on, and on, and on, at least 60 levels
deep.  Look back up at the summary: calls_setjmp_r was called
258,280,343 times.  That means the tree walker visited 258 million
tree nodes.  (The calls to simplify_aggr

How can that possibly be when GC claims there's only a few thousand
tree nodes in existence?  Because there's heavy repetition going on.
Every node in that tree has a type, each time we hit a node we visit
its type.  And types are fairly complex beasts.  Likewise, we
encounter the VAR_DECLs and FUNCTION_DECLs over and over again.  I
suspect there's some shared structure hiding in the SAVE_EXPR nodes,
too.

I instructed gdb to print out the address of every node visited by
calls_setjmp_r, let it grind for awhile, and ran the output through
"sort | uniq -c".  There are a small number (ten or so) of nodes that
are visited more than 20,000 times.  There are only about 200 nodes
examined overall.

This is not trivial to fix.  calls_setjmp_p could be replaced by
noticing during parsing that we're calling setjmp and setting a flag
(kinda like the way calls.c does it at the RTL level, now).  But
expand_body does much the same thing with simplify_aggr_init_exprs_r.
And in the current tree (this compiler is a wee out of date)
finish_stmt_tree does it with prune_unused_decls...

I think what we want is two variants of walk_tree.  One for things
like no_linkage_check, that really do need to visit each node; one for
things like calls_setjmp_p and the new prune_unused_decls, that only
visits statement nodes.  I'm not sure what to do with
simplify_aggr_init_exprs_r.

Or I s'pose we could find a way to notice when we'd already visited a
node and not examine it again.  There *are* two unused flag bits in
struct tree_common still...

Comments from the whole-function-mode crowd?

zw

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]