This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Example program takes 2000 times as long to compile under C++ as C


On Thu, Aug 31, 2000 at 10:55:14AM -0700, Mark Mitchell wrote:
> >>>>> "Zack" == Zack Weinberg <zack@wolery.cumb.org> writes:
> 
>     Zack> This is not trivial to fix.  calls_setjmp_p could be
>     Zack> replaced by noticing during parsing that we're calling
>     Zack> setjmp and setting a flag (kinda like the way calls.c does
>     Zack> it at the RTL level, now).  
> 
> That's sort-of ugly.  I'd like to allow for arbitrary transformations
> for as long as possible.

Okay.

...
> What you say about TREE_TYPE is extremely valid.  I've been meaning to
> outfit walk_tree with some flags that say whether or not to walk into
> types for quite a while.  For lots of tree-walks we don't need to walk
> into types at all.

I've done something along these lines.  You get the choice of
inspecting just statements, just expressions, or everything.  

Just examining expressions is not good enough for Kelley's test case,
we still spend 90 seconds or so inside calls_setjmp_p.  I instrumented
it to report the trees it encountered and their codenames - we're
still visiting a very few trees hundreds of thousands of times:

 512784 0x40111580 save_expr
 512784 0x401113a0 cond_expr
 170928 0x40111a20 save_expr
 170928 0x40111640 compound_expr
 170928 0x40111620 convert_expr
 170928 0x40111600 cond_expr
 170928 0x401115e0 non_lvalue_expr
 170928 0x401115a0 bit_ior_expr
 ...

That's a run forced to terminate after 30 seconds.  If I let it go to
completion, the logfile is 800 megabytes and 'sort' runs out of disk
space.

A CALL_EXPR might be hiding deep inside a complex expression, so we
can't switch to examining just STMT nodes.  I see no alternative at
this point but to do something like what verify_stmt_tree does in
order to prevent visiting nodes more than once (it remembers every
node in a hash table).  But I thought I'd ask you if there were
something obviously wrong with the tree structure.  Should we be
visiting that SAVE_EXPR so very many times?  Here's some sample
debugging dump:

 <cond_expr 0x40111600
    type <integer_type 0x400fb400 int type_6 SI
        size <integer_cst 0x400f8760 constant 32>
        unit size <integer_cst 0x400f8780 constant 4>
        align 32 symtab 0 alias set -1 precision 32
        min <integer_cst 0x400f8720 constant -2147483648>
        max <integer_cst 0x400f8740 constant 2147483647>
        pointer_to_this <pointer_type 0x400fff80>>
    side-effects
    arg 0 <var_decl 0x4010b180 in1
        type <boolean_type 0x400ff780 bool unsigned type_6 QI
            size <integer_cst 0x400f84e0 constant 8>
            unit size <integer_cst 0x400fd060 constant 1>
            align 8 symtab 0 alias set -1 precision 1
            min <integer_cst 0x40100000 constant 0>
            max <integer_cst 0x40100040 constant 1>>
        unsigned used public static QI file test.ii line 2
	size <integer_cst 0x400f84e0 8> unit size <integer_cst 0x400fd060 1>
        align 8
        (mem/f:QI (symbol_ref:SI ("in1")) 0)
        chain <var_decl 0x4010b100 in0 type <boolean_type 0x400ff780 bool>
            unsigned used public static QI file test.ii line 1
	    size <integer_cst 0x400f84e0 8> unit size <integer_cst 0x400fd060 1>
            align 8
            (mem/f:QI (symbol_ref:SI ("in0")) 0)
	    chain <var_decl 0x4010b080 __PRETTY_FUNCTION__>>>
    arg 1 <bit_ior_expr 0x401115a0 type <integer_type 0x400fb400 int>
        side-effects
        arg 0 <save_expr 0x40111580 type <integer_type 0x400fb400 int>
            side-effects unsigned
            arg 0 <cond_expr 0x401113a0 type <integer_type 0x400fb400 int>
                arg 0 <var_decl 0x4010b100 in0>
                arg 1 <integer_cst 0x40111380 constant 1>
                arg 2 <integer_cst 0x400fd220 constant 0>>
	    arg 1 <function_decl 0x4010ba80 mux>
            rtl 2 (nil)>
        arg 1 <integer_cst 0x401113e0 constant 2>>
    arg 2 <non_lvalue_expr 0x401115e0 type <integer_type 0x400fb400 int>
        side-effects arg 0 <save_expr 0x40111580>>>

Note that the source code does _not_ contain heavily nested logic.  It
looks like this:

void mux(void)
{
  output =
      (in0   ?  0x0001 : 0) |
      (in1   ?  0x0002 : 0) |
      (in2   ?  0x0004 : 0) |
      (in3   ?  0x0008 : 0) |
      (in4   ?  0x0010 : 0) |
      (in5   ?  0x0020 : 0) |
      (in6   ?  0x0040 : 0) |
      (in7   ?  0x0080 : 0) |
      (in8   ?  0x0100 : 0) |
      (in9   ?  0x0200 : 0) |
      (in10  ?  0x0400 : 0) |
      (in11  ?  0x0800 : 0) |
      (in12  ?  0x1000 : 0) |
      (in13  ?  0x2000 : 0) |
      (in14  ?  0x4000 : 0) |
      (in15  ?  0x8000 : 0) ;
}

where output is an int and in0..in15 are booleans (all are globals).

zw


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]