This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: PR c++/5504: Optimization breaks wei-ku-1 from blitz


On Fri, Apr 05, 2002 at 11:34:40PM +0100, Jason Merrill wrote:
> Unfortunately, the backend doesn't seem to be prepared to handle this
> number of EH regions; many of the data structures can only be searched
> linearly.  For the four-term case, gprof shows that we're spending the vast
> majority of our time traversing them:
> 
>   %   cumulative   self              self     total           
>  time   seconds   seconds    calls  ms/call  ms/call  name    
>  37.35    211.09   211.09   110832     1.90     2.32  maybe_remove_eh_handler
>  26.59    361.36   150.27   344409     0.44     0.44  in_expr_list_p
>  10.00    417.89    56.53   305399     0.19     0.19  next_nonnote_insn
>   6.59    455.11    37.22    28896     1.29     1.29  remove_exception_handler_label
[...]
> I'm not sure what to do about this.  We need to support this sort of code;
> this is a very common C++ programming idiom.  Obvious ways to improve
> performance would be:
>
> [1] Adjust the EH data structures so that we can do more efficient
>     searches in them.
> [2] Try to avoid creating EH regions that will just be deleted again.
>     If nothing in the region can throw, we can discard it at
>     expand_eh_region_end time.

It turns out that [2] is annoyingly difficult.  FIXUP regions created
by expand_cleanups want to insert code into the *parent* of a CLEANUP
region, after that region has already been finalized by
expand_eh_region_end.  I.e. if we simply fail to create the cleanup
region based on the fact that it'll never be used, we'll not have all
the data structures needed to resolve the fixup region.  :-(

Doing [1] is within our power.  I found a whole series of quadratic
(or worse) operations here -- not all of them in the EH code.
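
To illustrate the kind of change [1] implies, here is a minimal sketch
in C.  It is not the patch and not GCC's actual data structures: the
names label_node, label_set and their helpers are hypothetical.  It
assumes each label has a small integer id, so a membership test that
walks a list (the pattern behind a function like in_expr_list_p) can
be replaced by a constant-time bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a linked list of labels. */
struct label_node {
    int label_id;
    struct label_node *next;
};

/* Linear search: O(n) per query, so n queries cost O(n^2) overall. */
static bool list_contains(const struct label_node *head, int label_id)
{
    for (; head != NULL; head = head->next)
        if (head->label_id == label_id)
            return true;
    return false;
}

/* Bit set indexed by label id: O(1) per query. */
struct label_set {
    unsigned char *bits;
    size_t capacity;            /* number of ids the set can hold */
};

/* Allocate a zeroed set; returns false if allocation fails. */
static bool label_set_init(struct label_set *s, size_t capacity)
{
    s->bits = calloc((capacity + 7) / 8, 1);
    s->capacity = (s->bits != NULL) ? capacity : 0;
    return s->bits != NULL;
}

/* Mark an id as present; the id must be within capacity. */
static void label_set_add(struct label_set *s, size_t id)
{
    assert(id < s->capacity);
    s->bits[id / 8] |= (unsigned char)(1u << (id % 8));
}

/* Constant-time membership test; out-of-range ids are never present. */
static bool label_set_contains(const struct label_set *s, size_t id)
{
    return id < s->capacity && ((s->bits[id / 8] >> (id % 8)) & 1u) != 0;
}

static void label_set_free(struct label_set *s)
{
    free(s->bits);
    s->bits = NULL;
    s->capacity = 0;
}
```

The trade-off is memory proportional to the largest id rather than to
the number of entries; a hash table would be the usual alternative
when ids are sparse.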

I'm currently testing a patch that brings the original test case to:

  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  5.14      7.34     7.34  2662100     0.00     0.00  walk_tree
  4.57     13.87     6.53      701     0.01     0.01  fixup_var_refs_insns
  4.30     20.01     6.14 34516896     0.00     0.00  ggc_alloc
  3.20     24.58     4.57 62692969     0.00     0.00  statement_code_p
  2.96     28.81     4.23  7488030     0.00     0.00  fixup_var_refs_1
  2.70     32.66     3.85 77455616     0.00     0.00  add_insn

 expand                : 207.49 (59%) usr   8.39 (68%) sys 216.25 (59%) wall
 parser                :  38.15 (11%) usr   0.77 ( 6%) sys  38.52 (11%) wall
 garbage collection    :  29.99 ( 9%) usr   1.31 (11%) sys  31.30 ( 9%) wall
 cfg construction      :  18.96 ( 5%) usr   0.46 ( 4%) sys  19.56 ( 5%) wall
 cfg cleanup           :  16.27 ( 5%) usr   0.04 ( 0%) sys  16.23 ( 4%) wall
 TOTAL                 : 351.17            12.35           363.83

I.e. we're down to just under 6 minutes of user time (351 s) instead
of 7 hours of compile time.

Unfortunately, it still consumes an unacceptable amount of memory,
somewhere near 1.2GB at peak.  I think I can improve this by eliding
some of the *code* associated with cleanup regions, even though
I cannot elide the cleanup region data structure itself.

I'll test the patch I have overnight and post if it succeeds.


r~

