Bug 93199 - [9 Regression] Compile time hog in sink_clobbers
Summary: [9 Regression] Compile time hog in sink_clobbers
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 10.0
Importance: P2 normal
Target Milestone: 10.0
Assignee: Richard Biener
URL:
Keywords: compile-time-hog, patch
Depends on: 105838 93273
Blocks:
 
Reported: 2020-01-08 09:28 UTC by Jakub Jelinek
Modified: 2022-06-14 13:10 UTC
CC: 1 user

See Also:
Host:
Target:
Build:
Known to work: 10.0
Known to fail:
Last reconfirmed: 2020-01-08 00:00:00


Attachments
cleanup patch (1.79 KB, patch)
2020-01-08 09:50 UTC, Richard Biener
patch fixing the quadraticness (1.32 KB, patch)
2020-01-09 14:35 UTC, Richard Biener
Patch candidate for #c15 (778 bytes, patch)
2020-01-10 14:06 UTC, Martin Liška

Description Jakub Jelinek 2020-01-08 09:28:57 UTC
struct S {
  S ();
  S (int i);
  int s;
  operator bool () { return s != 0; }
};

int bar ();

int
foo (bool x)
{
  S a;
  try {
    x ?
#define A(n) (a = S (0)),
#define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) \
	     A(n##5) A(n##6) A(n##7) A(n##8) A(n##9)
#define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) \
	     B(n##5) B(n##6) B(n##7) B(n##8) B(n##9)
#define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) \
	     C(n##5) C(n##6) C(n##7) C(n##8) C(n##9)
#define E(n) D(n##0) D(n##1) D(n##2) D(n##3) D(n##4) \
	     D(n##5) D(n##6) D(n##7) D(n##8) D(n##9)
#define F(n) E(n##0) E(n##1) E(n##2) E(n##3) E(n##4) \
	     E(n##5) E(n##6) E(n##7) E(n##8) E(n##9)
    E(1) E(2) E(3)
	0 : 1;
  } catch (int) {
    return 1;
  }
  return 0;
}

hits quadratic behavior in sink_clobbers at -O0 (E(1) E(2) E(3) expands to 30,000 copies of "(a = S (0)),", so the function ends up with tens of thousands of EH cleanups and clobbers for the temporaries).  g++ 4.4 compiled this almost instantly, as did 4.6; 4.7/4.8/4.9/5 already eat a lot of RAM during the into-ssa pass, while 6+ just hog compile time (but not memory) in sink_clobbers.
Comment 1 Richard Biener 2020-01-08 09:40:10 UTC
Confirmed.  I have a cleanup patch and an idea for fixing the quadraticness as well.
Comment 2 Richard Biener 2020-01-08 09:50:35 UTC
Created attachment 47608 [details]
cleanup patch

Testing this first, reliably catching secondary opportunities and micro-optimizing virtual operand update.
Comment 3 Jakub Jelinek 2020-01-08 10:07:20 UTC
The 4.7 behavior started with r181332.  Then in r182283 the quadratic sink_clobbers behavior was added.  And finally r246314 got rid of the excessive memory use and compile time during the into-ssa pass, leaving only the hang in sink_clobbers.
Comment 4 Jakub Jelinek 2020-01-08 10:08:46 UTC
BTW, if we want to put the testcase into the testsuite, maybe we need to tune the exact resx/CLOBBER count so that even after the fix it doesn't take way too long, but on the other hand an unfixed compiler still takes long enough to FAIL at least on slower machines.
Comment 5 Jakub Jelinek 2020-01-08 10:14:22 UTC
As for the patch, I wonder whether an internal resx can also occur without any clobbers to move, or in places where sink_clobbers would give up.  So perhaps add a dry_run bool to sink_clobbers: in the first loop, if we haven't yet determined that we need the second one, run sink_clobbers with dry_run=true, which would perform the first half of the function and, only if it found clobbers in a bb with EH preds and a single successor, tell the caller it should set the bool that triggers the second loop.
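
A minimal self-contained sketch of the proposed dry-run shape (toy stand-in types and names, not the real tree-eh.c interfaces; it only models "trailing clobbers in a block with EH preds and a single successor"):

#include <vector>

// Toy stand-ins for GIMPLE blocks and statements (illustrative only).
struct stmt  { bool is_clobber; };
struct block { bool has_eh_preds; bool single_succ; std::vector<stmt> stmts; };

// Proposed shape: with dry_run=true run only the analysis half and report
// whether there is anything to sink; with dry_run=false actually sink.
static bool
sink_clobbers_sketch (block &bb, bool dry_run)
{
  if (!bb.has_eh_preds || !bb.single_succ)
    return false;
  bool found = !bb.stmts.empty () && bb.stmts.back ().is_clobber;
  if (!found || dry_run)
    return found;
  // Stand-in for moving the trailing clobbers to the single successor.
  while (!bb.stmts.empty () && bb.stmts.back ().is_clobber)
    bb.stmts.pop_back ();
  return true;
}

static void
lower_eh_dispatch_sketch (std::vector<block> &bbs)
{
  // First loop: a cheap dry run decides whether the second loop is needed.
  bool need_second_loop = false;
  for (block &bb : bbs)
    if (sink_clobbers_sketch (bb, /*dry_run=*/true))
      {
        need_second_loop = true;
        break;
      }
  // Second loop: run the mutating pass only when the dry run found work.
  if (need_second_loop)
    for (block &bb : bbs)
      sink_clobbers_sketch (bb, /*dry_run=*/false);
}
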
Comment 6 rguenther@suse.de 2020-01-08 11:20:37 UTC
Hmm, not sure if it's worth that, but yeah, could do that easily I guess.
Consider it done.
Comment 7 Richard Biener 2020-01-08 12:49:49 UTC
Author: rguenth
Date: Wed Jan  8 12:49:14 2020
New Revision: 280000

URL: https://gcc.gnu.org/viewcvs?rev=280000&root=gcc&view=rev
Log:
2020-01-08  Richard Biener  <rguenther@suse.de>

	PR middle-end/93199
	c/
	* gimple-parser.c (c_parser_parse_gimple_body): Remove __PHI IFN
	permanently.

	* gimple-fold.c (rewrite_to_defined_overflow): Mark stmt modified.
	* tree-ssa-loop-im.c (move_computations_worker): Properly adjust
	virtual operand, also updating SSA use.
	* gimple-loop-interchange.cc (loop_cand::undo_simple_reduction):
	Update stmt after resetting virtual operand.
	(tree_loop_interchange::move_code_to_inner_loop): Likewise.

	* gimple-iterator.c (gsi_remove): When not removing the stmt
	permanently do not delink immediate uses or mark the stmt modified.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/c/ChangeLog
    trunk/gcc/c/gimple-parser.c
    trunk/gcc/gimple-fold.c
    trunk/gcc/gimple-iterator.c
    trunk/gcc/gimple-loop-interchange.cc
    trunk/gcc/tree-ssa-loop-im.c
Comment 8 Richard Biener 2020-01-08 14:31:16 UTC
Author: rguenth
Date: Wed Jan  8 14:30:44 2020
New Revision: 280006

URL: https://gcc.gnu.org/viewcvs?rev=280006&root=gcc&view=rev
Log:
2020-01-08  Richard Biener  <rguenther@suse.de>

	PR middle-end/93199
	* tree-eh.c (sink_clobbers): Update virtual operands for
	the first and last stmt only.  Add a dry-run capability.
	(pass_lower_eh_dispatch::execute): Perform clobber sinking
	after CFG manipulations and in RPO order to catch all
	secondary opportunities reliably.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-eh.c
Comment 9 Richard Biener 2020-01-09 14:35:53 UTC
Created attachment 47619 [details]
patch fixing the quadraticness

Like this for the quadraticness.  Still runs into other slowness.
pass_lower_eh_dispatch::execute takes less than 10 seconds now.
Comment 10 Richard Biener 2020-01-09 14:53:48 UTC
Still

 tree eh                            : 509.70 ( 97%)   1.58 ( 69%) 511.32 ( 97%) 9776324 kB ( 98%)

Bah.  Something else ruins things.  Will figure it out tomorrow.
Comment 11 Richard Biener 2020-01-10 08:25:42 UTC
-   77.83%     0.45%         16118  cc1plus  cc1plus           [.] (anonymous namespace)::pass_cleanup_eh::execute
   - 77.38% (anonymous namespace)::pass_cleanup_eh::execute 
      - 77.29% cleanup_empty_eh_merge_phis              
         - 44.55% redirect_eh_edge_1
              30.45% last_stmt                                                                                 
            + 4.01% lookup_stmt_eh_lp_fn                                                                                  
            + 2.96% remove_stmt_from_eh_lp_fn                                                            
              2.77% gimple_block_label                                                                                 
              0.55% get_eh_landing_pad_from_number                     
         + 16.68% add_stmt_to_eh_lp_fn                      
           5.34% find_edge                                     
         + 4.58% redirect_edge_succ                          
           0.59% gimple_execute_on_growing_pred

That last_stmt figure looks odd though (I blame perf for this).  This is on
the original Red Hat bugzilla testcase, btw; will check your reduced one.
Comment 12 Richard Biener 2020-01-10 09:12:18 UTC
(In reply to Richard Biener from comment #11)

Apart from structural quadraticness involving edges, the main hog seems
to be the quadratic updating of the RESX stmt as it is moved here:

static void
redirect_eh_edge_1 (edge edge_in, basic_block new_bb, bool change_region)
{ 
...
  /* Maybe move the throwing statement to the new region.  */
  if (old_lp != new_lp)
    {    
      remove_stmt_from_eh_lp (throw_stmt);
      add_stmt_to_eh_lp (throw_stmt, new_lp->index);
    }   

which boils down to the very same issue.  We're also getting a very big
in-degree for the EH redirection probably because we're walking the EH
LP array when optimizing empty EH.  Thus code like

  /* Notice when we redirect the last EH edge away from OLD_BB.  */
  FOR_EACH_EDGE (e, ei, old_bb->preds)
    if (e != edge_in && (e->flags & EDGE_EH))
      break;

ends up expensive as well (we will move all EH edges anyway so the
above at least could be avoided with some care).  Not to mention

  FOR_EACH_EDGE (e, ei, old_bb->preds)
    if (find_edge (e->src, new_bb))
      return false;

which we can short-cut when new_bb has a single predecessor.

So I have some micro-optimizing changes here (only).  But I wonder
whether walking the landing pads in a better order in cleanup_all_empty_eh
would fix things.  Simply walking the array in reverse already helps a tremendous
amount!

 tree eh                            :   4.75 ( 35%)   0.01 (  3%)   4.75 ( 34%)   16911 kB (  9%)

vs.

 tree eh                            : 182.21 ( 95%)   0.84 ( 65%) 183.07 ( 95%) 4653260 kB ( 97%)

on your testcase and 

 tree eh                            :  29.56 ( 30%)   0.05 (  1%)  29.60 ( 28%)  246315 kB (  8%)

vs.

 tree eh                            : 626.00 ( 89%)   5.75 ( 45%) 631.88 ( 88%)38736930 kB ( 93%)

on the redhat bugzilla one.
Comment 13 Richard Biener 2020-01-10 09:28:56 UTC
The recursive lower_eh_constructs & collect_finally_tree are also prone to eventually blowing the stack with these kinds of testcases.
Comment 14 Jakub Jelinek 2020-01-10 09:37:10 UTC
There is always the -fstack-reuse=named_vars workaround, or one can bump ulimit -s.
Comment 15 Richard Biener 2020-01-10 10:42:09 UTC
So lower_eh_constructs is what remains of the EH time, and there it's just
cleanup_is_dead_in that ends up costly:

  while (reg && reg->type == ERT_CLEANUP)
    reg = reg->outer;
  return (reg && reg->type == ERT_MUST_NOT_THROW);

looks like we could easily track that in the leh_state (cache the innermost non-cleanup region).  I won't pursue this, but a quick check making the above
simply return false shows

 tree eh                            :   1.44 (  2%)   0.05 (  1%)   1.48 (  2%)  246315 kB (  8%)

on the RH bugzilla testcase.

The reduced testcase can now be conveniently analyzed using callgrind (even
with -O0 cc1plus).  The RH one is still bigger.
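
For illustration, a self-contained sketch of the caching idea (track the innermost non-cleanup region in the lowering state instead of walking reg->outer on every call).  The types and names below are simplified stand-ins, not the actual tree-eh.c code; the real change along these lines landed later (see comment 25).

// Toy model of the EH region tree and lowering state (illustrative only).
enum region_type { CLEANUP, TRY, MUST_NOT_THROW };   // stand-ins for ERT_*

struct region
{
  region *outer;
  region_type type;
};

// Current behavior: every query walks the outer chain past all cleanup
// regions, which is linear in the nesting depth and thus quadratic over a
// function with thousands of nested cleanups.
static bool
cleanup_is_dead_in_walk (region *reg)
{
  while (reg && reg->type == CLEANUP)
    reg = reg->outer;
  return reg && reg->type == MUST_NOT_THROW;
}

// Cached variant: the lowering state carries the innermost enclosing
// non-cleanup region, updated whenever a non-cleanup region is entered,
// so the query becomes O(1).
struct leh_state_sketch
{
  region *cur_region;
  region *outer_non_cleanup;
};

static leh_state_sketch
enter_region (leh_state_sketch state, region *r)
{
  state.cur_region = r;           // the state is copied, as in lower_catch
  if (r->type != CLEANUP)
    state.outer_non_cleanup = r;  // cleanup regions inherit the cached value
  return state;
}

static bool
cleanup_is_dead_in_cached (const leh_state_sketch &state)
{
  region *r = state.outer_non_cleanup;
  return r && r->type == MUST_NOT_THROW;
}
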
Comment 16 Richard Biener 2020-01-10 10:50:29 UTC
Author: rguenth
Date: Fri Jan 10 10:49:57 2020
New Revision: 280101

URL: https://gcc.gnu.org/viewcvs?rev=280101&root=gcc&view=rev
Log:
2020-01-10  Richard Biener  <rguenther@suse.de>

	PR middle-end/93199
	* tree-eh.c (redirect_eh_edge_1): Avoid some work if possible.
	(cleanup_all_empty_eh): Walk landing pads in reverse order to
	avoid quadraticness.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-eh.c
Comment 17 Richard Biener 2020-01-10 11:24:41 UTC
Author: rguenth
Date: Fri Jan 10 11:23:53 2020
New Revision: 280102

URL: https://gcc.gnu.org/viewcvs?rev=280102&root=gcc&view=rev
Log:
2020-01-10  Richard Biener  <rguenther@suse.de>

	PR middle-end/93199
	* tree-eh.c (sink_clobbers): Move clobbers to out-of-IL
	sequences to avoid walking them again for secondary opportunities.
	(pass_lower_eh_dispatch::execute): Instead actually insert
	them here.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-eh.c
Comment 18 Richard Biener 2020-01-10 12:13:03 UTC
At -O2 I see, with just E(1),

 expand vars                        :  61.55 ( 23%)   0.01 (  3%)  61.56 ( 23%)    1267 kB (  1%)
 store merging                      : 185.44 ( 69%)   0.00 (  0%) 185.44 ( 69%)     625 kB (  1%)

the time is spent in terminate_all_aliasing_chains, where it seems the
m_stores_head chain is quite long.  With D(1) D(2) only we have 4000 calls
to this function, but then the inner loop iterates 8 million times.  Guess
we miss some limiting there; a testcase might be just a large BB with
many (non-aliasing) stores.

Micro-optimizing the function is also possible (testing patch for that).
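
As a purely hypothetical illustration of the stress-testcase shape suggested here (whether it actually reproduces the store-merging slowdown would need checking): one straight-line function whose single basic block contains thousands of independent stores, so each new store gets compared against the whole chain of already recorded ones.

/* Hypothetical stress testcase: 3000 non-aliasing stores in one BB.  */
#define S1(i)    a[(i)] = (i);
#define S10(i)   S1(i##0) S1(i##1) S1(i##2) S1(i##3) S1(i##4) \
                 S1(i##5) S1(i##6) S1(i##7) S1(i##8) S1(i##9)
#define S100(i)  S10(i##0) S10(i##1) S10(i##2) S10(i##3) S10(i##4) \
                 S10(i##5) S10(i##6) S10(i##7) S10(i##8) S10(i##9)
#define S1000(i) S100(i##0) S100(i##1) S100(i##2) S100(i##3) S100(i##4) \
                 S100(i##5) S100(i##6) S100(i##7) S100(i##8) S100(i##9)

int a[4000];

void
many_stores (void)
{
  S1000(1) S1000(2) S1000(3)
}
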
Comment 19 Martin Liška 2020-01-10 14:06:43 UTC
Created attachment 47631 [details]
Patch candidate for #c15

I've got a patch for #c15.
@Richi: Is it something you expected?
I see the following speed-up:
w/ verification: 9.9s -> 8.7s
w/o verification: 4.8s -> 3.7s

Now perf report looks like:
     5.41%  cc1plus  cc1plus                      [.] hash_table<hash_map<gimple*, int, simple_hashmap_traits<default_hash_traits<gimple*>, int> >::hash_entry, false, xcallocator>::find_with_hash
     3.79%  cc1plus  cc1plus                      [.] mark_used_flags
     3.33%  cc1plus  cc1plus                      [.] (anonymous namespace)::dom_info::calc_idoms
     3.06%  cc1plus  cc1plus                      [.] (anonymous namespace)::dom_info::calc_dfs_tree_nonrec
     2.68%  cc1plus  cc1plus                      [.] rtl_verify_flow_info_1
     2.52%  cc1plus  cc1plus                      [.] verify_ssa
     2.08%  cc1plus  cc1plus                      [.] rtl_verify_flow_info
Comment 20 Richard Biener 2020-01-13 11:22:18 UTC
(In reply to Martin Liška from comment #19)
> Created attachment 47631 [details]
> Patch candidate for #c15
> 
> I've got a patch for #c15.
> @Richi: Is it something you expected?

Well - there's the leh_state passed to both callers of the function
so I expected a patch to amend that rather than adding an on-the-side
caching hash-map.  So basically whenever we push a non-CLEANUP
update leh_state->xyz and when backtracking update it back (the whole
process looked recursive from a quick look).

Comment 21 Martin Liška 2020-01-13 11:35:07 UTC
> Well - there's the leh_state passed to both callers of the function
> so I expected a patch to amend that rather than adding an on-the-side
> caching hash-map.  So basically whenever we push a non-CLEANUP
> update leh_state->xyz and when backtracking update it back (the whole
> process looked recursive from a quick look).

Yes, it's recursive, but the leh_state instances are different:

#0  cleanup_is_dead_in (reg=0x7ffff5e94478) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1640
#1  0x00000000010c0ca6 in lower_try_finally (state=0x7fffffffc060, tp=0x7ffff2f0f4d0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1676
#2  0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc060, gsi=0x7fffffffc020) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#3  0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc060, pseq=0x7ffff2f0f4c0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#4  0x00000000010c0d53 in lower_try_finally (state=0x7fffffffc220, tp=0x7ffff2f0f498) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#5  0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc220, gsi=0x7fffffffc1e0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#6  0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc220, pseq=0x7ffff2f0f488) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#7  0x00000000010c0d53 in lower_try_finally (state=0x7fffffffc3e0, tp=0x7ffff2f0f460) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#8  0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc3e0, gsi=0x7fffffffc3a0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#9  0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc3e0, pseq=0x7ffff2f0f450) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#10 0x00000000010c0d53 in lower_try_finally (state=0x7fffffffc5a0, tp=0x7ffff2f0f428) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#11 0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc5a0, gsi=0x7fffffffc560) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#12 0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc5a0, pseq=0x7ffff2f0f418) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#13 0x00000000010c0d53 in lower_try_finally (state=0x7fffffffc760, tp=0x7ffff2f0f3f0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#14 0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc760, gsi=0x7fffffffc720) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#15 0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc760, pseq=0x7ffff2f0f3e0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#16 0x00000000010c0d53 in lower_try_finally (state=0x7fffffffc920, tp=0x7ffff2f0f3b8) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#17 0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffc920, gsi=0x7fffffffc8e0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099
#18 0x00000000010c1e8a in lower_eh_constructs_1 (state=0x7fffffffc920, pseq=0x7ffff2f0f3a8) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2158
#19 0x00000000010c0d53 in lower_try_finally (state=0x7fffffffcae0, tp=0x7ffff2f0f380) at /home/marxin/Programming/gcc/gcc/tree-eh.c:1693
#20 0x00000000010c1cdf in lower_eh_constructs_2 (state=0x7fffffffcae0, gsi=0x7fffffffcaa0) at /home/marxin/Programming/gcc/gcc/tree-eh.c:2099

where a new 'state' is always created here:

  1769  static gimple_seq
  1770  lower_catch (struct leh_state *state, gtry *tp)
  1771  {
  1772    eh_region try_region = NULL;
  1773    struct leh_state this_state = *state;
...
Comment 22 rguenther@suse.de 2020-01-13 12:34:14 UTC
But it's copied.  There's a reason why I didn't tackle it
(because of this intertwined stuff).  But I don't like the
simple cache-map.
Comment 23 Martin Liška 2020-01-15 09:57:57 UTC
Sure, I've done that in:
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00862.html
Comment 24 Richard Biener 2020-01-20 09:01:45 UTC
Fixed for GCC 10.  Note the testcase(s) expose other slownesses that should be categorized
and filed separately (mainly -O1 [-g] is interesting here; for -O2+ we don't
provide any guarantees with this kind of testcase).  But I don't want to
make this bug more complicated, since the sink_clobbers stuff might be backportable.
Comment 25 CVS Commits 2020-01-20 10:11:30 UTC
The master branch has been updated by Martin Liska <marxin@gcc.gnu.org>:

https://gcc.gnu.org/g:92ce93c743b3c81f6911bc3d06056099369e9191

commit r10-6084-g92ce93c743b3c81f6911bc3d06056099369e9191
Author: Martin Liska <mliska@suse.cz>
Date:   Mon Jan 20 11:10:30 2020 +0100

    Record outer non-cleanup region in TREE EH.
    
    	PR tree-optimization/93199
    	* tree-eh.c (struct leh_state): Add
    	new field outer_non_cleanup.
    	(cleanup_is_dead_in): Pass leh_state instead
    	of eh_region.  Add a checking that state->outer_non_cleanup
    	points to outer non-clean up region.
    	(lower_try_finally): Record outer_non_cleanup
    	for this_state.
    	(lower_catch): Likewise.
    	(lower_eh_filter): Likewise.
    	(lower_eh_must_not_throw): Likewise.
    	(lower_cleanup): Likewise.
Comment 26 Jakub Jelinek 2020-03-04 09:41:23 UTC
GCC 8.4.0 has been released, adjusting target milestone.
Comment 27 Jakub Jelinek 2021-05-14 09:52:33 UTC
GCC 8 branch is being closed.
Comment 28 Richard Biener 2021-06-01 08:15:51 UTC
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
Comment 29 Richard Biener 2022-05-27 08:41:06 UTC
Fixed in GCC 10.