Bug 119491 - missed tail call due to exceptions which is empty
Summary: missed tail call due to exceptions which is empty
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 15.0
Importance: P3 normal
Target Milestone: 15.0
Assignee: Jakub Jelinek
URL:
Keywords: EH, missed-optimization, tail-call
Depends on:
Blocks: 119376
Reported: 2025-03-27 03:53 UTC by Drea Pinski
Modified: 2025-07-11 18:42 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2025-03-27 00:00:00


Attachments
gcc15-pr119491.patch (2.70 KB, patch)
2025-04-01 17:51 UTC, Jakub Jelinek
gcc15-pr119491.patch (3.10 KB, patch)
2025-04-02 07:40 UTC, Jakub Jelinek
Patch to make optimization apply to sjlj targets (1.56 KB, patch)
2025-07-11 18:21 UTC, Jorn Wolfgang Rennecke

Description Drea Pinski 2025-03-27 03:53:52 UTC
Take this C++ code:
```
struct TcFieldData {
  struct DefaultInit {};
  TcFieldData() {}
};
struct TcParseTableBase;
void target_atomic(TcFieldData data);
template <typename VarintType>
static inline __attribute__((always_inline)) void ShiftMixParseVarint( int &res1) {  }
static inline __attribute__((always_inline)) void ParseVarint(int *value) {
  int res = 0;
  ShiftMixParseVarint<int>(res);
}
void f();
unsigned char fast_idx_mask;
TcFieldData bits;
inline __attribute__((always_inline)) void TagDispatch(TcFieldData) {
  if (fast_idx_mask)
    f();
  [[clang::musttail]] return target_atomic(bits);
}
void MpUnknownEnumFallback(TcFieldData data) {
  int tmp;
  ParseVarint(&tmp);
  [[clang::musttail]]
  return TagDispatch(TcFieldData{});
}
```

The musttail should not fail but currently does since we end up with:
```
  <bb 5> [count: 0]:
<L5>:
  goto <bb 8>; [100.00%]

...

  <bb 7> [count: 0]:
<L4>:

  <bb 8> [count: 0]:
<L2>:
  resx 1
```

This EH cleanup does nothing and should be optimized away. We do optimize it away, but only after tailc has run.

Note this is reduced from PR 119376.
Comment 1 Drea Pinski 2025-03-27 03:58:06 UTC
This fixes it but I am not sure about adding another cleanup eh pass:
diff --git a/gcc/passes.def b/gcc/passes.def
index 9fd85a35a63..d9fff4cf833 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -368,6 +368,7 @@ along with GCC; see the file COPYING3.  If not see
         real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
       NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
       NEXT_PASS (pass_sccopy);
+      NEXT_PASS (pass_cleanup_eh);
       NEXT_PASS (pass_tail_calls);
       /* Split critical edges before late uninit warning to reduce the
          number of false positives from it.  */
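
To illustrate why the position of the extra pass matters, here is a minimal toy model, not GCC code. `ToyFunction`, `toy_cleanup_eh`, `toy_tailc` and `run_passes` are hypothetical names invented for this sketch; the only assumption carried over from the report is that tailc gives up while the call still has an EH edge to a cleanup that merely resumes.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the state of one function in the pass pipeline.
struct ToyFunction {
  bool call_has_empty_eh_edge = true;  // leftover EH edge to a cleanup that only resumes
  bool tail_call_marked = false;       // set once the toy tailc pass accepts the call
};

// Toy version of cleanup_eh: removes the useless EH edge.
void toy_cleanup_eh(ToyFunction &f) { f.call_has_empty_eh_edge = false; }

// Toy version of tailc: only accepts the call if no EH edge is left.
void toy_tailc(ToyFunction &f) {
  if (!f.call_has_empty_eh_edge)
    f.tail_call_marked = true;
}

using ToyPass = void (*)(ToyFunction &);

// Runs the passes in order on a fresh function and reports whether the
// call ended up marked as a tail call.
bool run_passes(const std::vector<ToyPass> &passes) {
  ToyFunction f;
  for (ToyPass pass : passes)
    pass(f);
  return f.tail_call_marked;
}

// Small self-check contrasting the two orderings.
void toy_pass_order_check() {
  assert(!run_passes({toy_tailc, toy_cleanup_eh}));  // cleanup comes too late
  assert(run_passes({toy_cleanup_eh, toy_tailc}));   // cleanup first, tailc succeeds
}
```

In this model the ordering tailc then cleanup_eh corresponds to the current pass order, and cleanup_eh then tailc corresponds to the diff above.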
Comment 2 Jakub Jelinek 2025-03-27 16:12:42 UTC
I think it would be useful to do one cleanup_eh far earlier after IPA, right now we have cleanup_eh before IPA (but after tailr) and another one only a few passes after tailc almost at the end of GIMPLE passes.
Inlining can introduce cases where EH needs to be cleaned up and perhaps the lack of that hurts even other optimizations than just tailc.
So perhaps around vrp1 (i.e. after some cleanups of the post IPA IL (ccp, forwprop, fre))?

Another possibility is what you have but have a special version of the pass guarded on cfun->has_musttail (though that won't help other tail calls, just musttail).

Or yet another possibility is not to handle musttail calls during the tailc pass at all (so, pretty much set diag_musttail only in musttail; then it would be the same thing as only_musttail and could be merged into just one flag) and always do it in the musttail pass.  This won't help tail calls other than musttail either.

The second and third options are maybe slightly safer this late in stage4, but perhaps we still have time to add another cleanup_eh.

Richi, what do you think about this?
Comment 3 rguenther@suse.de 2025-03-28 07:04:19 UTC
It probably makes sense to cleanup EH either right after inlining when
we inlined a "substantial" EH tree or after the first round of scalar
cleanups post-IPA which would usually be after the DSE/DCE pair which
itself is a bit late, only after jump threading & VRP and after array
bound diags (uh?).

I'm not sure we want to shuffle passes at this point though.
Comment 4 Jakub Jelinek 2025-04-01 17:51:56 UTC
Created attachment 60952
gcc15-pr119491.patch

Ok, in that case the following patch attempts to handle it (for musttail only) in the tailc/musttail passes.
Comment 5 Jakub Jelinek 2025-04-02 07:40:03 UTC
Created attachment 60957
gcc15-pr119491.patch

The previous patch caused quite a lot of regressions; this one ought to fix them, but I haven't done a full bootstrap/regtest with it, just with an earlier version plus an incremental fix, and make check on the testcases that failed previously.
Comment 6 GCC Commits 2025-04-02 18:03:18 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:8ea537988f718f026d94b558e09479c3b5fe088a

commit r15-9154-g8ea537988f718f026d94b558e09479c3b5fe088a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Apr 2 20:02:34 2025 +0200

    tailc: Deal with trivially useless EH cleanups [PR119491]
    
    The following testcases FAIL, because EH cleanup is performed only before
    IPA and then right before musttail pass.
    At -O2 etc. (except for -O0/-Og) we handle musttail calls in the tailc
    pass though, and we can fail at that point because the calls might appear
    to throw internal exceptions which just don't do anything interesting
    (perhaps have debug statements or clobber statements in them) before they
    continue with resume of the exception (i.e. throw it externally).
    
    As Richi said in the PR (and I agree) that moving passes is risky at this
    point, the following patch instead teaches the tail{r,c} and musttail
    passes to deal with such extra EDGE_EH edges.
    
    It is fairly simple thing, if we see an EDGE_EH edge from the call we
    just look up where it lands and if there are no
    non-debug/non-clobber/non-label statements before resx which throws
    externally, such edge can be ignored for tail call optimization or
    tail recursion.  At other spots I just need to avoid using
    single_succ/single_succ_edge because the bb might have another edge -
    EDGE_EH.
    
    To make this less risky, this is done solely for the musttail calls for now.
    
    2025-04-02  Jakub Jelinek  <jakub@redhat.com>
    
            PR tree-optimization/119491
            * tree-tailcall.cc (single_non_eh_succ_edge): New function.
            (independent_of_stmt_p): Use single_non_eh_succ_edge (bb)->dest
            instead of single_succ (bb).
            (empty_eh_cleanup): New function.
            (find_tail_calls): Diagnose throwing of exceptions which do not
            propagate only if there are no EDGE_EH successor edges.  If there are
            and the call is musttail, use empty_eh_cleanup to find if the cleanup
            is not empty.  If not or the call is not musttail, use different
            diagnostics.  Set is_noreturn even if there are successor edges.  Use
            single_non_eh_succ_edge (abb) instead of single_succ_edge (abb).  Punt
            on internal noreturn calls.
            (decrease_profile): Don't assert 0 or 1 successor edges.
            (eliminate_tail_call): Use
            single_non_eh_succ_edge (gsi_bb (t->call_gsi)) instead of
            single_succ_edge (gsi_bb (t->call_gsi)).
            (tree_optimize_tail_calls_1): Also look into basic blocks with
            single succ edge which is EDGE_EH for noreturn musttail calls.
    
            * g++.dg/opt/musttail3.C: New test.
            * g++.dg/opt/musttail4.C: New test.
            * g++.dg/opt/musttail5.C: New test.
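
The check described in the commit message can be sketched on a toy CFG. This is not the code from tree-tailcall.cc: `StmtKind`, `ToyBlock`, `toy_empty_eh_cleanup` and `toy_cfg_like_dump` are hypothetical names, the walk follows a single fall-through successor only, and every `Resx` is assumed to throw externally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy statement kinds; stand-ins for illustration, not GCC's gimple codes.
enum class StmtKind { Label, Debug, Clobber, Call, Assign, Resx };

struct ToyBlock {
  std::vector<StmtKind> stmts;
  int fallthru = -1;  // index of the single non-EH successor, or -1 if none
};

// Returns true if control entering block `start` through an EH edge passes
// only labels, debug statements and clobbers before reaching a resx, i.e. the
// cleanup does nothing observable and just rethrows to the caller.
bool toy_empty_eh_cleanup(const std::vector<ToyBlock> &cfg, int start,
                          int max_blocks = 8) {
  int bb = start;
  for (int visited = 0; visited < max_blocks; ++visited) {
    if (bb < 0 || static_cast<std::size_t>(bb) >= cfg.size())
      return false;  // ran off the CFG without finding a resx
    for (StmtKind kind : cfg[bb].stmts) {
      if (kind == StmtKind::Resx)
        return true;  // reached the rethrow with nothing interesting before it
      if (kind != StmtKind::Label && kind != StmtKind::Debug &&
          kind != StmtKind::Clobber)
        return false;  // real work in the cleanup: the EH edge matters
    }
    bb = cfg[bb].fallthru;
  }
  return false;  // give up on long chains, mirroring a conservative answer
}

// Builds a CFG shaped like the dump in the description:
// index 0 ~ bb 5 (<L5>, goto bb 8), index 1 ~ bb 7 (<L4>), index 2 ~ bb 8 (<L2>, resx).
std::vector<ToyBlock> toy_cfg_like_dump() {
  std::vector<ToyBlock> cfg(3);
  cfg[0].stmts = {StmtKind::Label};
  cfg[0].fallthru = 2;
  cfg[1].stmts = {StmtKind::Label};
  cfg[1].fallthru = 2;
  cfg[2].stmts = {StmtKind::Label, StmtKind::Resx};
  return cfg;
}
```

With this model, `toy_empty_eh_cleanup(toy_cfg_like_dump(), 0)` reports an ignorable cleanup, while adding a `StmtKind::Call` to any block on the path makes it report that the EH edge has to be respected.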
Comment 7 Jakub Jelinek 2025-04-02 18:47:30 UTC
Fixed.
Comment 8 Drea Pinski 2025-04-19 17:32:49 UTC
Note that the bug report about the same missed tail call for non-musttail calls is PR 28850.
Comment 9 Jorn Wolfgang Rennecke 2025-07-11 18:16:04 UTC
The patch does not work as is for sjlj exceptions.
Comment 10 Drea Pinski 2025-07-11 18:19:11 UTC
GCC has rejected tail calls whenever sjlj exceptions are in use; please file a separate bug for this.
Comment 11 Jorn Wolfgang Rennecke 2025-07-11 18:21:42 UTC
Created attachment 61844
Patch to make optimization apply to sjlj targets

This patch allows musttail3.C and musttail5.C to be tailcall-optimized for sjlj targets.  musttail4.C still fails due to getting a different diagnostic than expected:
../../gcc/gcc/testsuite/g++.dg/opt/musttail4.C: In function 'int bar()':
../../gcc/gcc/testsuite/g++.dg/opt/musttail4.C:13:32: error: cannot tail-call: caller uses sjlj exceptions
   13 |   [[gnu::musttail]] return foo ();      // { dg-error "cannot tail-call: call may throw exception that does not propagate" }