Take this C++ code:
```
struct TcFieldData {
  struct DefaultInit {};
  TcFieldData() {}
};
struct TcParseTableBase;
void target_atomic(TcFieldData data);
template <typename VarintType>
static inline __attribute__((always_inline)) void ShiftMixParseVarint(
    int &res1) {
}
static inline __attribute__((always_inline)) void ParseVarint(int *value) {
  int res = 0;
  ShiftMixParseVarint<int>(res);
}
void f();
unsigned char fast_idx_mask;
TcFieldData bits;
inline __attribute__((always_inline)) void TagDispatch(TcFieldData) {
  if (fast_idx_mask)
    f();
  [[clang::musttail]] return target_atomic(bits);
}
void MpUnknownEnumFallback(TcFieldData data) {
  int tmp;
  ParseVarint(&tmp);
  [[clang::musttail]] return TagDispatch(TcFieldData{});
}
```

The musttail should not fail but currently does since we end up with:
```
  <bb 5> [count: 0]:
<L5>:
  goto <bb 8>; [100.00%]
...
  <bb 7> [count: 0]:
<L4>:

  <bb 8> [count: 0]:
<L2>:
  resx 1
```

Which should be optimized away. We do optimize it away, but only after tailc is run.

Note this is reduced from PR 119376.
This fixes it, but I am not sure about adding another cleanup_eh pass:

diff --git a/gcc/passes.def b/gcc/passes.def
index 9fd85a35a63..d9fff4cf833 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -368,6 +368,7 @@ along with GCC; see the file COPYING3.  If not see
         real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
       NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
       NEXT_PASS (pass_sccopy);
+      NEXT_PASS (pass_cleanup_eh);
       NEXT_PASS (pass_tail_calls);
       /* Split critical edges before late uninit warning to reduce the
          number of false positives from it.  */
I think it would be useful to do one cleanup_eh far earlier after IPA, right now we have cleanup_eh before IPA (but after tailr) and another one only a few passes after tailc almost at the end of GIMPLE passes.
Inlining can introduce cases where EH needs to be cleaned up and perhaps the lack of that hurts even other optimizations than just tailc.
So perhaps around vrp1 (i.e. after some cleanups of the post IPA IL (ccp, forwprop, fre))?

Another possibility is what you have but have a special version of the pass guarded on cfun->has_musttail (though that won't help other tail calls, just musttail).

Or yet another possibility is not to handle musttail calls during tailc (so, pretty much set diag_musttail only in musttail; then it would be the same thing as only_musttail and could be merged into just one flag) pass and do it always in the musttail pass. This won't help tail calls other than musttail as well.

Now the second or third option are maybe slightly safer this late in stage4, but perhaps we have still time to add another cleanup_eh.

Richi, what do you think about this?
On Thu, 27 Mar 2025, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119491
>
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
>
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org,
>                    |                            |rguenth at gcc dot gnu.org
>
> --- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> I think it would be useful to do one cleanup_eh far earlier after IPA, right
> now we have cleanup_eh before IPA (but after tailr) and another one only a few
> passes after tailc almost at the end of GIMPLE passes.
> Inlining can introduce cases where EH needs to be cleaned up and perhaps the
> lack of that hurts even other optimizations than just tailc.
> So perhaps around vrp1 (i.e. after some cleanups of the post IPA IL (ccp,
> forwprop, fre))?
>
> Another possibility is what you have but have a special version of the pass
> guarded on cfun->has_musttail (though that won't help other tail calls, just
> musttail).
>
> Or yet another possibility is not to handle musttail calls during tailc (so,
> pretty much set diag_musttail only in musttail; then it would be the same thing
> as only_musttail and could be merged into just one flag) pass and do it always
> in the musttail pass. This won't help tail calls other than musttail as well.
>
> Now the second or third option are maybe slightly safer this late in stage4,
> but perhaps we have still time to add another cleanup_eh.
>
> Richi, what do you think about this?

It probably makes sense to cleanup EH either right after inlining when we inlined a "substantial" EH tree or after the first round of scalar cleanups post-IPA which would usually be after the DSE/DCE pair which itself is a bit late, only after jump threading & VRP and after array bound diags (uh?).

I'm not sure we want to shuffle passes at this point though.
Created attachment 60952 gcc15-pr119491.patch Ok, in that case the following patch attempts to handle it (for musttail only) in the tailc/musttail passes.
Created attachment 60957 gcc15-pr119491.patch The previous patch caused quite a lot of regressions; this ought to fix them, but I haven't done a full bootstrap/regtest with it, just with an earlier version + incremental fix + make check on the testcases that failed previously.
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:8ea537988f718f026d94b558e09479c3b5fe088a

commit r15-9154-g8ea537988f718f026d94b558e09479c3b5fe088a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Apr 2 20:02:34 2025 +0200

    tailc: Deal with trivially useless EH cleanups [PR119491]

    The following testcases FAIL, because EH cleanup is performed only before
    IPA and then right before musttail pass.
    At -O2 etc. (except for -O0/-Og) we handle musttail calls in the tailc
    pass though, and we can fail at that point because the calls might appear
    to throw internal exceptions which just don't do anything interesting
    (perhaps have debug statements or clobber statements in them) before they
    continue with resume of the exception (i.e. throw it externally).

    As Richi said in the PR (and I agree) that moving passes is risky at this
    point, the following patch instead teaches the tail{r,c} and musttail
    passes to deal with such extra EDGE_EH edges.

    It is fairly simple thing, if we see an EDGE_EH edge from the call we
    just look up where it lands and if there are no
    non-debug/non-clobber/non-label statements before resx which throws
    externally, such edge can be ignored for tail call optimization or
    tail recursion.  At other spots I just need to avoid using
    single_succ/single_succ_edge because the bb might have another edge -
    EDGE_EH.

    To make this less risky, this is done solely for the musttail calls for now.

    2025-04-02  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/119491
            * tree-tailcall.cc (single_non_eh_succ_edge): New function.
            (independent_of_stmt_p): Use single_non_eh_succ_edge (bb)->dest
            instead of single_succ (bb).
            (empty_eh_cleanup): New function.
            (find_tail_calls): Diagnose throwing of exceptions which do not
            propagate only if there are no EDGE_EH successor edges.  If there are
            and the call is musttail, use empty_eh_cleanup to find if the cleanup
            is not empty.  If not or the call is not musttail, use different
            diagnostics.  Set is_noreturn even if there are successor edges.  Use
            single_non_eh_succ_edge (abb) instead of single_succ_edge (abb).  Punt
            on internal noreturn calls.
            (decrease_profile): Don't assert 0 or 1 successor edges.
            (eliminate_tail_call): Use
            single_non_eh_succ_edge (gsi_bb (t->call_gsi)) instead of
            single_succ_edge (gsi_bb (t->call_gsi)).
            (tree_optimize_tail_calls_1): Also look into basic blocks with
            single succ edge which is EDGE_EH for noreturn musttail calls.

            * g++.dg/opt/musttail3.C: New test.
            * g++.dg/opt/musttail4.C: New test.
            * g++.dg/opt/musttail5.C: New test.
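
The check described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in tree-tailcall.cc: `StmtKind`, `Edge`, `Block`, `Cfg`, `single_non_eh_succ` and `eh_cleanup_is_empty` are hypothetical names invented for illustration, a `Resx` statement is assumed to always throw externally, and none of GCC's real GIMPLE or CFG APIs are used.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for statement kinds (hypothetical, not GCC's types).
enum class StmtKind { Label, Debug, Clobber, Resx, Other };

struct Edge {
  int dest;    // index of the destination block
  bool is_eh;  // models the EDGE_EH flag
};

struct Block {
  std::vector<StmtKind> stmts;
  std::vector<Edge> succs;
};

using Cfg = std::vector<Block>;

// Returns the only non-EH successor edge of block BB, or nullptr if
// there is none or more than one.
const Edge *single_non_eh_succ(const Cfg &cfg, int bb) {
  const Edge *found = nullptr;
  for (const Edge &e : cfg[bb].succs) {
    if (e.is_eh)
      continue;
    if (found)
      return nullptr;
    found = &e;
  }
  return found;
}

// Returns true if the EH landing pad starting at block BB reaches a
// resx after seeing only labels, debug statements and clobbers.
// Assumption of this toy model: Resx always throws externally.
bool eh_cleanup_is_empty(const Cfg &cfg, int bb) {
  // Bound the walk so a cyclic toy CFG cannot loop forever.
  for (std::size_t steps = 0; steps <= cfg.size(); ++steps) {
    for (StmtKind k : cfg[bb].stmts) {
      if (k == StmtKind::Resx)
        return true;   // nothing interesting before the resume
      if (k == StmtKind::Other)
        return false;  // a real cleanup, the edge cannot be ignored
      // Label, Debug and Clobber are skipped.
    }
    // No resx in this block: follow a sole successor, otherwise give up.
    if (cfg[bb].succs.size() != 1)
      return false;
    bb = cfg[bb].succs[0].dest;
  }
  return false;
}
```

In this model, a call block with one normal edge and one EH edge whose landing pad is empty in the above sense would be treated as if the EH edge were absent, which mirrors the shape of the `<bb 5>` / `<bb 8>` dump from the original report.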
Fixed.
Note that the bug report about the missed tail call optimization for non-musttail calls in this situation is PR 28850.
The patch does not work as is for sjlj exceptions.
GCC rejects tail calls if sjlj exceptions are in use; please file a separate bug for that.
Created attachment 61844
Patch to make optimization apply to sjlj targets

This patch allows musttail3.C and musttail5.C to be tailcall-optimized for sjlj targets. musttail4.C still fails due to getting a different diagnostic than expected:

../../gcc/gcc/testsuite/g++.dg/opt/musttail4.C: In function 'int bar()':
../../gcc/gcc/testsuite/g++.dg/opt/musttail4.C:13:32: error: cannot tail-call: caller uses sjlj exceptions
   13 |   [[gnu::musttail]] return foo ();      // { dg-error "cannot tail-call: call may throw exception that does not propagate" }