Bug 120608 - [15 regression] error: cannot tail-call: other reasons when using address sanitizer with musttail
Summary: [15 regression] error: cannot tail-call: other reasons when using address san...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 16.0
: P3 normal
Target Milestone: 15.2
Assignee: Jakub Jelinek
URL:
Keywords: diagnostic, rejects-valid, tail-call
Depends on:
Blocks:
 
Reported: 2025-06-09 18:44 UTC by Carlos Galvez
Modified: 2025-08-04 20:29 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2025-06-09 00:00:00


Attachments
Preprocessed source (316.46 KB, application/gzip)
2025-06-09 18:48 UTC, Carlos Galvez
Details
gcc16-pr120608.patch (4.12 KB, patch)
2025-06-27 16:56 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Galvez 2025-06-09 18:44:33 UTC
Hi,

We are bumping our GCC installation from 6db9d4e954bff3dfd926c7c9b71e41e47b7089c8 to d5050287acd28cbe23df527605449f514a659bba and we encounter this compile error:

src/google/protobuf/generated_message_tctable_lite.cc:310:45: error: cannot tail-call: other reasons
  310 |     PROTOBUF_MUSTTAIL return table->fallback(PROTOBUF_TC_PARAM_PASS);
      |                              ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~

We have reported this to the protobuf repo (https://github.com/protocolbuffers/protobuf/issues/21333), and they suggest to open a bug report here.

Command to reproduce:

g++ -c -fsanitize=address src/google/protobuf/generated_message_tctable_lite.cc -I src/ 

The error does not appear if we don't compile with sanitizer.

Source file: https://github.com/protocolbuffers/protobuf/blob/v21.12/src/google/protobuf/generated_message_tctable_lite.cc

Preprocessed source is attached to this bug report.

Let me know if there's anything else I can provide to help triaging the issue!
Comment 1 Andrew Pinski 2025-06-09 18:46:58 UTC
>Preprocessed source is attached to this bug report.


Looks like it was not. maybe you need to compress it.

Also -fsanitize=address and musttail is very fragile and is known not to always work.
Comment 2 Carlos Galvez 2025-06-09 18:48:06 UTC
Created attachment 61608 [details]
Preprocessed source
Comment 3 Carlos Galvez 2025-06-09 18:48:36 UTC
Sorry, it seems I went over the file limit, it should be up now :)
Comment 4 Andrew Pinski 2025-06-09 19:03:57 UTC
Reducing ...
Comment 5 Jakub Jelinek 2025-06-09 19:17:37 UTC
Reduced:
struct A;
struct B { unsigned long b; };
typedef const char *(*F) (void *u, const char *v, void *w, const A *x, unsigned long y, B z);
struct A { F a; };

const char *
foo (void *u, const char *v, void *w, const A *x, unsigned long y, B z)
{
  [[gnu::musttail]] return x->a(u, v, w, x, y, z);
}
Comment 6 Jakub Jelinek 2025-06-09 19:40:08 UTC
Another testcase:
void foo (int *, int *, int *);
void bar (int *, int *, int *);

void
baz (int *, int *, int *)
{
  int a = 42, b = -42, c = 0;
  foo (&a, &b, &c);
  [[gnu::musttail]] return bar (&a, &b, &c);
}

The reason is that we set disable_tailcalls when expanding blocks when there is
epilogue cleanup sequence emitted by asan_emit_stack_protection.
Wonder if clang emits such sequence before the musttail calls instead or what, will need to investigate.  Or maybe doesn't diagnose use after free in functions calling musttail calls?
Comment 7 Jakub Jelinek 2025-06-09 19:42:59 UTC
Just tried it on godbolt and seems clang just silently doesn't tail call it.  That is IMHO worse behavior than erroring on that out.
Comment 8 Andrew Pinski 2025-06-09 23:19:22 UTC
Another two:
```
struct s1 {};
const char *f(s1 a);
const char *g(s1 a) {
  [[clang::musttail]]
    return f(a);
}
```
and
```
struct s2 {int i;};
const char *f(s2 a);
const char *g(s2 a) {
  [[clang::musttail]]
    return f(a);
}
```
Comment 9 Sam James 2025-06-10 12:09:25 UTC
("Regression" by the same reasoning as before for other musttail bugs.)
Comment 10 Jakub Jelinek 2025-06-10 12:14:43 UTC
(In reply to Jakub Jelinek from comment #7)
> Just tried it on godbolt and seems clang just silently doesn't tail call it.
> That is IMHO worse behavior than erroring on that out.

Sorry, ignore this, PEBKAC, I've used [[gnu::musttail]] attribute in the godbolt case instead of [[clang::musttail]].  With [[clang::musttail]] they actually emit a copy of the sanitizer epilogue sequence right before the tail call, and we should do that too.
Comment 11 Jakub Jelinek 2025-06-12 09:46:36 UTC
Untested patch which fixes the #c5 testcase at -O0 -fsanitize=address.
There all that needs to be done is emit the asan epilogue sequence before the musttail call(s) instead of disabling the tail call when it is present.

This doesn't fix the #c6 testcase or better
void foo (int *, int *, int *);
void bar (int *, int *, int *);

void
baz (int *x, int *y, int *z)
{
  (void) x; (void) y; (void) z;
  int a = 42, b = -42, c = 0;
  foo (&a, &b, &c);
  [[gnu::musttail]] return bar (0, 0, 0);
}
at -O2 -fsanitize=address, because there we need to deal in the tailc pass with .ASAN_MARK POISON calls.

--- gcc/cfgexpand.cc.jj	2025-06-12 10:05:14.961227842 +0200
+++ gcc/cfgexpand.cc	2025-06-12 11:38:06.875469134 +0200
@@ -75,6 +75,7 @@ along with GCC; see the file COPYING3.
 #include "builtins.h"
 #include "opts.h"
 #include "gimple-range.h"
+#include "rtl-iter.h"
 
 /* Some systems use __main in a way incompatible with its use in gcc, in these
    cases use the macros NAME__MAIN to give a quoted symbol and SYMBOL__MAIN to
@@ -4434,9 +4435,10 @@ expand_gimple_stmt (gimple *stmt)
    tailcall) and the normal result happens via a sqrt instruction.  */
 
 static basic_block
-expand_gimple_tailcall (basic_block bb, gcall *stmt, bool *can_fallthru)
+expand_gimple_tailcall (basic_block bb, gcall *stmt, bool *can_fallthru,
+			rtx_insn *asan_epilogue_seq)
 {
-  rtx_insn *last2, *last;
+  rtx_insn *last2, *last, *first = get_last_insn ();
   edge e;
   edge_iterator ei;
   profile_probability probability;
@@ -4453,6 +4455,58 @@ expand_gimple_tailcall (basic_block bb,
   return NULL;
 
  found:
+
+  if (asan_epilogue_seq)
+    {
+      /* We need to emit a copy of the asan_epilogue_seq before
+	 the insns emitted by expand_gimple_stmt above.  The sequence
+	 can contain labels, which need to be remapped.  */
+      hash_map<rtx, rtx> label_map;
+      start_sequence ();
+      emit_note (NOTE_INSN_DELETED);
+      for (rtx_insn *insn = asan_epilogue_seq; insn; insn = NEXT_INSN (insn))
+	switch (GET_CODE (insn))
+	  {
+	  case INSN:
+	  case CALL_INSN:
+	  case JUMP_INSN:
+	    emit_copy_of_insn_after (insn, get_last_insn ());
+	    break;
+	  case CODE_LABEL:
+	    label_map.put ((rtx) insn, (rtx) emit_label (gen_label_rtx ()));
+	    break;
+	  case BARRIER:
+	    emit_barrier ();
+	    break;
+	  default:
+	    gcc_unreachable ();
+	  }
+      for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
+	if (JUMP_P (insn))
+	  {
+	    subrtx_ptr_iterator::array_type array;
+	    FOR_EACH_SUBRTX_PTR (iter, array, &PATTERN (insn), ALL)
+	      {
+		rtx *loc = *iter;
+		if (LABEL_REF_P (*loc))
+		  {
+		    rtx *lab = label_map.get ((rtx) label_ref_label (*loc));
+		    gcc_assert (lab);
+		    set_label_ref_label (*loc, as_a <rtx_insn *> (*lab));
+		  }
+	      }
+	    if (JUMP_LABEL (insn))
+	      {
+		rtx *lab = label_map.get (JUMP_LABEL (insn));
+		gcc_assert (lab);
+		JUMP_LABEL (insn) = *lab;
+	      }
+	  }
+      asan_epilogue_seq = NEXT_INSN (get_insns ());
+      end_sequence ();
+      emit_insn_before (asan_epilogue_seq, NEXT_INSN (first));
+    }
+
   /* ??? Wouldn't it be better to just reset any pending stack adjust?
      Any instructions emitted here are about to be deleted.  */
   do_pending_stack_adjust ();
@@ -6118,7 +6172,7 @@ reorder_operands (basic_block bb)
 /* Expand basic block BB from GIMPLE trees to RTL.  */
 
 static basic_block
-expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
+expand_gimple_basic_block (basic_block bb, rtx_insn *asan_epilogue_seq)
 {
   gimple_stmt_iterator gsi;
   gimple *stmt = NULL;
@@ -6423,14 +6477,16 @@ expand_gimple_basic_block (basic_block b
 	{
 	  gcall *call_stmt = dyn_cast <gcall *> (stmt);
 	  if (call_stmt
+	      && asan_epilogue_seq
 	      && gimple_call_tail_p (call_stmt)
-	      && disable_tail_calls)
+	      && !gimple_call_must_tail_p (call_stmt))
 	    gimple_call_set_tail (call_stmt, false);
 
 	  if (call_stmt && gimple_call_tail_p (call_stmt))
 	    {
 	      bool can_fallthru;
-	      new_bb = expand_gimple_tailcall (bb, call_stmt, &can_fallthru);
+	      new_bb = expand_gimple_tailcall (bb, call_stmt, &can_fallthru,
+					       asan_epilogue_seq);
 	      if (new_bb)
 		{
 		  if (can_fallthru)
@@ -7207,7 +7263,7 @@ pass_expand::execute (function *fun)
   head_end_for_bb.create (last_basic_block_for_fn (fun));
   FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR_FOR_FN (fun),
 		  next_bb)
-    bb = expand_gimple_basic_block (bb, var_ret_seq != NULL_RTX);
+    bb = expand_gimple_basic_block (bb, var_ret_seq);
   disable_ranger (fun);
   FOR_BB_BETWEEN (bb, init_block->next_bb, EXIT_BLOCK_PTR_FOR_FN (fun),
 		  next_bb)
Comment 12 GCC Commits 2025-06-23 14:09:29 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:b9523a935aaa28ffae9118e199a2f43a8a98e27e

commit r16-1625-gb9523a935aaa28ffae9118e199a2f43a8a98e27e
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 23 15:58:55 2025 +0200

    expand: Allow musttail tail calls with -fsanitize=address [PR120608]
    
    The following testcase is rejected by GCC 15 but accepted (with
    s/gnu/clang/) by clang.
    The problem is that we want to execute a sequence of instructions to
    unpoison all automatic variables in the function and mark the var block
    allocated for use-after-return sanitization poisoned after the call,
    so we were just disabling tail calls if there are any instructions
    returned from asan_emit_stack_protection.
    It is fine and necessary for normal tail calls, but for musttail
    tail calls we actually document that accessing the automatic vars of
    the caller is UB as if they end their lifetime right before the tail
    call, so we also want address sanitizer user-after-return to diagnose
    that.
    
    The following patch will only disable normal tail calls when that sequence
    is present, for musttail it will arrange to emit a copy of that sequence
    before the tail call sequence.  That sequence only tweaks the shadow memory
    and nothing in the code emitted by call expansion should touch the shadow
    memory, so it is ok to emit it already before argument setup.
    
    2025-06-23  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * cfgexpand.cc: Include rtl-iter.h.
            (expand_gimple_tailcall): Add ASAN_EPILOG_SEQ argument, if non-NULL
            and expand_gimple_stmt emitted a tail call, emit a copy of that
            insn sequence before the call sequence.
            (expand_gimple_basic_block): Remove DISABLE_TAIL_CALLS argument, add
            ASAN_EPILOG_SEQ argument.  Disable tail call flag only on non-musttail
            calls if that flag is set, pass it to expand_gimple_tailcall.
            (pass_expand::execute): Pass VAR_RET_SEQ directly as last
            expand_gimple_basic_block argument rather than its comparison with
            NULL.
    
            * g++.dg/asan/pr120608.C: New test.
Comment 13 GCC Commits 2025-06-23 14:09:34 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:35a26f2ec55d20d524464c33b68b23328a7f6bbe

commit r16-1626-g35a26f2ec55d20d524464c33b68b23328a7f6bbe
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 23 16:08:34 2025 +0200

    tailc: Allow musttail tail calls with -fsanitize=address [PR120608]
    
    These testcases show another problem with -fsanitize=address
    vs. musttail tail calls.  In particular, there can be
      .ASAN_MARK (POISON, &a, 4);
    etc. calls after a tail call and those just prevent the tailc pass
    to mark the musttail calls as [tail call].
    Normally, the sanopt pass (which comes after tailc) will optimize those
    away, the optimization is if there are no .ASAN_CHECK calls or normal
    function calls dominated by those .ASAN_MARK (POSION, ...) calls, the
    poison is not needed, because in the epilog sequence (the one dealt with
    in the patch posted earlier today) all the stack slots are unpoisoned anyway
    (or poisoned for use-after-return).
    Unlike __builtin_tsan_exit_function, .ASAN_MARK is not a real function
    and is always expanded inline, so can be never tail called successfully,
    so the patch just ignores those for the cfun->has_musttail && diag_musttail
    cases.  If there is a non-musttail call, it will fail worst case during
    expansion because there is the epilog asan sequence.
    
    2025-06-12  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * tree-tailcall.cc (empty_eh_cleanup): Ignore .ASAN_MARK (POISON)
            internal calls for the cfun->has_musttail case and diag_musttail.
            (find_tail_calls): Likewise.
    
            * c-c++-common/asan/pr120608-1.c: New test.
            * c-c++-common/asan/pr120608-2.c: New test.
Comment 14 GCC Commits 2025-06-23 14:25:29 UTC
The releases/gcc-15 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:e5cf6027581770e97790f6495a56515ea4d0f7c2

commit r15-9854-ge5cf6027581770e97790f6495a56515ea4d0f7c2
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 23 15:58:55 2025 +0200

    expand: Allow musttail tail calls with -fsanitize=address [PR120608]
    
    The following testcase is rejected by GCC 15 but accepted (with
    s/gnu/clang/) by clang.
    The problem is that we want to execute a sequence of instructions to
    unpoison all automatic variables in the function and mark the var block
    allocated for use-after-return sanitization poisoned after the call,
    so we were just disabling tail calls if there are any instructions
    returned from asan_emit_stack_protection.
    It is fine and necessary for normal tail calls, but for musttail
    tail calls we actually document that accessing the automatic vars of
    the caller is UB as if they end their lifetime right before the tail
    call, so we also want address sanitizer user-after-return to diagnose
    that.
    
    The following patch will only disable normal tail calls when that sequence
    is present, for musttail it will arrange to emit a copy of that sequence
    before the tail call sequence.  That sequence only tweaks the shadow memory
    and nothing in the code emitted by call expansion should touch the shadow
    memory, so it is ok to emit it already before argument setup.
    
    2025-06-23  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * cfgexpand.cc: Include rtl-iter.h.
            (expand_gimple_tailcall): Add ASAN_EPILOG_SEQ argument, if non-NULL
            and expand_gimple_stmt emitted a tail call, emit a copy of that
            insn sequence before the call sequence.
            (expand_gimple_basic_block): Remove DISABLE_TAIL_CALLS argument, add
            ASAN_EPILOG_SEQ argument.  Disable tail call flag only on non-musttail
            calls if that flag is set, pass it to expand_gimple_tailcall.
            (pass_expand::execute): Pass VAR_RET_SEQ directly as last
            expand_gimple_basic_block argument rather than its comparison with
            NULL.
    
            * g++.dg/asan/pr120608.C: New test.
    
    (cherry picked from commit b9523a935aaa28ffae9118e199a2f43a8a98e27e)
Comment 15 GCC Commits 2025-06-23 14:25:34 UTC
The releases/gcc-15 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:fa2e03effa5251a6f7c8b79a8e3be81c90fb5e4f

commit r15-9855-gfa2e03effa5251a6f7c8b79a8e3be81c90fb5e4f
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Mon Jun 23 16:08:34 2025 +0200

    tailc: Allow musttail tail calls with -fsanitize=address [PR120608]
    
    These testcases show another problem with -fsanitize=address
    vs. musttail tail calls.  In particular, there can be
      .ASAN_MARK (POISON, &a, 4);
    etc. calls after a tail call and those just prevent the tailc pass
    to mark the musttail calls as [tail call].
    Normally, the sanopt pass (which comes after tailc) will optimize those
    away, the optimization is if there are no .ASAN_CHECK calls or normal
    function calls dominated by those .ASAN_MARK (POSION, ...) calls, the
    poison is not needed, because in the epilog sequence (the one dealt with
    in the patch posted earlier today) all the stack slots are unpoisoned anyway
    (or poisoned for use-after-return).
    Unlike __builtin_tsan_exit_function, .ASAN_MARK is not a real function
    and is always expanded inline, so can be never tail called successfully,
    so the patch just ignores those for the cfun->has_musttail && diag_musttail
    cases.  If there is a non-musttail call, it will fail worst case during
    expansion because there is the epilog asan sequence.
    
    2025-06-12  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * tree-tailcall.cc (empty_eh_cleanup): Ignore .ASAN_MARK (POISON)
            internal calls for the cfun->has_musttail case and diag_musttail.
            (find_tail_calls): Likewise.
    
            * c-c++-common/asan/pr120608-1.c: New test.
            * c-c++-common/asan/pr120608-2.c: New test.
    
    (cherry picked from commit 35a26f2ec55d20d524464c33b68b23328a7f6bbe)
Comment 16 Jakub Jelinek 2025-06-23 14:28:10 UTC
Should be fixed now.
Comment 17 Carlos Galvez 2025-06-25 08:08:02 UTC
@Jakub Jelinek

Thanks a lot for the fixes! Unfortunately the problem still remains on my original post, with the same preprocessed file, but on a different line:

g++ -fsanitize=address error.ii
src/google/protobuf/generated_message_tctable_lite.cc:1065:57: error: cannot tail-call: other reasons

This is on GCC 5bc92717b804483a17dd5095f8b6d4fd75a472b1
Comment 18 Sam James 2025-06-25 23:44:23 UTC
,
Comment 19 Jakub Jelinek 2025-06-26 10:02:10 UTC
Why are you using the attribute at -O0?

In any case, this boils down to roughly -O0 -fsanitize=address
[[gnu::noipa]] int
foo (int x)
{
  return x;
}

[[gnu::noipa]] void
bar (int *x, int *y, int *z)
{
  (void) x;
  (void) y;
  (void) z;
}

[[gnu::noipa]] int
baz (int x)
{
  int a = 4;
  {
    int b = 8;
    {
      int c = 10;
      bar (&a, &b, &c);
      if (a + b + c == 22)
	[[gnu::musttail]] return foo (x);
      bar (&a, &b, &c);
    }
    bar (&a, &b, &a);
  }
  bar (&a, &a, &a);
  return 42;
}

During gimplification, .ASAN_MARK (POISON, ...); calls are added as try ... finally.
So we get something like:
  try
    {
      .ASAN_MARK (UNPOISON, &a, 4);
      a = 4;
      {
        int b;

        try
          {
            .ASAN_MARK (UNPOISON, &b, 4);
            b = 8;
            {
              int c;

              try
                {
                  .ASAN_MARK (UNPOISON, &c, 4);
                  c = 10;
...
                  D.3414 = foo (x); [must tail call]
                  // predicted unlikely by early return (on trees) predictor.
                  return D.3414;
...
                }
              finally
                {
                  .ASAN_MARK (POISON, &c, 4);
                }
...
          }
        finally
          {
            .ASAN_MARK (POISON, &b, 4);
          }
...
    }
  finally
    {
      .ASAN_MARK (POISON, &a, 4);
    }

Now, the eh pass turns those into
  D.3414 = foo (x); [must tail call]
  // predicted unlikely by early return (on trees) predictor.
  finally_tmp.3 = 0;
  goto <D.3417>;
...
  <D.3417>:
  .ASAN_MARK (POISON, &c, 4);
  switch (finally_tmp.3) <default: <D.3420>, case 1: <D.3418>>
  <D.3418>:
  goto <D.3419>;
  <D.3420>:
  finally_tmp.4 = 0;
  goto <D.3422>;
...
  <D.3422>:
  .ASAN_MARK (POISON, &b, 4);
  switch (finally_tmp.4) <default: <D.3425>, case 1: <D.3423>>
  <D.3423>:
  goto <D.3424>;
  <D.3425>:
  goto <D.3426>;
...
  <D.3426>:
  .ASAN_MARK (POISON, &a, 4);
  goto <D.3415>;
  <D.3415>:
  return D.3414;

And note we've been asked not to optimize anything and so we don't.
Before sanopt0 pass we still have
  _26 = foo (x_24(D)); [must tail call]
  // predicted unlikely by early return (on trees) predictor.
  finally_tmp.3_27 = 0;
  goto <bb 5>; [INV]
...
  <bb 5> :
  # _6 = PHI <_26(3), _23(D)(4)>
  # finally_tmp.3_8 = PHI <finally_tmp.3_27(3), finally_tmp.3_22(4)>
  .ASAN_MARK (POISON, &c, 4);
  if (finally_tmp.3_8 == 1)
    goto <bb 7>; [INV]
  else
    goto <bb 6>; [INV]
  
  <bb 6> :
<L4>:
  finally_tmp.4_31 = 0;
  goto <bb 8>; [INV]
...
  <bb 8> :
  # finally_tmp.4_9 = PHI <finally_tmp.4_31(6), finally_tmp.4_30(7)>
  .ASAN_MARK (POISON, &b, 4);
  if (finally_tmp.4_9 == 1)
    goto <bb 9>; [INV]
  else
    goto <bb 10>; [INV]
...
  <bb 10> :
  # _7 = PHI <_6(8), _34(9)>
  .ASAN_MARK (POISON, &a, 4);

  <bb 11> :
<L11>:
  return _7;

And then sanopt0 actually comes before musttail pass (the -O0 special copy of that), so .ASAN_MARK calls
are lowered into something musttail pass has no easy way to match.

So, in order to deal with this, we'd need to do something with pass ordering:
  NEXT_PASS (pass_sanopt);
  NEXT_PASS (pass_cleanup_eh);
  NEXT_PASS (pass_musttail);
We want the musttail pass before sanopt, but am not sure if we still rely on cleanup_eh or not (I think tailc/musttail
pass has workarounds for that), so maybe simply moving pass_musttail 2 lines up would work.
And another problem is the lack of forward propagation of the finally_tmp.* SSA_NAMEs into PHI nodes and whether
tailc/musttail will be able to deal with the GIMPLE_CONDs in there.  If we track through which edges we go from the
musttail call to the return path and we see GIMPLE_CONDs on that path, we could look it up; e.g. for the
if (finally_tmp.3_8 == 1) case, see it defined in # finally_tmp.3_8 = PHI <finally_tmp.3_27(3), finally_tmp.3_22(4)>
and because we came to bb 5 through the 3->5 edge, look at finally_tmp.3_27 SSA_NAME_DEF_STMT and because it is 0,
figure out the condition is false, etc.
Comment 20 Jakub Jelinek 2025-06-27 16:56:10 UTC
Created attachment 61735 [details]
gcc16-pr120608.patch

Untested fix.
Comment 21 Carlos Galvez 2025-06-29 13:14:53 UTC
> Why are you using the attribute at -O0?

We typically run our sanitizer builds at -O0 to ensure no UB is optimized away before the sanitizer gets a chance to detect it. Is there a more suitable optimization level for sanitizer?

I can also mention the problematic code is not ours, but as good practice we aim to build all source code (internal and external) with the same flags.
Comment 22 GCC Commits 2025-07-01 09:27:54 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:b610132ddbe4cb870b9c2752053616dcf12979fe

commit r16-1886-gb610132ddbe4cb870b9c2752053616dcf12979fe
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jul 1 11:26:45 2025 +0200

    tailc: Handle musttail in case of non-cleaned-up cleanups, especially ASan related [PR120608]
    
    The following testcases FAIL at -O0 -fsanitize=address.  The problem is
    we end up with something like
      _26 = foo (x_24(D)); [must tail call]
      // predicted unlikely by early return (on trees) predictor.
      finally_tmp.3_27 = 0;
      goto <bb 5>; [INV]
    ...
      <bb 5> :
      # _6 = PHI <_26(3), _23(D)(4)>
      # finally_tmp.3_8 = PHI <finally_tmp.3_27(3), finally_tmp.3_22(4)>
      .ASAN_MARK (POISON, &c, 4);
      if (finally_tmp.3_8 == 1)
        goto <bb 7>; [INV]
      else
        goto <bb 6>; [INV]
    
      <bb 6> :
    <L4>:
      finally_tmp.4_31 = 0;
      goto <bb 8>; [INV]
    ...
      <bb 8> :
      # finally_tmp.4_9 = PHI <finally_tmp.4_31(6), finally_tmp.4_30(7)>
      .ASAN_MARK (POISON, &b, 4);
      if (finally_tmp.4_9 == 1)
        goto <bb 9>; [INV]
      else
        goto <bb 10>; [INV]
    ...
      <bb 10> :
      # _7 = PHI <_6(8), _34(9)>
      .ASAN_MARK (POISON, &a, 4);
    
      <bb 11> :
    <L11>:
      return _7;
    before the sanopt pass.  This is -O0, we don't try to do forward
    propagation, jump threading etc.  And what is worse, the sanopt
    pass lowers the .ASAN_MARK calls that the tailc/musttail passes
    already handle into somewthing that they can't easily pattern match.
    
    The following patch fixes that by
    1) moving the musttail pass 2 passes earlier (this is mostly just
       for -O0/-Og, for normal optimization levels musttail calls are
       handled in the tailc pass), i.e. across the sanopt and cleanup_eh
       passes
    2) recognizes these finally_tmp SSA_NAME assignments, PHIs using those
       and GIMPLE_CONDs deciding based on those both on the backwards
       walk (when we start from the edges to EXIT) and forwards walk
       (when we find a candidate tail call and process assignments
       after those up to the return statement).  For backwards walk,
       ESUCC argument has been added which is either NULL for the
       noreturn musttail case, or the succ edge through which we've
       reached bb and if it sees GIMPLE_COND with such comparison,
       based on the ESUCC and comparison it will remember which later
       edges to ignore later on and which bb must be walked up to the
       start during tail call discovery (the one with the PHI).
    3) the move of musttail pass across cleanup_eh pass resulted in
       g++.dg/opt/pr119613.C regressions but moving cleanup_eh before
       sanopt doesn't work too well, so I've extended
       empty_eh_cleanup to also handle resx which doesn't throw
       externally
    
    I know moving a pass on release branches feels risky, though the
    musttail pass is only relevant to functions with musttail calls,
    so something quite rare and only at -O0/-Og (unless one e.g.
    disables the tailc pass).
    
    2025-07-01  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * passes.def (pass_musttail): Move before pass_sanopt.
            * tree-tailcall.cc (empty_eh_cleanup): Handle GIMPLE_RESX
            which doesn't throw externally through recursion on single
            eh edge (if any and cnt still allows that).
            (find_tail_calls): Add ESUCC, IGNORED_EDGES and MUST_SEE_BBS
            arguments.  Handle GIMPLE_CONDs for non-simplified cleanups with
            finally_tmp temporaries both on backward and forward walks, adjust
            recursive call.
            (tree_optimize_tail_calls_1): Adjust find_tail_calls callers.
    
            * c-c++-common/asan/pr120608-3.c: New test.
            * c-c++-common/asan/pr120608-4.c: New test.
            * g++.dg/asan/pr120608-3.C: New test.
            * g++.dg/asan/pr120608-4.C: New test.
Comment 23 Jakub Jelinek 2025-07-01 09:33:59 UTC
Fixed for 16+ so far.
Regarding -O0 -fsanitize=address, that is just fine, what I was questioning is using the musttail attribute for -O0.  I don't care about quality of generated code, but I do care that this particular call must be tail call optimized.  It can be meaningful if the program would crash otherwise, even at -O0, because of too deep recursion.  But if musttail is just used as an optimization rather than that the code will not really work without it (and in the protobuf case there are many architectures where the attribute is not used, so it must work even without it), then using it for -O0 looks quite pointless to me.  So guarding it on __OPTIMIZE__ > 0 might make sense.
Comment 24 GCC Commits 2025-07-01 10:44:49 UTC
The releases/gcc-15 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ce869854db676c62cd5377adfd4ff62c3bd13257

commit r15-9885-gce869854db676c62cd5377adfd4ff62c3bd13257
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Jul 1 11:26:45 2025 +0200

    tailc: Handle musttail in case of non-cleaned-up cleanups, especially ASan related [PR120608]
    
    The following testcases FAIL at -O0 -fsanitize=address.  The problem is
    we end up with something like
      _26 = foo (x_24(D)); [must tail call]
      // predicted unlikely by early return (on trees) predictor.
      finally_tmp.3_27 = 0;
      goto <bb 5>; [INV]
    ...
      <bb 5> :
      # _6 = PHI <_26(3), _23(D)(4)>
      # finally_tmp.3_8 = PHI <finally_tmp.3_27(3), finally_tmp.3_22(4)>
      .ASAN_MARK (POISON, &c, 4);
      if (finally_tmp.3_8 == 1)
        goto <bb 7>; [INV]
      else
        goto <bb 6>; [INV]
    
      <bb 6> :
    <L4>:
      finally_tmp.4_31 = 0;
      goto <bb 8>; [INV]
    ...
      <bb 8> :
      # finally_tmp.4_9 = PHI <finally_tmp.4_31(6), finally_tmp.4_30(7)>
      .ASAN_MARK (POISON, &b, 4);
      if (finally_tmp.4_9 == 1)
        goto <bb 9>; [INV]
      else
        goto <bb 10>; [INV]
    ...
      <bb 10> :
      # _7 = PHI <_6(8), _34(9)>
      .ASAN_MARK (POISON, &a, 4);
    
      <bb 11> :
    <L11>:
      return _7;
    before the sanopt pass.  This is -O0, we don't try to do forward
    propagation, jump threading etc.  And what is worse, the sanopt
    pass lowers the .ASAN_MARK calls that the tailc/musttail passes
    already handle into somewthing that they can't easily pattern match.
    
    The following patch fixes that by
    1) moving the musttail pass 2 passes earlier (this is mostly just
       for -O0/-Og, for normal optimization levels musttail calls are
       handled in the tailc pass), i.e. across the sanopt and cleanup_eh
       passes
    2) recognizes these finally_tmp SSA_NAME assignments, PHIs using those
       and GIMPLE_CONDs deciding based on those both on the backwards
       walk (when we start from the edges to EXIT) and forwards walk
       (when we find a candidate tail call and process assignments
       after those up to the return statement).  For backwards walk,
       ESUCC argument has been added which is either NULL for the
       noreturn musttail case, or the succ edge through which we've
       reached bb and if it sees GIMPLE_COND with such comparison,
       based on the ESUCC and comparison it will remember which later
       edges to ignore later on and which bb must be walked up to the
       start during tail call discovery (the one with the PHI).
    3) the move of musttail pass across cleanup_eh pass resulted in
       g++.dg/opt/pr119613.C regressions but moving cleanup_eh before
       sanopt doesn't work too well, so I've extended
       empty_eh_cleanup to also handle resx which doesn't throw
       externally
    
    I know moving a pass on release branches feels risky, though the
    musttail pass is only relevant to functions with musttail calls,
    so something quite rare and only at -O0/-Og (unless one e.g.
    disables the tailc pass).
    
    2025-07-01  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/120608
            * passes.def (pass_musttail): Move before pass_sanopt.
            * tree-tailcall.cc (empty_eh_cleanup): Handle GIMPLE_RESX
            which doesn't throw externally through recursion on single
            eh edge (if any and cnt still allows that).
            (find_tail_calls): Add ESUCC, IGNORED_EDGES and MUST_SEE_BBS
            arguments.  Handle GIMPLE_CONDs for non-simplified cleanups with
            finally_tmp temporaries both on backward and forward walks, adjust
            recursive call.
            (tree_optimize_tail_calls_1): Adjust find_tail_calls callers.
    
            * c-c++-common/asan/pr120608-3.c: New test.
            * c-c++-common/asan/pr120608-4.c: New test.
            * g++.dg/asan/pr120608-3.C: New test.
            * g++.dg/asan/pr120608-4.C: New test.
    
    (cherry picked from commit b610132ddbe4cb870b9c2752053616dcf12979fe)
Comment 25 Jakub Jelinek 2025-07-14 10:37:20 UTC
Fixed.
Comment 26 Carlos Galvez 2025-08-04 07:20:17 UTC
Hello!

I am checking this again after my summer vacation, on latest trunk (09f0768b55b96c861811a8989d7c1cc59b4c29b6), and I still get errors on the original preprocessed source file I posted, with the same compilation command:

src/google/protobuf/generated_message_tctable_lite.cc:1065:57: error: cannot tail-call: return value changed after call


This time is not exactly the same error, however. Please let me know if you'd like me to open a separate ticket just for that or would like to continue the tracking here.

Thanks for the help!
Comment 27 Jakub Jelinek 2025-08-04 07:22:03 UTC
Separate bugreport, with the https://gcc.gnu.org/bugs.html requested information.  Thanks.
Comment 28 Carlos Galvez 2025-08-04 07:30:53 UTC
Thanks, here it is :) 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121389