Bug 121389 - [15 regression] Failed musttail with -fsanitize=address -O0 (error: cannot tail-call: return value changed after call)
Summary: [15 regression] Failed musttail with -fsanitize=address -O0 (error: cannot ta...
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 16.0
: P3 normal
Target Milestone: 15.3
Assignee: Jakub Jelinek
URL:
Keywords: rejects-valid, tail-call
Depends on:
Blocks:
 
Reported: 2025-08-04 07:30 UTC by Carlos Galvez
Modified: 2025-08-08 19:43 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2025-08-05 00:00:00


Attachments
Preprocessed source (316.46 KB, application/gzip)
2025-08-04 07:30 UTC, Carlos Galvez
Details
Reduced testcase (134 bytes, text/plain)
2025-08-05 04:01 UTC, Andrew Pinski
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carlos Galvez 2025-08-04 07:30:19 UTC
Created attachment 62042 [details]
Preprocessed source

Hi,

We are bumping our GCC installation from 6db9d4e954bff3dfd926c7c9b71e41e47b7089c8 to 09f0768b55b96c861811a8989d7c1cc59b4c29b6 and encounter a compiler error, please find the relevant info below.

- the exact version of GCC: 09f0768b55b96c861811a8989d7c1cc59b4c29b6
- the system type: Ubuntu 22.04
- the options given when GCC was configured/built:     

    ./configure \
    --enable-languages=c,c++ \
    --enable-checking=release \
    --disable-multilib

- the complete command line that triggers the bug:

  g++ -fsanitize=address -c error.ii

- the compiler output (error messages, warnings, etc.):

In static member function ‘static const char* google::protobuf::internal::TcParser::RepeatedEnum(google::protobuf::MessageLite*, const char*, google::protobuf::internal::ParseContext*, const google::protobuf::internal::TcParseTableBase*, uint64_t, google::protobuf::internal::TcFieldData) [with TagType = unsigned char; short unsigned int xform_val = 768]’,
    inlined from ‘static const char* google::protobuf::internal::TcParser::FastErR1(google::protobuf::MessageLite*, const char*, google::protobuf::internal::ParseContext*, const google::protobuf::internal::TcParseTableBase*, uint64_t, google::protobuf::internal::TcFieldData)’ at src/google/protobuf/generated_message_tctable_lite.cc:1076:75:
src/google/protobuf/generated_message_tctable_lite.cc:1065:57: error: cannot tail-call: return value changed after call

- the preprocessed file (*.i*) that triggers the bug: see attached.

Thanks!
Comment 1 Sam James 2025-08-04 20:29:42 UTC
Oh, wait..

(In reply to Carlos Galvez from comment #0)
> 6db9d4e954bff3dfd926c7c9b71e41e47b7089c8 to
> 09f0768b55b96c861811a8989d7c1cc59b4c29b6

For ease of comparison: r15-7128-g6db9d4e954bff3 to r16-2727-g09f0768b55b96c
Comment 2 Andrew Pinski 2025-08-05 01:52:02 UTC
Reducing ...
Comment 3 Andrew Pinski 2025-08-05 04:01:08 UTC
Created attachment 62051 [details]
Reduced testcase

I added back some code because the loop became not happening.
Comment 4 Andrew Pinski 2025-08-05 04:01:24 UTC
.
Comment 5 Jakub Jelinek 2025-08-05 13:46:59 UTC
Reduced testcase:
int foo (void);
int bar (void);
int baz (unsigned *);

int
bar (void)
{
  do
    {
      unsigned t;
      int u = baz (&t);
      if (u == 42)
	[[gnu::musttail]] return foo ();
      if (u == -42)
        break;
    }
  while (true);
  return 42;
}
Comment 6 Jakub Jelinek 2025-08-05 13:51:16 UTC
In the r16-1886-gb610132ddbe4cb870b9c2752053616dcf12979fe commit I've been expecting finally_tmp.N will be only 0/1, but on this testcase it uses 0/1/2 values and even the assumption that the SSA_NAMEs (e.g. PHI result of finally_tmp.N has a single user is wrong, there can be multiple).
Comment 7 Jakub Jelinek 2025-08-05 13:53:08 UTC
Again I'll mention my rant that using heavily musttail attributes #ifndef __OPTIMIZE__ doesn't make much sense unless it is really required for the program not running out of stack.  In the protobuf case with hundreds of musttail call attributes in each TU I strongly doubt about that.
Comment 8 Jakub Jelinek 2025-08-05 14:11:24 UTC
Modified testcase which uses finally_tmp.N from 0 to 5.
int foo (void);
int bar (void);
int baz (unsigned *);

int
bar (void)
{
  for (int a = 0; a < 420; ++a)
    {
      for (int b = 0; b < 420; ++b)
	{
	  for (int c = 0; c < 420; ++c)
	    {
	      unsigned t;
	      int u = baz (&t);
	      if (u == 42)
		[[gnu::musttail]] return foo ();
	      if (u == -42)
		break;
	      if (u == 16)
		goto l1;
	      if (u == 18)
		goto l2;
	      if (u == 20)
		goto l3;
	    }
	  l3:;
	}
      l2:;
    }
  l1:;
  return 42;
}
Comment 9 Jakub Jelinek 2025-08-05 14:21:58 UTC
And yet another version which with -fsanitize=address -fdisable-tree-switchlower_O0 has there a GIMPLE_SWITCH to process rather than comparison:

int foo (void);
int bar (void);
int baz (unsigned *);

int
bar (void)
{
  for (int a = 0; a < 420; ++a)
    {
      for (int b = 0; b < 420; ++b)
	{
	  for (int c = 0; c < 420; ++c)
	    {
	      unsigned t;
	      int u = baz (&t);
	      if (u == 42)
		[[gnu::musttail]] return foo ();
	      if (u == -42)
		break;
	      if (u == 16)
		goto l1;
	      if (u == 18)
		goto l2;
	      if (u == 20)
		goto l3;
	      switch (u)
		{
		case 100: goto l100;
		case 101: goto l101;
		case 102: goto l102;
		case 103: goto l103;
		case 104: goto l104;
		case 105: goto l105;
		case 106: goto l106;
		case 107: goto l107;
		case 108: goto l108;
		case 109: goto l109;
		case 110: goto l110;
		case 111: goto l111;
		case 112: goto l112;
		case 113: goto l113;
		case 114: goto l114;
		case 115: goto l115;
		case 116: goto l116;
		case 117: goto l117;
		case 118: goto l118;
		case 119: goto l119;
		case 120: goto l120;
		case 121: goto l121;
		case 122: goto l122;
		case 123: goto l123;
		case 124: goto l124;
		case 125: goto l125;
		case 126: goto l126;
		case 127: goto l127;
		case 128: goto l128;
		case 129: goto l129;
		}
	    }
	  l3:;
	  foo ();
	  l100:
	  foo ();
	  l101:
	  foo ();
	  l102:
	  foo ();
	  l103:
	  foo ();
	  l104:
	  foo ();
	  l105:
	  foo ();
	  l106:
	  foo ();
	  l107:
	  foo ();
	  l108:
	  foo ();
	  l109:;
	}
      l2:;
      foo ();
      l110:
      foo ();
      l111:
      foo ();
      l112:
      foo ();
      l113:
      foo ();
      l114:
      foo ();
      l115:
      foo ();
      l116:
      foo ();
      l117:
      foo ();
      l118:
      foo ();
      l119:;
    }
  l1:;
  foo ();
  l120:
  foo ();
  l121:
  foo ();
  l122:
  foo ();
  l123:
  foo ();
  l124:
  foo ();
  l125:
  foo ();
  l126:
  foo ();
  l127:
  foo ();
  l128:
  foo ();
  l129:;
  return 42;
}
Comment 10 Jakub Jelinek 2025-08-05 14:24:30 UTC
And the same without -fdisable-tree-switchlower_O0 has there not just equality/non-equality comparisons but also finally_tmp.NNN_MMM > 33 and the like.
Comment 11 Carlos Galvez 2025-08-06 06:10:26 UTC
> Again I'll mention my rant that using heavily musttail attributes #ifndef __OPTIMIZE__ doesn't make much sense unless it is really required for the program not running out of stack.  In the protobuf case with hundreds of musttail call attributes in each TU I strongly doubt about that.

I'm simply a user of protobuf so I don't really have any opinion on the matter nor deep knowledge as to what the "best" solution is, I'm just reporting my findings in case there's anything you guys think is worth fixing on the GCC side. 

A solution checking the __OPTIMIZE__ macro as you propose sounds reasonable to me, perhaps you'd like to bring it up for discussion in the related Github ticket on the protobuf repo (I see you've been involved already)?
Comment 12 GCC Commits 2025-08-08 07:21:34 UTC
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:18c32a391685256d2fbd6e3e6b2f7f93200bb912

commit r16-3077-g18c32a391685256d2fbd6e3e6b2f7f93200bb912
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Aug 8 09:20:51 2025 +0200

    tailc: Handle other forms of finally_tmp.N conditional cleanups after musttail [PR121389]
    
    My earlier r16-1886 PR120608 change incorrectly assumed that the
    finally_tmp.N vars introduced by eh pass will be only initialized
    to values 0 and 1 and there will be only EQ_EXPR/NE_EXPR comparisons
    of those.
    
    The following testcases show that is a bad assumption, the eh pass
    sets finally_tmp.N vars to 0 up to some highest index depending on
    hoiw many different exits there are from the finally region.
    And it emits then switch (finally_tmp.N) statement for all the
    different cases.  So, if it uses more than 0/1 indexes, the lowering
    of the switch can turn it into a series of GIMPLE_CONDs,
      if (finally_tmp.N_M > 15)
        goto ...
      else
        goto ...
    
      if (finally_tmp.N_M > 7)
        goto ...
      else
        goto ...
    etc. (and that also means no longer single uses).  And if unlucky,
    we can see a non-lowered GIMPLE_SWITCH as well.
    
    So, the following patch removes the assumption that it has to be 0/1
    and EQ_EXPR/NE_EXPR, allows all the normal integral comparisons
    and handles GIMPLE_SWITCH too.
    
    2025-08-08  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/121389
            * tree-tailcall.cc (find_tail_calls): For finally_tmp.N
            handle not just GIMPLE_CONDs with EQ_EXPR/NE_EXPR and only
            values 0 and 1, but arbitrary non-negative values, arbitrary
            comparisons in conditions and also GIMPLE_SWITCH next to
            GIMPLE_CONDs.
    
            * c-c++-common/asan/pr121389-1.c: New test.
            * c-c++-common/asan/pr121389-2.c: New test.
            * c-c++-common/asan/pr121389-3.c: New test.
            * c-c++-common/asan/pr121389-4.c: New test.
Comment 13 Jakub Jelinek 2025-08-08 07:22:33 UTC
Fixed for 16+ so far.
Comment 14 Richard Biener 2025-08-08 07:33:46 UTC
GCC 15.2 is being released, retargeting bugs to 15.3.
Comment 15 GCC Commits 2025-08-08 08:38:24 UTC
The releases/gcc-15 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:d3c5366520e52057bed5b3ba6898366c79687c34

commit r15-10207-gd3c5366520e52057bed5b3ba6898366c79687c34
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Fri Aug 8 09:20:51 2025 +0200

    tailc: Handle other forms of finally_tmp.N conditional cleanups after musttail [PR121389]
    
    My earlier r16-1886 PR120608 change incorrectly assumed that the
    finally_tmp.N vars introduced by eh pass will be only initialized
    to values 0 and 1 and there will be only EQ_EXPR/NE_EXPR comparisons
    of those.
    
    The following testcases show that is a bad assumption, the eh pass
    sets finally_tmp.N vars to 0 up to some highest index depending on
    hoiw many different exits there are from the finally region.
    And it emits then switch (finally_tmp.N) statement for all the
    different cases.  So, if it uses more than 0/1 indexes, the lowering
    of the switch can turn it into a series of GIMPLE_CONDs,
      if (finally_tmp.N_M > 15)
        goto ...
      else
        goto ...
    
      if (finally_tmp.N_M > 7)
        goto ...
      else
        goto ...
    etc. (and that also means no longer single uses).  And if unlucky,
    we can see a non-lowered GIMPLE_SWITCH as well.
    
    So, the following patch removes the assumption that it has to be 0/1
    and EQ_EXPR/NE_EXPR, allows all the normal integral comparisons
    and handles GIMPLE_SWITCH too.
    
    2025-08-08  Jakub Jelinek  <jakub@redhat.com>
    
            PR middle-end/121389
            * tree-tailcall.cc (find_tail_calls): For finally_tmp.N
            handle not just GIMPLE_CONDs with EQ_EXPR/NE_EXPR and only
            values 0 and 1, but arbitrary non-negative values, arbitrary
            comparisons in conditions and also GIMPLE_SWITCH next to
            GIMPLE_CONDs.
    
            * c-c++-common/asan/pr121389-1.c: New test.
            * c-c++-common/asan/pr121389-2.c: New test.
            * c-c++-common/asan/pr121389-3.c: New test.
            * c-c++-common/asan/pr121389-4.c: New test.
    
    (cherry picked from commit 18c32a391685256d2fbd6e3e6b2f7f93200bb912)
Comment 16 Carlos Galvez 2025-08-08 19:29:44 UTC
Thanks for the fix, I've tested on my end and now everything compiles without issues!
Comment 17 Sam James 2025-08-08 19:43:49 UTC
(In reply to Carlos Galvez from comment #11)
> A solution checking the __OPTIMIZE__ macro as you propose sounds reasonable
> to me, perhaps you'd like to bring it up for discussion in the related
> Github ticket on the protobuf repo (I see you've been involved already)?

I will try find some time to do this.