Bug 59813

Summary: tail-call elimination didn't fire for left-shift of char to cout
Product: gcc Reporter: Nikolay Orliuk <virkony>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: daniel.kruegler, dje, jakub, ktietz, webrown.cpp
Priority: P3 Keywords: missed-optimization
Version: 4.8.2   
Target Milestone: ---   
URL: http://gcc.gnu.org/ml/gcc-patches/2019-02/msg00424.html
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2014-01-15 00:00:00
Bug Depends on:    
Bug Blocks: 77938    
Attachments: gcc10-pr59813-aarch64.patch

Description Nikolay Orliuk 2014-01-14 21:35:28 UTC
#include <iostream>
using namespace std;

void foo()
{
    cout << "x" << endl; // ok
    cout << 'x' << endl; // kills tail-call elimination in gcc 4.8.2
    foo();
}

int main() { foo(); return 0; }

// core-dups by long stack while in 4.7.3 works as expected (infinite loop)
Comment 1 Nikolay Orliuk 2014-01-14 21:59:14 UTC
In 4.7.3 that code works, but changing it to

void foo()
{
    cout << "x" << endl; // ok
    cout << 'x' << endl; // kills tail-call elimination in gcc 4.8.2
    struct {} bar; // kills tail-call elimination in 4.7.3
    foo();
}

Removing either "bar" or 'x' results in normal infinite loop.
In 4.5.4 this works fine as is.
Comment 2 Nikolay Orliuk 2014-01-14 22:05:53 UTC
My 4.5.4 built without graphite support.
Both 4.7.3 and 4.8.2 built with cloog 0.17.0 and isl 0.11.2
Comment 3 Andrew Pinski 2014-01-15 00:13:22 UTC
The early inliner is inlining a function which contains a variable which maybe escapes (address taken) which is causing the tail-call elimination not to happen.  There are a few ways of fixing this:
1)  Changing the the early inlining heuristics so it will not inline functions where local variables have their address taken.
2) Or have a tail-call optimize pass before early inlining.
Comment 4 Nikolay Orliuk 2014-01-15 07:17:59 UTC
Andrew Pinski, as long as address of variable isn't taken out of scope of function that is being tail-call optimized there is no need to worry about it and it is safe to optimize. Am I wrong?

If stdc++ lib contains code like:

ostream &operator<<(ostream &os, char x)
{ os.__escape = &x; }

That's probably wrong and should be fixed.

Tried to override with:

ostream &operator<<(ostream &os, char x)
{ cout.put(x); return os; } // works on all 4.5.4, 4.7.3 and 4.8.2

ostream &operator<<(ostream &os, char x)
{ cout.write(&x, 1); return os; } // fails even without "bar" for 4.7.3 and 4.8.2

Note that strange behaviour of 4.7.3. Is that means that adding "bar" fires inlining?..
Comment 5 Jakub Jelinek 2014-01-15 07:23:44 UTC
If you only care about tail recursion and not tail call optimization, perhaps, but either solution would be very dumb.  We now have var ={v} {CLOBBER}; stmts, even if those vars escape and are address taken, if there is a clobber stmt for those vars on every path from the escape point to the tail call candidate, we could perform tail recursion or tail call optimization, what we are really after is that the address of no automatic variable from the current function can escape to the tail recursion/call candidate.
As in:
void bar (char *);
void
foo (char *p)
{
  char c;
  bar (&c);
  bar (NULL); // We can't tail call optimize this
}

void
baz (char *p)
{
  {
    char c;
    bar (&c);
  }
  bar (NULL); // We could tail call optimize this
}
Comment 6 Jakub Jelinek 2019-01-25 15:51:42 UTC
*** Bug 89060 has been marked as a duplicate of this bug. ***
Comment 7 Jakub Jelinek 2019-01-25 15:52:49 UTC
*** Bug 77938 has been marked as a duplicate of this bug. ***
Comment 8 Jakub Jelinek 2019-05-08 17:07:18 UTC
Author: jakub
Date: Wed May  8 17:06:46 2019
New Revision: 271013

URL: https://gcc.gnu.org/viewcvs?rev=271013&root=gcc&view=rev
Log:
	PR c++/59813
	PR tree-optimization/89060
	* tree-ssa-live.h (live_vars_map): New typedef.
	(compute_live_vars, live_vars_at_stmt, destroy_live_vars): Declare.
	* tree-ssa-live.c: Include gimple-walk.h and cfganal.h.
	(struct compute_live_vars_data): New type.
	(compute_live_vars_visit, compute_live_vars_1, compute_live_vars,
	live_vars_at_stmt, destroy_live_vars): New functions.
	* tree-tailcall.c: Include tree-ssa-live.h.
	(live_vars, live_vars_vec): New global variables.
	(find_tail_calls): Perform variable life analysis before punting.
	(tree_optimize_tail_calls_1): Clean up live_vars and live_vars_vec.
	* tree-inline.h (struct copy_body_data): Add eh_landing_pad_dest
	member.
	* tree-inline.c (add_clobbers_to_eh_landing_pad): Remove BB argument.
	Perform variable life analysis to select variables that really need
	clobbers added.
	(copy_edges_for_bb): Don't call add_clobbers_to_eh_landing_pad here,
	instead set id->eh_landing_pad_dest and assert it is the same.
	(copy_cfg_body): Call it here if id->eh_landing_pad_dest is non-NULL.

	* gcc.dg/tree-ssa/pr89060.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr89060.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-inline.c
    trunk/gcc/tree-inline.h
    trunk/gcc/tree-ssa-live.c
    trunk/gcc/tree-ssa-live.h
    trunk/gcc/tree-tailcall.c
Comment 9 Christophe Lyon 2019-05-09 12:50:39 UTC
(In reply to Jakub Jelinek from comment #8)
> Author: jakub
> Date: Wed May  8 17:06:46 2019
> New Revision: 271013
> 
> URL: https://gcc.gnu.org/viewcvs?rev=271013&root=gcc&view=rev
> Log:
> 	PR c++/59813
> 	PR tree-optimization/89060
> 	* tree-ssa-live.h (live_vars_map): New typedef.
> 	(compute_live_vars, live_vars_at_stmt, destroy_live_vars): Declare.
> 	* tree-ssa-live.c: Include gimple-walk.h and cfganal.h.
> 	(struct compute_live_vars_data): New type.
> 	(compute_live_vars_visit, compute_live_vars_1, compute_live_vars,
> 	live_vars_at_stmt, destroy_live_vars): New functions.
> 	* tree-tailcall.c: Include tree-ssa-live.h.
> 	(live_vars, live_vars_vec): New global variables.
> 	(find_tail_calls): Perform variable life analysis before punting.
> 	(tree_optimize_tail_calls_1): Clean up live_vars and live_vars_vec.
> 	* tree-inline.h (struct copy_body_data): Add eh_landing_pad_dest
> 	member.
> 	* tree-inline.c (add_clobbers_to_eh_landing_pad): Remove BB argument.
> 	Perform variable life analysis to select variables that really need
> 	clobbers added.
> 	(copy_edges_for_bb): Don't call add_clobbers_to_eh_landing_pad here,
> 	instead set id->eh_landing_pad_dest and assert it is the same.
> 	(copy_cfg_body): Call it here if id->eh_landing_pad_dest is non-NULL.
> 
> 	* gcc.dg/tree-ssa/pr89060.c: New test.
> 
> Added:
>     trunk/gcc/testsuite/gcc.dg/tree-ssa/pr89060.c
> Modified:
>     trunk/gcc/ChangeLog
>     trunk/gcc/testsuite/ChangeLog
>     trunk/gcc/tree-inline.c
>     trunk/gcc/tree-inline.h
>     trunk/gcc/tree-ssa-live.c
>     trunk/gcc/tree-ssa-live.h
>     trunk/gcc/tree-tailcall.c


This patch introduced regressions in libstdc++ on aarch64, I'm now seeing these failures:
    18_support/exception_ptr/lifespan.cc execution test
    18_support/nested_exception/throw_with_nested.cc execution test
    18_support/uncaught_exception/14026.cc execution test
    20_util/specialized_algorithms/memory_management_tools/1.cc execution test
    20_util/unsynchronized_pool_resource/allocate.cc execution test
    20_util/variant/exception_safety.cc execution test
    20_util/variant/run.cc execution test
    21_strings/basic_string/cons/char/69092.cc execution test
    22_locale/locale/cons/12438.cc execution test
    22_locale/numpunct/members/pod/2.cc execution test
    23_containers/deque/cons/2.cc execution test
    23_containers/deque/requirements/exception/basic.cc execution test
    23_containers/deque/requirements/exception/propagation_consistent.cc execution test
    23_containers/forward_list/requirements/exception/basic.cc execution test
    23_containers/forward_list/requirements/exception/propagation_consistent.cc execution test
    23_containers/list/operations/78389.cc execution test
    23_containers/list/requirements/exception/basic.cc execution test
    23_containers/list/requirements/exception/propagation_consistent.cc execution test
    23_containers/map/requirements/exception/basic.cc execution test
    23_containers/map/requirements/exception/propagation_consistent.cc execution test
    23_containers/multimap/requirements/exception/basic.cc execution test
    23_containers/multimap/requirements/exception/propagation_consistent.cc execution test
    23_containers/multiset/requirements/exception/basic.cc execution test
    23_containers/multiset/requirements/exception/propagation_consistent.cc execution test
    23_containers/set/requirements/exception/basic.cc execution test
    23_containers/set/requirements/exception/propagation_consistent.cc execution test
    23_containers/unordered_map/requirements/exception/basic.cc execution test
    23_containers/unordered_map/requirements/exception/propagation_consistent.cc execution test
    23_containers/unordered_multimap/requirements/exception/basic.cc execution test
    23_containers/unordered_multimap/requirements/exception/propagation_consistent.cc execution test
    23_containers/unordered_multiset/insert/hash_policy.cc execution test
    23_containers/unordered_multiset/requirements/exception/basic.cc execution test
    23_containers/unordered_multiset/requirements/exception/propagation_consistent.cc execution test
    23_containers/unordered_set/insert/hash_policy.cc execution test
    23_containers/unordered_set/max_load_factor/robustness.cc execution test
    23_containers/unordered_set/requirements/exception/basic.cc execution test
    23_containers/unordered_set/requirements/exception/propagation_consistent.cc execution test
    23_containers/vector/capacity/2.cc execution test
    23_containers/vector/capacity/resize/strong_guarantee.cc execution test
    23_containers/vector/cons/4.cc execution test
    23_containers/vector/cons/86292.cc execution test
    23_containers/vector/modifiers/push_back/strong_guarantee.cc execution test
    23_containers/vector/requirements/exception/basic.cc execution test
    23_containers/vector/requirements/exception/propagation_consistent.cc execution test
    25_algorithms/stable_partition/mem_check.cc execution test
    25_algorithms/stable_sort/mem_check.cc execution test
    27_io/basic_filebuf/close/81256.cc execution test
    27_io/basic_filebuf/cons/wchar_t/10132-1.cc execution test
    27_io/basic_filebuf/overflow/char/13858.cc execution test
    27_io/basic_filebuf/overflow/wchar_t/13858.cc execution test
    27_io/basic_filebuf/seekoff/wchar_t/3.cc execution test
    27_io/basic_filebuf/seekpos/wchar_t/1.cc execution test
    27_io/basic_filebuf/sync/char/9182-1.cc execution test
    27_io/basic_fstream/53984.cc execution test
    27_io/basic_istream/exceptions/char/9561.cc execution test
    27_io/basic_istream/exceptions/wchar_t/9561.cc execution test
    27_io/basic_istream/extractors_arithmetic/char/exceptions_badbit_throw.cc execution test
    27_io/basic_istream/extractors_arithmetic/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_istream/extractors_other/char/exceptions_failbit_throw.cc execution test
    27_io/basic_istream/extractors_other/wchar_t/exceptions_failbit_throw.cc execution test
    27_io/basic_istream/seekg/char/exceptions_badbit_throw.cc execution test
    27_io/basic_istream/seekg/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_istream/tellg/char/exceptions_badbit_throw.cc execution test
    27_io/basic_istream/tellg/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/exceptions/char/9561.cc execution test
    27_io/basic_ostream/exceptions/wchar_t/9561.cc execution test
    27_io/basic_ostream/flush/char/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/flush/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/inserters_arithmetic/char/9555-oa.cc execution test
    27_io/basic_ostream/inserters_arithmetic/char/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/inserters_arithmetic/wchar_t/9555-oa.cc execution test
    27_io/basic_ostream/inserters_arithmetic/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/inserters_character/char/9555-oc.cc execution test
    27_io/basic_ostream/inserters_character/wchar_t/9555-oc.cc execution test
    27_io/basic_ostream/inserters_other/char/exceptions_failbit_throw.cc execution test
    27_io/basic_ostream/inserters_other/wchar_t/exceptions_failbit_throw.cc execution test
    27_io/basic_ostream/put/char/1.cc execution test
    27_io/basic_ostream/put/wchar_t/1.cc execution test
    27_io/basic_ostream/seekp/char/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/seekp/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/tellp/char/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/tellp/wchar_t/exceptions_badbit_throw.cc execution test
    27_io/basic_ostream/write/char/1.cc execution test
    27_io/basic_ostream/write/wchar_t/1.cc execution test
    27_io/ios_base/failure/dual_abi.cc execution test
    30_threads/async/except.cc execution test
    ext/enc_filebuf/char/13189.cc execution test
    ext/enc_filebuf/wchar_t/13189.cc execution test
    ext/pb_ds/example/hash_illegal_resize.cc execution test
    ext/vstring/requirements/exception/propagation_consistent.cc execution test
    libstdc++-prettyprinters/cxx17.cc execution test
Comment 10 Christophe Lyon 2019-05-09 13:09:27 UTC
And some regressions in g++ too:
    g++.dg/compat/eh/unexpected1 cp_compat_x_tst.o-cp_compat_y_tst.o execute 
    g++.dg/cpp0x/lambda/lambda-eh2.C  -std=gnu++14 execution test
    g++.dg/cpp0x/nullptr35.C  -std=c++14 execution test
    g++.dg/cpp0x/nullptr35.C  -std=c++17 execution test
    g++.dg/eh/registers1.C  -std=gnu++14 execution test
    g++.dg/eh/registers1.C  -std=gnu++17 execution test
    g++.dg/eh/registers1.C  -std=gnu++98 execution test
    g++.dg/eh/uncaught1.C  -std=gnu++14 execution test
    g++.dg/eh/uncaught1.C  -std=gnu++17 execution test
    g++.dg/eh/uncaught1.C  -std=gnu++98 execution test
    g++.dg/eh/uncaught4.C  -std=gnu++14 execution test
    g++.dg/eh/uncaught4.C  -std=gnu++17 execution test
    g++.dg/eh/uncaught4.C  -std=gnu++98 execution test
    g++.dg/eh/unexpected1.C  -std=c++14 execution test
    g++.dg/eh/unexpected1.C  -std=c++98 execution test
    g++.old-deja/g++.abi/cxa_vec.C  -std=gnu++14 execution test
    g++.old-deja/g++.abi/cxa_vec.C  -std=gnu++17 execution test
    g++.old-deja/g++.abi/cxa_vec.C  -std=gnu++98 execution test
    g++.old-deja/g++.eh/badalloc1.C  -std=c++14 execution test
    g++.old-deja/g++.eh/badalloc1.C  -std=c++17 execution test
    g++.old-deja/g++.eh/badalloc1.C  -std=c++98 execution test
    g++.old-deja/g++.eh/fntry1.C  -std=c++14 execution test
    g++.old-deja/g++.eh/fntry1.C  -std=c++17 execution test
    g++.old-deja/g++.eh/fntry1.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow1.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow1.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow1.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow2.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow2.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow2.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow3.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow3.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow3.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow4.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow4.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow4.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow5.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow5.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow5.C  -std=c++98 execution test
    g++.old-deja/g++.eh/rethrow6.C  -std=c++14 execution test
    g++.old-deja/g++.eh/rethrow6.C  -std=c++17 execution test
    g++.old-deja/g++.eh/rethrow6.C  -std=c++98 execution test
    g++.old-deja/g++.eh/spec2.C  -std=c++14 execution test
    g++.old-deja/g++.eh/spec2.C  -std=c++98 execution test
    g++.old-deja/g++.mike/eh23.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh23.C  -std=gnu++17 execution test
    g++.old-deja/g++.mike/eh23.C  -std=gnu++98 execution test
    g++.old-deja/g++.mike/eh33.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh33.C  -std=gnu++98 execution test
    g++.old-deja/g++.mike/eh39.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh39.C  -std=gnu++17 execution test
    g++.old-deja/g++.mike/eh39.C  -std=gnu++98 execution test
    g++.old-deja/g++.mike/eh40.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh40.C  -std=gnu++17 execution test
    g++.old-deja/g++.mike/eh40.C  -std=gnu++98 execution test
    g++.old-deja/g++.mike/eh50.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh50.C  -std=gnu++98 execution test
    g++.old-deja/g++.mike/eh51.C  -std=gnu++14 execution test
    g++.old-deja/g++.mike/eh51.C  -std=gnu++98 execution test
    g++.old-deja/g++.robertl/eb31.C  -std=c++14 execution test
    g++.old-deja/g++.robertl/eb31.C  -std=c++17 execution test
    g++.old-deja/g++.robertl/eb31.C  -std=c++98 execution test
Comment 11 Jakub Jelinek 2019-05-09 13:34:39 UTC
Is something in libstdc++ miscompiled or something in the tests?
Like, can you try those tests against libstdc++ built with that change reverted, but test with gcc with that revision in?
If it is in libstdc++, can you bisect to a particular *.o file (built by r271012 vs. r271013)?
Is it the tree-inline.[ch] change, or the tree-tailcall.c change?  For both one needs the tree-ssa-live.[ch] changes, but otherwise the changes are independent (essentially, it could be committed as the tree-ssa-live.[ch] patch, and then either nothing, or tree-inline.[ch], or tree-tailcall.c, or both).
Comment 12 Jakub Jelinek 2019-05-09 16:02:53 UTC
Are you sure about the bisection btw?  I've just reverted those changes, rebuilt cc1plus and rebuilt libstdc++ with that, but get still the same failures.
Comment 13 Jakub Jelinek 2019-05-09 16:38:46 UTC
Ah, seems it is libgcc_s.so.1 rather than libstdc++.  Bisecting.
Comment 14 Jakub Jelinek 2019-05-09 17:27:40 UTC
The only difference the patch makes that matters for those tests is in unwind-dw2.c, where in _Unwind_Resume_or_Rethrow function there is:
-  _20 = _Unwind_RaiseException (exc_4(D));
+  _20 = _Unwind_RaiseException (exc_4(D)); [tail call]
change.  This is in:
{
  struct _Unwind_Context this_context, cur_context;
  _Unwind_Reason_Code code;
  unsigned long frames;



  if (exc->private_1 == 0)
    return _Unwind_RaiseException (exc);
...
}
where this_context and cur_context variables are indeed addressable variables, but are only first touched in the ... code, so before my change we'd refuse to make _Unwind_RaiseException call into a tail call, but now it is a tail call and I don't see anything that would make it look wrong at the GIMPLE level.
If the aarch64 backend can't handle tail calls in functions doing __builtin_eh_return or something similar, then it needs to punt in those special cases.
Comment 15 Jakub Jelinek 2019-05-09 18:13:52 UTC
Created attachment 46332 [details]
gcc10-pr59813-aarch64.patch

Untested fix.  The problem is that after adding sp addition back to the caller's sp in the sibcall epilogue the aarch64 epilogue appends add sp, sp, x4 instruction, which adds a value of a random register to stack pointer (in this case 1) and so the tail called function SIGBUSes.
That addition must be done only in the eh epilogue (aarch64 seems to have only !for_sibcall and for_sibcall style epilogues, so one has to hope that a function with __builtin_eh_return doesn't have more than one normal epilogue, i.e. that they are all merged together).
Comment 16 Jakub Jelinek 2019-05-11 09:33:53 UTC
Author: jakub
Date: Sat May 11 09:33:22 2019
New Revision: 271093

URL: https://gcc.gnu.org/viewcvs?rev=271093&root=gcc&view=rev
Log:
	PR c++/59813
	* config/aarch64/aarch64.c (aarch64_expand_epilogue): Don't add
	EH_RETURN_STACKADJ_RTX to sp in sibcall epilogues.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64.c
Comment 17 Jakub Jelinek 2019-05-20 21:34:18 UTC
Author: jakub
Date: Mon May 20 21:33:46 2019
New Revision: 271440

URL: https://gcc.gnu.org/viewcvs?rev=271440&root=gcc&view=rev
Log:
	PR c++/59813
	PR target/90418
	* function.h (struct function): Add calls_eh_return member.
	* gimplify.c (gimplify_call_expr): Set cfun->calls_eh_return when
	gimplifying __builtin_eh_return call.
	* tree-inline.c (initialize_cfun): Copy calls_eh_return from src_cfun
	to cfun.
	(expand_call_inline): Or in src_cfun->calls_eh_return into
	dst_cfun->calls_eh_return.
	* tree-tailcall.c (suitable_for_tail_call_opt_p): Return false if
	cfun->calls_eh_return.
	* lto-streamer-in.c (input_struct_function_base): Read calls_eh_return.
	* lto-streamer-out.c (output_struct_function_base): Write
	calls_eh_return.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/function.h
    trunk/gcc/gimplify.c
    trunk/gcc/lto-streamer-in.c
    trunk/gcc/lto-streamer-out.c
    trunk/gcc/tree-inline.c
    trunk/gcc/tree-tailcall.c