#include <iostream> using namespace std; void foo() { cout << "x" << endl; // ok cout << 'x' << endl; // kills tail-call elimination in gcc 4.8.2 foo(); } int main() { foo(); return 0; } // core-dups by long stack while in 4.7.3 works as expected (infinite loop)
In 4.7.3 that code works, but changing it to void foo() { cout << "x" << endl; // ok cout << 'x' << endl; // kills tail-call elimination in gcc 4.8.2 struct {} bar; // kills tail-call elimination in 4.7.3 foo(); } Removing either "bar" or 'x' results in normal infinite loop. In 4.5.4 this works fine as is.
My 4.5.4 built without graphite support. Both 4.7.3 and 4.8.2 built with cloog 0.17.0 and isl 0.11.2
The early inliner is inlining a function which contains a variable which maybe escapes (address taken) which is causing the tail-call elimination not to happen. There are a few ways of fixing this: 1) Changing the the early inlining heuristics so it will not inline functions where local variables have their address taken. 2) Or have a tail-call optimize pass before early inlining.
Andrew Pinski, as long as address of variable isn't taken out of scope of function that is being tail-call optimized there is no need to worry about it and it is safe to optimize. Am I wrong? If stdc++ lib contains code like: ostream &operator<<(ostream &os, char x) { os.__escape = &x; } That's probably wrong and should be fixed. Tried to override with: ostream &operator<<(ostream &os, char x) { cout.put(x); return os; } // works on all 4.5.4, 4.7.3 and 4.8.2 ostream &operator<<(ostream &os, char x) { cout.write(&x, 1); return os; } // fails even without "bar" for 4.7.3 and 4.8.2 Note that strange behaviour of 4.7.3. Is that means that adding "bar" fires inlining?..
If you only care about tail recursion and not tail call optimization, perhaps, but either solution would be very dumb. We now have var ={v} {CLOBBER}; stmts, even if those vars escape and are address taken, if there is a clobber stmt for those vars on every path from the escape point to the tail call candidate, we could perform tail recursion or tail call optimization, what we are really after is that the address of no automatic variable from the current function can escape to the tail recursion/call candidate. As in: void bar (char *); void foo (char *p) { char c; bar (&c); bar (NULL); // We can't tail call optimize this } void baz (char *p) { { char c; bar (&c); } bar (NULL); // We could tail call optimize this }
*** Bug 89060 has been marked as a duplicate of this bug. ***
*** Bug 77938 has been marked as a duplicate of this bug. ***
Author: jakub Date: Wed May 8 17:06:46 2019 New Revision: 271013 URL: https://gcc.gnu.org/viewcvs?rev=271013&root=gcc&view=rev Log: PR c++/59813 PR tree-optimization/89060 * tree-ssa-live.h (live_vars_map): New typedef. (compute_live_vars, live_vars_at_stmt, destroy_live_vars): Declare. * tree-ssa-live.c: Include gimple-walk.h and cfganal.h. (struct compute_live_vars_data): New type. (compute_live_vars_visit, compute_live_vars_1, compute_live_vars, live_vars_at_stmt, destroy_live_vars): New functions. * tree-tailcall.c: Include tree-ssa-live.h. (live_vars, live_vars_vec): New global variables. (find_tail_calls): Perform variable life analysis before punting. (tree_optimize_tail_calls_1): Clean up live_vars and live_vars_vec. * tree-inline.h (struct copy_body_data): Add eh_landing_pad_dest member. * tree-inline.c (add_clobbers_to_eh_landing_pad): Remove BB argument. Perform variable life analysis to select variables that really need clobbers added. (copy_edges_for_bb): Don't call add_clobbers_to_eh_landing_pad here, instead set id->eh_landing_pad_dest and assert it is the same. (copy_cfg_body): Call it here if id->eh_landing_pad_dest is non-NULL. * gcc.dg/tree-ssa/pr89060.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr89060.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-inline.c trunk/gcc/tree-inline.h trunk/gcc/tree-ssa-live.c trunk/gcc/tree-ssa-live.h trunk/gcc/tree-tailcall.c
(In reply to Jakub Jelinek from comment #8) > Author: jakub > Date: Wed May 8 17:06:46 2019 > New Revision: 271013 > > URL: https://gcc.gnu.org/viewcvs?rev=271013&root=gcc&view=rev > Log: > PR c++/59813 > PR tree-optimization/89060 > * tree-ssa-live.h (live_vars_map): New typedef. > (compute_live_vars, live_vars_at_stmt, destroy_live_vars): Declare. > * tree-ssa-live.c: Include gimple-walk.h and cfganal.h. > (struct compute_live_vars_data): New type. > (compute_live_vars_visit, compute_live_vars_1, compute_live_vars, > live_vars_at_stmt, destroy_live_vars): New functions. > * tree-tailcall.c: Include tree-ssa-live.h. > (live_vars, live_vars_vec): New global variables. > (find_tail_calls): Perform variable life analysis before punting. > (tree_optimize_tail_calls_1): Clean up live_vars and live_vars_vec. > * tree-inline.h (struct copy_body_data): Add eh_landing_pad_dest > member. > * tree-inline.c (add_clobbers_to_eh_landing_pad): Remove BB argument. > Perform variable life analysis to select variables that really need > clobbers added. > (copy_edges_for_bb): Don't call add_clobbers_to_eh_landing_pad here, > instead set id->eh_landing_pad_dest and assert it is the same. > (copy_cfg_body): Call it here if id->eh_landing_pad_dest is non-NULL. > > * gcc.dg/tree-ssa/pr89060.c: New test. > > Added: > trunk/gcc/testsuite/gcc.dg/tree-ssa/pr89060.c > Modified: > trunk/gcc/ChangeLog > trunk/gcc/testsuite/ChangeLog > trunk/gcc/tree-inline.c > trunk/gcc/tree-inline.h > trunk/gcc/tree-ssa-live.c > trunk/gcc/tree-ssa-live.h > trunk/gcc/tree-tailcall.c This patch introduced regressions in libstdc++ on aarch64, I'm now seeing these failures: 18_support/exception_ptr/lifespan.cc execution test 18_support/nested_exception/throw_with_nested.cc execution test 18_support/uncaught_exception/14026.cc execution test 20_util/specialized_algorithms/memory_management_tools/1.cc execution test 20_util/unsynchronized_pool_resource/allocate.cc execution test 20_util/variant/exception_safety.cc execution test 20_util/variant/run.cc execution test 21_strings/basic_string/cons/char/69092.cc execution test 22_locale/locale/cons/12438.cc execution test 22_locale/numpunct/members/pod/2.cc execution test 23_containers/deque/cons/2.cc execution test 23_containers/deque/requirements/exception/basic.cc execution test 23_containers/deque/requirements/exception/propagation_consistent.cc execution test 23_containers/forward_list/requirements/exception/basic.cc execution test 23_containers/forward_list/requirements/exception/propagation_consistent.cc execution test 23_containers/list/operations/78389.cc execution test 23_containers/list/requirements/exception/basic.cc execution test 23_containers/list/requirements/exception/propagation_consistent.cc execution test 23_containers/map/requirements/exception/basic.cc execution test 23_containers/map/requirements/exception/propagation_consistent.cc execution test 23_containers/multimap/requirements/exception/basic.cc execution test 23_containers/multimap/requirements/exception/propagation_consistent.cc execution test 23_containers/multiset/requirements/exception/basic.cc execution test 23_containers/multiset/requirements/exception/propagation_consistent.cc execution test 23_containers/set/requirements/exception/basic.cc execution test 23_containers/set/requirements/exception/propagation_consistent.cc execution test 23_containers/unordered_map/requirements/exception/basic.cc execution test 23_containers/unordered_map/requirements/exception/propagation_consistent.cc execution test 23_containers/unordered_multimap/requirements/exception/basic.cc execution test 23_containers/unordered_multimap/requirements/exception/propagation_consistent.cc execution test 23_containers/unordered_multiset/insert/hash_policy.cc execution test 23_containers/unordered_multiset/requirements/exception/basic.cc execution test 23_containers/unordered_multiset/requirements/exception/propagation_consistent.cc execution test 23_containers/unordered_set/insert/hash_policy.cc execution test 23_containers/unordered_set/max_load_factor/robustness.cc execution test 23_containers/unordered_set/requirements/exception/basic.cc execution test 23_containers/unordered_set/requirements/exception/propagation_consistent.cc execution test 23_containers/vector/capacity/2.cc execution test 23_containers/vector/capacity/resize/strong_guarantee.cc execution test 23_containers/vector/cons/4.cc execution test 23_containers/vector/cons/86292.cc execution test 23_containers/vector/modifiers/push_back/strong_guarantee.cc execution test 23_containers/vector/requirements/exception/basic.cc execution test 23_containers/vector/requirements/exception/propagation_consistent.cc execution test 25_algorithms/stable_partition/mem_check.cc execution test 25_algorithms/stable_sort/mem_check.cc execution test 27_io/basic_filebuf/close/81256.cc execution test 27_io/basic_filebuf/cons/wchar_t/10132-1.cc execution test 27_io/basic_filebuf/overflow/char/13858.cc execution test 27_io/basic_filebuf/overflow/wchar_t/13858.cc execution test 27_io/basic_filebuf/seekoff/wchar_t/3.cc execution test 27_io/basic_filebuf/seekpos/wchar_t/1.cc execution test 27_io/basic_filebuf/sync/char/9182-1.cc execution test 27_io/basic_fstream/53984.cc execution test 27_io/basic_istream/exceptions/char/9561.cc execution test 27_io/basic_istream/exceptions/wchar_t/9561.cc execution test 27_io/basic_istream/extractors_arithmetic/char/exceptions_badbit_throw.cc execution test 27_io/basic_istream/extractors_arithmetic/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_istream/extractors_other/char/exceptions_failbit_throw.cc execution test 27_io/basic_istream/extractors_other/wchar_t/exceptions_failbit_throw.cc execution test 27_io/basic_istream/seekg/char/exceptions_badbit_throw.cc execution test 27_io/basic_istream/seekg/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_istream/tellg/char/exceptions_badbit_throw.cc execution test 27_io/basic_istream/tellg/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/exceptions/char/9561.cc execution test 27_io/basic_ostream/exceptions/wchar_t/9561.cc execution test 27_io/basic_ostream/flush/char/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/flush/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/inserters_arithmetic/char/9555-oa.cc execution test 27_io/basic_ostream/inserters_arithmetic/char/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/inserters_arithmetic/wchar_t/9555-oa.cc execution test 27_io/basic_ostream/inserters_arithmetic/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/inserters_character/char/9555-oc.cc execution test 27_io/basic_ostream/inserters_character/wchar_t/9555-oc.cc execution test 27_io/basic_ostream/inserters_other/char/exceptions_failbit_throw.cc execution test 27_io/basic_ostream/inserters_other/wchar_t/exceptions_failbit_throw.cc execution test 27_io/basic_ostream/put/char/1.cc execution test 27_io/basic_ostream/put/wchar_t/1.cc execution test 27_io/basic_ostream/seekp/char/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/seekp/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/tellp/char/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/tellp/wchar_t/exceptions_badbit_throw.cc execution test 27_io/basic_ostream/write/char/1.cc execution test 27_io/basic_ostream/write/wchar_t/1.cc execution test 27_io/ios_base/failure/dual_abi.cc execution test 30_threads/async/except.cc execution test ext/enc_filebuf/char/13189.cc execution test ext/enc_filebuf/wchar_t/13189.cc execution test ext/pb_ds/example/hash_illegal_resize.cc execution test ext/vstring/requirements/exception/propagation_consistent.cc execution test libstdc++-prettyprinters/cxx17.cc execution test
And some regressions in g++ too: g++.dg/compat/eh/unexpected1 cp_compat_x_tst.o-cp_compat_y_tst.o execute g++.dg/cpp0x/lambda/lambda-eh2.C -std=gnu++14 execution test g++.dg/cpp0x/nullptr35.C -std=c++14 execution test g++.dg/cpp0x/nullptr35.C -std=c++17 execution test g++.dg/eh/registers1.C -std=gnu++14 execution test g++.dg/eh/registers1.C -std=gnu++17 execution test g++.dg/eh/registers1.C -std=gnu++98 execution test g++.dg/eh/uncaught1.C -std=gnu++14 execution test g++.dg/eh/uncaught1.C -std=gnu++17 execution test g++.dg/eh/uncaught1.C -std=gnu++98 execution test g++.dg/eh/uncaught4.C -std=gnu++14 execution test g++.dg/eh/uncaught4.C -std=gnu++17 execution test g++.dg/eh/uncaught4.C -std=gnu++98 execution test g++.dg/eh/unexpected1.C -std=c++14 execution test g++.dg/eh/unexpected1.C -std=c++98 execution test g++.old-deja/g++.abi/cxa_vec.C -std=gnu++14 execution test g++.old-deja/g++.abi/cxa_vec.C -std=gnu++17 execution test g++.old-deja/g++.abi/cxa_vec.C -std=gnu++98 execution test g++.old-deja/g++.eh/badalloc1.C -std=c++14 execution test g++.old-deja/g++.eh/badalloc1.C -std=c++17 execution test g++.old-deja/g++.eh/badalloc1.C -std=c++98 execution test g++.old-deja/g++.eh/fntry1.C -std=c++14 execution test g++.old-deja/g++.eh/fntry1.C -std=c++17 execution test g++.old-deja/g++.eh/fntry1.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow1.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow1.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow1.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow2.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow2.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow2.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow3.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow3.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow3.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow4.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow4.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow4.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow5.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow5.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow5.C -std=c++98 execution test g++.old-deja/g++.eh/rethrow6.C -std=c++14 execution test g++.old-deja/g++.eh/rethrow6.C -std=c++17 execution test g++.old-deja/g++.eh/rethrow6.C -std=c++98 execution test g++.old-deja/g++.eh/spec2.C -std=c++14 execution test g++.old-deja/g++.eh/spec2.C -std=c++98 execution test g++.old-deja/g++.mike/eh23.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh23.C -std=gnu++17 execution test g++.old-deja/g++.mike/eh23.C -std=gnu++98 execution test g++.old-deja/g++.mike/eh33.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh33.C -std=gnu++98 execution test g++.old-deja/g++.mike/eh39.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh39.C -std=gnu++17 execution test g++.old-deja/g++.mike/eh39.C -std=gnu++98 execution test g++.old-deja/g++.mike/eh40.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh40.C -std=gnu++17 execution test g++.old-deja/g++.mike/eh40.C -std=gnu++98 execution test g++.old-deja/g++.mike/eh50.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh50.C -std=gnu++98 execution test g++.old-deja/g++.mike/eh51.C -std=gnu++14 execution test g++.old-deja/g++.mike/eh51.C -std=gnu++98 execution test g++.old-deja/g++.robertl/eb31.C -std=c++14 execution test g++.old-deja/g++.robertl/eb31.C -std=c++17 execution test g++.old-deja/g++.robertl/eb31.C -std=c++98 execution test
Is something in libstdc++ miscompiled or something in the tests? Like, can you try those tests against libstdc++ built with that change reverted, but test with gcc with that revision in? If it is in libstdc++, can you bisect to a particular *.o file (built by r271012 vs. r271013)? Is it the tree-inline.[ch] change, or the tree-tailcall.c change? For both one needs the tree-ssa-live.[ch] changes, but otherwise the changes are independent (essentially, it could be committed as the tree-ssa-live.[ch] patch, and then either nothing, or tree-inline.[ch], or tree-tailcall.c, or both).
Are you sure about the bisection btw? I've just reverted those changes, rebuilt cc1plus and rebuilt libstdc++ with that, but get still the same failures.
Ah, seems it is libgcc_s.so.1 rather than libstdc++. Bisecting.
The only difference the patch makes that matters for those tests is in unwind-dw2.c, where in _Unwind_Resume_or_Rethrow function there is: - _20 = _Unwind_RaiseException (exc_4(D)); + _20 = _Unwind_RaiseException (exc_4(D)); [tail call] change. This is in: { struct _Unwind_Context this_context, cur_context; _Unwind_Reason_Code code; unsigned long frames; if (exc->private_1 == 0) return _Unwind_RaiseException (exc); ... } where this_context and cur_context variables are indeed addressable variables, but are only first touched in the ... code, so before my change we'd refuse to make _Unwind_RaiseException call into a tail call, but now it is a tail call and I don't see anything that would make it look wrong at the GIMPLE level. If the aarch64 backend can't handle tail calls in functions doing __builtin_eh_return or something similar, then it needs to punt in those special cases.
Created attachment 46332 [details] gcc10-pr59813-aarch64.patch Untested fix. The problem is that after adding sp addition back to the caller's sp in the sibcall epilogue the aarch64 epilogue appends add sp, sp, x4 instruction, which adds a value of a random register to stack pointer (in this case 1) and so the tail called function SIGBUSes. That addition must be done only in the eh epilogue (aarch64 seems to have only !for_sibcall and for_sibcall style epilogues, so one has to hope that a function with __builtin_eh_return doesn't have more than one normal epilogue, i.e. that they are all merged together).
Author: jakub Date: Sat May 11 09:33:22 2019 New Revision: 271093 URL: https://gcc.gnu.org/viewcvs?rev=271093&root=gcc&view=rev Log: PR c++/59813 * config/aarch64/aarch64.c (aarch64_expand_epilogue): Don't add EH_RETURN_STACKADJ_RTX to sp in sibcall epilogues. Modified: trunk/gcc/ChangeLog trunk/gcc/config/aarch64/aarch64.c
Author: jakub Date: Mon May 20 21:33:46 2019 New Revision: 271440 URL: https://gcc.gnu.org/viewcvs?rev=271440&root=gcc&view=rev Log: PR c++/59813 PR target/90418 * function.h (struct function): Add calls_eh_return member. * gimplify.c (gimplify_call_expr): Set cfun->calls_eh_return when gimplifying __builtin_eh_return call. * tree-inline.c (initialize_cfun): Copy calls_eh_return from src_cfun to cfun. (expand_call_inline): Or in src_cfun->calls_eh_return into dst_cfun->calls_eh_return. * tree-tailcall.c (suitable_for_tail_call_opt_p): Return false if cfun->calls_eh_return. * lto-streamer-in.c (input_struct_function_base): Read calls_eh_return. * lto-streamer-out.c (output_struct_function_base): Write calls_eh_return. Modified: trunk/gcc/ChangeLog trunk/gcc/function.h trunk/gcc/gimplify.c trunk/gcc/lto-streamer-in.c trunk/gcc/lto-streamer-out.c trunk/gcc/tree-inline.c trunk/gcc/tree-tailcall.c