Summary: | Inline break tail-call optimization | | |
---|---|---|---|
Product: | gcc | Reporter: | Aso Renji <asorenji> |
Component: | tree-optimization | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | | |
Severity: | normal | CC: | dimhen |
Priority: | P3 | Keywords: | missed-optimization |
Version: | 6.3.0 | | |
Target Milestone: | 10.0 | | |
Host: | | Target: | |
Build: | | Known to work: | |
Known to fail: | | Last reconfirmed: | 2019-07-23 00:00:00 |
Description
Aso Renji
2018-06-04 20:06:12 UTC
Indeed. The testcase is easier to analyze if you replace iostreams with a plain printf. The key element seems to be taking the address of a local variable.

Confirmed. It is because the address of 'value' escapes, which is easier to see when you instead do

static void __attribute__((noinline)) stackConsume(const char*stack){
  char value;
  __builtin_putchar ((long)&value);
}

this then causes us to hit

  /* Make sure the tail invocation of this function does not indirectly
     refer to local variables.  (Passing variables directly by value
     is OK.)  */
  FOR_EACH_LOCAL_DECL (cfun, idx, var)
    {
      if (TREE_CODE (var) != PARM_DECL
          && auto_var_in_fn_p (var, cfun->decl)
          && may_be_aliased (var)
          && (ref_maybe_used_by_stmt_p (call, var)
              || call_may_clobber_ref_p (call, var)))
        return;
    }

where we think that the recursive call might possibly access that very escaped stack slot (for example via a global pointer). We do not have sophisticated enough analysis to fix that easily.

Works for me on gcc 9.1.0. I compile it as:

$ g++ test.cpp -o a -O2

And then running `./a` results in a bunch of:

Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes
Consumed 80 bytes

(In reply to Konstantin Kharlamov from comment #4)
> Works for me on gcc 9.1.0. I compile it as:
>
> $ g++ test.cpp -o a -O2
>
> And then running `./a` results in a bunch of:
>
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes
> Consumed 80 bytes

Just tested with version 8.3.0 on the other PC, same there, i.e. stack space does not increase when built with -O2. So this has been fixed at least since 8.3.0.

(In reply to Konstantin Kharlamov from comment #5)
> Just tested with version 8.3.0 on the other PC, same there, i.e. stack space
> does not increase when built with -O2. So this has been fixed at least since
> 8.3.0.

Tested _after_ removing __attribute__((noinline)) from the code?

g++ --version
g++ (Debian 8.3.0-6) 8.3.0

The bug is still present for me. Change "static __attribute__((noinline)) void" to "static void" and:

Consumed 8384063 bytes
Consumed 8384127 bytes
Consumed 8384191 bytes
Consumed 8384255 bytes
Consumed 8384319 bytes
Consumed 8384383 bytes
Consumed 8384447 bytes
Segmentation fault

(In reply to Aso Renji from comment #6)
> (In reply to Konstantin Kharlamov from comment #5)
> > Just tested with version 8.3.0 on the other PC, same there, i.e. stack space
> > does not increase when built with -O2. So this has been fixed at least since
> > 8.3.0.
>
> Tested _after_ removing __attribute__((noinline)) from the code?
>
> g++ --version
> g++ (Debian 8.3.0-6) 8.3.0
>
> The bug is still present for me. Change "static __attribute__((noinline)) void"
> to "static void" and:
>
> Consumed 8384063 bytes
> Consumed 8384127 bytes
> Consumed 8384191 bytes
> Consumed 8384255 bytes
> Consumed 8384319 bytes
> Consumed 8384383 bytes
> Consumed 8384447 bytes
> Segmentation fault

Oh, sorry, I didn't notice the testcase requires modification to start crashing. Yeah, it crashes with 9.1.0 too then.

(In reply to Konstantin Kharlamov from comment #7)
> Oh, sorry, I didn't notice the testcase requires modification to start
> crashing. Yeah, it crashes with 9.1.0 too then.

It's unhelpful to post a reproducer for a bug that doesn't reproduce the bug. The code provided should demonstrate the bug, without requiring changes.

In the case of removing the noinline, GCC 10+ is able to tail call this function just fine and we don't get an overall stack increase.
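For readers who want to observe the effect discussed in this thread without running into a segmentation fault, here is a minimal bounded sketch. It is not the reporter's original testcase, which is not reproduced above. The helper `stackDistance`, the extra parameters `remaining`, `visited` and `maxSeen`, and the wrapper `reportStackUse` are hypothetical names introduced only for illustration. The sketch keeps the two elements the comments identify as relevant: a local variable whose address is taken, and a recursive call in tail position.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helper (not from the report): absolute distance in bytes
// between two addresses, computed on integers so that no pointer
// subtraction across unrelated objects is needed.
std::uintptr_t stackDistance(const char *a, const char *b) {
    const auto x = reinterpret_cast<std::uintptr_t>(a);
    const auto y = reinterpret_cast<std::uintptr_t>(b);
    return x > y ? x - y : y - x;
}

// Bounded variant of the recursive pattern discussed in the thread.
// 'remaining' limits the recursion depth, 'visited' counts the calls made
// so far, and '*maxSeen' records the largest distance from 'stackTop'.
int stackConsume(const char *stackTop, int remaining, int visited,
                 std::uintptr_t *maxSeen) {
    assert(remaining >= 0);
    char value = 0;  // local whose address is taken, as in the report
    const std::uintptr_t used = stackDistance(stackTop, &value);
    if (used > *maxSeen) {
        *maxSeen = used;
    }
    if (remaining == 0) {
        return visited + 1;  // total number of calls, including this one
    }
    // Recursive call in tail position: nothing is left to do afterwards.
    return stackConsume(stackTop, remaining - 1, visited + 1, maxSeen);
}

// Runs the probe and prints a line in the "Consumed N bytes" style.
std::uintptr_t reportStackUse(int depth) {
    char top = 0;  // reference point near the caller's frame
    std::uintptr_t maxSeen = 0;
    const int frames = stackConsume(&top, depth, 0, &maxSeen);
    std::printf("Consumed %llu bytes over %d calls\n",
                static_cast<unsigned long long>(maxSeen), frames);
    return maxSeen;
}
```

Calling `reportStackUse(1000)` from `main` and comparing a build without optimization against a `-O2` build should show whether the recursion was turned into a loop: the reported byte count stays small when it was, and grows with the depth when it was not. The exact numbers, and whether the optimization fires at all, depend on the compiler version and flags, so none are given here.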