Created attachment 42816 [details] reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O2 -fno-tree-dce -mno-push-args testcase.c $ ./a.out Segmentation fault foo() disassembly: foo: sub rsp, 16 pxor xmm0, xmm0 movaps XMMWORD PTR [rsp], xmm0 <== crashes HERE pop rax mov rax, QWORD PTR g[rip] pop rdx ret According to the ABI, rsp is aligned to 16 bytes before the function call; thus, rsp % 16 == 0 at the point of crash. Also, I do not understand why bar() is apparently inlined, when the function has noinline,noclone attributes (noipa helps; -fno-ipa-pure-const helps too); but "noipa" implies just noinline,noclone,no_icf according to https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes
(In reply to Zdenek Sojka from comment #0) > Also, I do not understand why bar() is apparently inlined, when the function bar isn't inlined. Instead ipa-pure-const determines it is a const function and because nothing really needs its return value, it is just not called at all. > has noinline,noclone attributes (noipa helps; -fno-ipa-pure-const helps > too); but "noipa" implies just noinline,noclone,no_icf according to > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common- > Function-Attributes No, it doesn't say anything like that. It says it implies those attributes, but is not a mere union of those 3 attributes, but disables many other IPA optimizes between that function and the callers. E.g. mere noinline, noclone, no_icf doesn't disable IPA VRP, or IPA bitwise CCP, etc.
Add -mno-stv fixes the crash.
RTL expand is correct: ;; _2 = bar (0, 0, 0, 0, 0); (insn 6 5 7 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int -16 [0xfffffffffffffff0]))) (clobber (reg:CC 17 flags)) ]) "x.i":17 -1 (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) (nil))) (insn 7 6 8 (set (reg:DI 90) (reg/f:DI 84 virtual-outgoing-args)) "x.i":17 -1 (nil)) (insn 8 7 9 (set (reg:DI 91) (const_int 0 [0])) "x.i":17 -1 (nil)) (insn 9 8 10 (set (reg:DI 92) (const_int 0 [0])) "x.i":17 -1 (nil)) (insn 10 9 11 (set (mem:TI (reg:DI 90) [0 S16 A128]) (const_int 0 [0])) "x.i":17 -1 (nil)) But vregs generates: (insn 5 2 6 2 (set (reg:DI 87 [ v.0_1 ]) (mem/c:DI (symbol_ref:DI ("v") [flags 0x2] <var_decl 0x7f520f0ddab0 v>) [1 v+0 S8 A64])) "x.i":17 85 {*movdi_internal} (nil)) (insn 6 5 7 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int -16 [0xfffffffffffffff0]))) (clobber (reg:CC 17 flags)) ]) "x.i":17 222 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) (nil))) (insn 7 6 8 2 (set (reg:DI 90) (reg/f:DI 7 sp)) "x.i":17 85 {*movdi_internal} (nil)) (insn 8 7 9 2 (set (reg:DI 91) (const_int 0 [0])) "x.i":17 85 {*movdi_internal} (nil)) (insn 9 8 10 2 (set (reg:DI 92) (const_int 0 [0])) "x.i":17 85 {*movdi_internal} (nil)) (insn 10 9 11 2 (set (mem:TI (reg:DI 90) [0 S16 A128]) <<<< The alignment is wrong. (const_int 0 [0])) "x.i":17 84 {*movti_internal} (nil))
call expand is wrong: (insn 6 5 7 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int -16 [0xfffffffffffffff0]))) (clobber (reg:CC 17 flags)) ]) "x.i":17 222 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) (nil))) doesn't count the alignment requirement later for (insn 10 9 11 (set (mem:TI (reg:DI 90) [0 S16 A128]) (const_int 0 [0])) "x.i":17 -1 (nil))
call expand is OK. pro_and_epilogue doesn't adjust outgoing stack to properly align stack for (insn 27 10 17 2 (set (mem:V1TI (reg/f:DI 7 sp) [0 S16 A128]) (reg:V1TI 21 xmm0 [95])) "x.i":17 1258 {movv1ti_internal} (nil)) since ud_dce removes the call to bar.
(In reply to Jakub Jelinek from comment #1) > (In reply to Zdenek Sojka from comment #0) > > Also, I do not understand why bar() is apparently inlined, when the function > > bar isn't inlined. Instead ipa-pure-const determines it is a const function > and because nothing really needs its return value, it is just not called at > all. Thank you for the explanation. That also explains why -fno-ipa-pure-const prevents removing the call. > > > has noinline,noclone attributes (noipa helps; -fno-ipa-pure-const helps > > too); but "noipa" implies just noinline,noclone,no_icf according to > > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common- > > Function-Attributes > > No, it doesn't say anything like that. It says it implies those attributes, > but is not a mere union of those 3 attributes, but disables many other IPA > optimizes between that function and the callers. > E.g. mere noinline, noclone, no_icf doesn't disable IPA VRP, or IPA bitwise > CCP, etc. I am sorry, I totally misread the description, so it had the very opposite meaning. Thank you for the comment.
Author: hjl Date: Thu Jan 11 20:44:46 2018 New Revision: 256555 URL: https://gcc.gnu.org/viewcvs?rev=256555&root=gcc&view=rev Log: i386: Align stack frame if argument is passed on stack When a function call is removed, it may become a leaf function. But if argument may be passed on stack, we need to align the stack frame when there is no tail call. Tested on Linux/i686 and Linux/x86-64. gcc/ PR target/83330 * config/i386/i386.c (ix86_compute_frame_layout): Align stack frame if argument is passed on stack. gcc/testsuite/ PR target/83330 * gcc.target/i386/pr83330.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr83330.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.c trunk/gcc/testsuite/ChangeLog
Author: hjl Date: Mon Jan 15 16:13:23 2018 New Revision: 256703 URL: https://gcc.gnu.org/viewcvs?rev=256703&root=gcc&view=rev Log: i386: Align stack frame if argument is passed on stack When a function call is removed, it may become a leaf function. But if argument may be passed on stack, we need to align the stack frame when there is no tail call. Tested on Linux/i686 and Linux/x86-64. gcc/ Backport from mainline PR target/83330 * config/i386/i386.c (ix86_function_arg_advance): Set outgoing_args_on_stack to true if there are outgoing arguments on stack. (ix86_function_arg): Likewise. (ix86_compute_frame_layout): Align stack frame if argument is passed on stack. * config/i386/i386.h (machine_function): Add outgoing_args_on_stack. gcc/testsuite/ Backport from mainline PR target/83330 * gcc.target/i386/pr83330.c: New test. Added: branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr83330.c Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/i386/i386.c branches/gcc-7-branch/gcc/config/i386/i386.h branches/gcc-7-branch/gcc/testsuite/ChangeLog
Fixed for GCC 8 and 7.3