I think armeb-pc-linux-gnueabi-gcc 4.6.2pre 20110813 (cross-compiled on x86-64) is a bit too optimistic with -O2:

unsigned var[2];

void test(int arg)
{
    unsigned v = *(volatile unsigned *)(&var[arg]);
    *(volatile unsigned *)(&var[arg]) = v;
}

produces:

00000000 <test>:
   0:   e12fff1e    bx  lr
Not ARM-specific, also happens on i686-linux.
Mine.

  /* If the statement is a scalar store, see if the expression
     has the same value number as its rhs.  If so, the store is
     dead.  */
  else if (gimple_assign_single_p (stmt)
           && !is_gimple_reg (gimple_assign_lhs (stmt))
           && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
               || is_gimple_min_invariant (gimple_assign_rhs1 (stmt))))

needs a && !gimple_has_volatile_ops (stmt).
Actually my guess was wrong - the tree level is fine, it is combine that removes the "noop" move completely:

Trying 6 -> 11:
Failed to match this instruction:
(set (mem/s:SI (plus:DI (ashiftrt:DI (mult:DI (subreg:DI (reg/v:SI 60 [ arg ]) 0)
                    (const_int 4294967296 [0x100000000]))
                (const_int 30 [0x1e]))
            (symbol_ref:DI ("var") <var_decl 0x7ffff7ee3140 var>)) [2 var S4 A32])
    (mem/s/v:SI (plus:DI (ashiftrt:DI (mult:DI (subreg:DI (reg/v:SI 60 [ arg ]) 0)
                    (const_int 4294967296 [0x100000000]))
                (const_int 30 [0x1e]))
            (symbol_ref:DI ("var") <var_decl 0x7ffff7ee3140 var>)) [2 var S4 A32]))
rejecting combination of insns 6 and 11
original costs 4 + 9 = 13
replacement cost 31
deleting noop move 11

Confirmed on x86_64 as well.
noop_move_p returns true for this - ignoring the side-effects.
Actually it is not noop_move_p that's at fault here, but the disgusting hack for NOOP_MOVE_INSN_CODE. The insn is marked as a NOOP_MOVE somewhere else in combine.
int
set_noop_p (const_rtx set)
{
  rtx src = SET_SRC (set);
  rtx dst = SET_DEST (set);

  if (dst == pc_rtx && src == pc_rtx)
    return 1;

  if (MEM_P (dst) && MEM_P (src))
    return rtx_equal_p (dst, src) && !side_effects_p (dst);

Note there is no check on side_effects_p(src).

Breakpoint 8, set_noop_p (set=0x7ffff70a6d98) at ../../trunk/gcc/rtlanal.c:1094
1094      rtx src = SET_SRC (set);
(gdb) step
1095      rtx dst = SET_DEST (set);
(gdb) next
1097      if (dst == pc_rtx && src == pc_rtx)
(gdb) p debug_rtx(dst)
(mem/s:SI (plus:DI (mult:DI (reg:DI 61 [ arg ])
            (const_int 4 [0x4]))
        (symbol_ref:DI ("var") <var_decl 0x7ffff7eb0140 var>)) [2 var S4 A32])
$8 = void
(gdb) p debug_rtx(src)
(mem/s/v:SI (plus:DI (mult:DI (reg:DI 61 [ arg ])
            (const_int 4 [0x4]))
        (symbol_ref:DI ("var") <var_decl 0x7ffff7eb0140 var>)) [2 var S4 A32])
$9 = void
(gdb) step
1100      if (MEM_P (dst) && MEM_P (src))
(gdb) p side_effects_p(src)
$11 = 1
(gdb) step
1101        return rtx_equal_p (dst, src) && !side_effects_p (dst);
(gdb) p rtx_equal_p (dst, src)
$12 = 1
(gdb) p side_effects_p(dst)
$10 = 0
(gdb)

Note that dst is not a volatile MEM for some reason.
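To make the observation about the MEM/MEM test concrete, here is a toy model of it. This is only a sketch: `struct toy_mem`, `toy_equal_p`, `toy_noop_dst_only`, `toy_noop_both` and `toy_demo` are hypothetical names invented for illustration, not GCC code. The model assumes, as the gdb session shows, that the equality test succeeds even though only one of the two MEMs carries /v.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an RTL MEM.  Only the address identity and
   the volatile (/v) flag are modeled; this is not GCC's rtx type.  */
struct toy_mem {
    int address_id;    /* equal ids mean "same address expression" */
    bool is_volatile;  /* models the /v flag seen in the dumps */
};

/* Models the comparison as observed in the gdb session: the two MEMs
   compare equal although only src is volatile.  */
bool toy_equal_p(const struct toy_mem *a, const struct toy_mem *b)
{
    return a->address_id == b->address_id;
}

/* Same shape as the quoted MEM/MEM test: only dst is checked.  */
bool toy_noop_dst_only(const struct toy_mem *dst, const struct toy_mem *src)
{
    return toy_equal_p(dst, src) && !dst->is_volatile;
}

/* Assumed variant that also refuses a volatile src.  */
bool toy_noop_both(const struct toy_mem *dst, const struct toy_mem *src)
{
    return toy_equal_p(dst, src) && !dst->is_volatile && !src->is_volatile;
}

/* Replays the situation from the session: dst lost /v, src kept it.  */
void toy_demo(void)
{
    struct toy_mem dst = { 1, false };
    struct toy_mem src = { 1, true };

    assert(toy_noop_dst_only(&dst, &src));  /* store looks like a no-op */
    assert(!toy_noop_both(&dst, &src));     /* extra check keeps the store */
}
```

In this model either a check on src or a correctly flagged dst is enough to keep the store. The session points at the second problem: dst should have been volatile in the first place.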
Comes from SSA expand => Matz
(In reply to comment #7)
> Comes from SSA expand => Matz

Comes from SSA expand because it is already wrong in the .expand dump:

;; MEM[(volatile unsigned int *)&var][arg_1(D)] ={v} v_2;

(insn 9 8 10 (set (reg:DI 63)
        (sign_extend:DI (reg/v:SI 60 [ argD.1604 ]))) t.c:6 -1
     (nil))

(insn 10 9 11 (set (reg/f:DI 64)
        (symbol_ref:DI ("var") <var_decl 0x7f4b8054f140 var>)) t.c:6 -1
     (nil))

(insn 11 10 0 (set (mem/s:SI (plus:DI (mult:DI (reg:DI 63)
                    (const_int 4 [0x4]))
                (reg/f:DI 64)) [2 varD.1603 S4 A32])
        (reg/v:SI 59 [ vD.1607 ])) t.c:6 -1
     (nil))

It seems to me that the MEM in insn 11 should be mem/s/v.
GCC 4.6.2 is being released.
Created attachment 25929
gcc47-pr50078.patch

IMHO it is a forwprop bug: in this case forwprop changes a MEM_REF into an ARRAY_REF (with a MEM_REF operand), but copies TREE_THIS_VOLATILE from the original MEM_REF to the inner MEM_REF rather than to the ARRAY_REF.
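The flag-copying mistake described here can be sketched with a toy reference tree. Everything below is hypothetical (`struct toy_ref` and the `toy_*` functions are not GCC's tree API), and it rests on a simplifying assumption: only the flag on the outermost reference decides whether the access is treated as volatile later on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference tree; not GCC's tree type.  */
struct toy_ref {
    const char *code;         /* "MEM_REF" or "ARRAY_REF", illustration only */
    bool this_volatile;       /* models TREE_THIS_VOLATILE */
    struct toy_ref *operand;  /* inner reference, or NULL */
};

/* Assumption of this sketch: later passes look only at the outermost
   node to decide whether the access is volatile.  */
bool toy_access_is_volatile(const struct toy_ref *outer)
{
    return outer->this_volatile;
}

/* Behavior described in the comment: the flag from the old MEM_REF
   lands on the inner MEM_REF only.  */
void toy_rewrite_inner_only(const struct toy_ref *old_ref,
                            struct toy_ref *new_outer)
{
    new_outer->operand->this_volatile = old_ref->this_volatile;
}

/* One reading of the intended behavior: the flag is also copied to
   the new outer reference.  */
void toy_rewrite_copy_to_outer(const struct toy_ref *old_ref,
                               struct toy_ref *new_outer)
{
    new_outer->operand->this_volatile = old_ref->this_volatile;
    new_outer->this_volatile = old_ref->this_volatile;
}

void toy_forwprop_demo(void)
{
    struct toy_ref old_ref = { "MEM_REF", true, NULL };
    struct toy_ref inner = { "MEM_REF", false, NULL };
    struct toy_ref outer = { "ARRAY_REF", false, &inner };

    toy_rewrite_inner_only(&old_ref, &outer);
    assert(inner.this_volatile);              /* flag is on the inner node */
    assert(!toy_access_is_volatile(&outer));  /* but the access lost it */

    toy_rewrite_copy_to_outer(&old_ref, &outer);
    assert(toy_access_is_volatile(&outer));   /* volatility preserved */
}
```

Under that assumption, the inner-only copy would account for a store whose MEM is missing /v after expand.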
Author: jakub
Date: Mon Nov 28 21:03:11 2011
New Revision: 181786

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181786
Log:
	PR tree-optimization/50078
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
	TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

	* gcc.dg/pr50078.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-forwprop.c
Fixed on the trunk so far.
Author: jakub
Date: Fri Dec 9 11:32:35 2011
New Revision: 182157

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182157
Log:
	Backport from mainline
	2011-12-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/51466
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Also copy
	TREE_SIDE_EFFECTS.

	* gcc.c-torture/execute/pr51466.c: New test.

	2011-11-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/50078
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
	TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

	* gcc.dg/pr50078.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/pr51466.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-ssa-forwprop.c
The original test case now compiles correctly for me with gcc-4.6-20111209 for both i686-linux and armv5tel-linux-gnueabi targets.
Fixed.
Author: xguo
Date: Mon Jun 11 09:51:05 2012
New Revision: 188383

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188383
Log:
2012-06-11  Terry Guo  <terry.guo@arm.com>

	Backport from mainline
	2011-12-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/51466
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Also copy
	TREE_SIDE_EFFECTS.

	2011-11-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/50078
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
	TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

Added:
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.c-torture/execute/pr51466.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/tree-ssa-forwprop.c