Bug 50078 - [4.6 Regression] combine wrong code: volatile accesses optimized out
Summary: [4.6 Regression] combine wrong code: volatile accesses optimized out
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.6.2
: P2 normal
Target Milestone: 4.6.3
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2011-08-13 23:02 UTC by Krzysztof Halasa
Modified: 2012-06-11 09:51 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.5.3
Known to fail:
Last reconfirmed: 2011-08-14 00:00:00


Attachments
gcc47-pr50078.patch (955 bytes, patch)
2011-11-28 09:12 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Krzysztof Halasa 2011-08-13 23:02:33 UTC
I think armeb-pc-linux-gnueabi-gcc 4.6.2pre 20110813 (cross-compiled on x86-64) is a bit too optimistic with -O2:

unsigned var[2];

void test(int arg)
{
	unsigned v = *(volatile unsigned *)(&var[arg]);
	*(volatile unsigned *)(&var[arg]) = v;
}

produces:
00000000 <test>:
   0:   e12fff1e        bx      lr
Comment 1 Mikael Pettersson 2011-08-14 09:22:01 UTC
Not ARM-specific, also happens on i686-linux.
Comment 2 Richard Biener 2011-08-14 09:37:26 UTC
Mine.

          /* If the statement is a scalar store, see if the expression
             has the same value number as its rhs.  If so, the store is
             dead.  */
          else if (gimple_assign_single_p (stmt)
                   && !is_gimple_reg (gimple_assign_lhs (stmt))
                   && (TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
                       || is_gimple_min_invariant (gimple_assign_rhs1 (stmt))))

needs a && !gimple_has_volatile_ops (stmt).
Comment 3 Richard Biener 2011-08-15 09:11:50 UTC
Actually I was wrong guessing - the tree level is fine, it is combine that
removes the "noop" move completely:

Trying 6 -> 11:
Failed to match this instruction:
(set (mem/s:SI (plus:DI (ashiftrt:DI (mult:DI (subreg:DI (reg/v:SI 60 [ arg ]) 0)
                    (const_int 4294967296 [0x100000000]))
                (const_int 30 [0x1e]))
            (symbol_ref:DI ("var")  <var_decl 0x7ffff7ee3140 var>)) [2 var S4 A32])
    (mem/s/v:SI (plus:DI (ashiftrt:DI (mult:DI (subreg:DI (reg/v:SI 60 [ arg ]) 0)
                    (const_int 4294967296 [0x100000000]))
                (const_int 30 [0x1e]))
            (symbol_ref:DI ("var")  <var_decl 0x7ffff7ee3140 var>)) [2 var S4 A32]))
rejecting combination of insns 6 and 11
original costs 4 + 9 = 13
replacement cost 31
deleting noop move 11

Confirmed on x86_64 as well.
Comment 4 Richard Biener 2011-08-15 09:13:40 UTC
noop_move_p returns true for this - ignoding the side-effects.
Comment 5 Steven Bosscher 2011-09-11 15:14:19 UTC
Actually it is not noop_move_p that's at fault here, but the disgusting hack for NOOP_MOVE_INSN_CODE. The insn is marked as a NOOP_MOVE somewhere else in combine.
Comment 6 Steven Bosscher 2011-09-11 15:22:36 UTC
int
set_noop_p (const_rtx set)
{
  rtx src = SET_SRC (set);
  rtx dst = SET_DEST (set);

  if (dst == pc_rtx && src == pc_rtx)
    return 1;

  if (MEM_P (dst) && MEM_P (src))
    return rtx_equal_p (dst, src) && !side_effects_p (dst);

Note there is no check on side_effects_p(src).

Breakpoint 8, set_noop_p (set=0x7ffff70a6d98) at ../../trunk/gcc/rtlanal.c:1094
1094	  rtx src = SET_SRC (set);
(gdb) step
1095	  rtx dst = SET_DEST (set);
(gdb) next
1097	  if (dst == pc_rtx && src == pc_rtx)
(gdb) p debug_rtx(dst)
(mem/s:SI (plus:DI (mult:DI (reg:DI 61 [ arg ])
            (const_int 4 [0x4]))
        (symbol_ref:DI ("var") <var_decl 0x7ffff7eb0140 var>)) [2 var S4 A32])
$8 = void
(gdb) p debug_rtx(src)
(mem/s/v:SI (plus:DI (mult:DI (reg:DI 61 [ arg ])
            (const_int 4 [0x4]))
        (symbol_ref:DI ("var") <var_decl 0x7ffff7eb0140 var>)) [2 var S4 A32])
$9 = void
(gdb) step
1100	  if (MEM_P (dst) && MEM_P (src))
(gdb) p side_effects_p(src)
$11 = 1
(gdb) step
1101	    return rtx_equal_p (dst, src) && !side_effects_p (dst);
(gdb) p rtx_equal_p (dst, src)
$12 = 1
(gdb) p side_effects_p(dst)
$10 = 0
(gdb) 

Note that dst is not a volatile MEM for some reason.
Comment 7 Steven Bosscher 2011-09-11 15:31:35 UTC
Comes from SSA expand => Matz
Comment 8 Steven Bosscher 2011-09-11 15:47:21 UTC
(In reply to comment #7)
> Comes from SSA expand => Matz

Comes from SSA expand because it is already wrong in the .expand dump:
;; MEM[(volatile unsigned int *)&var][arg_1(D)] ={v} v_2;

(insn 9 8 10 (set (reg:DI 63)
        (sign_extend:DI (reg/v:SI 60 [ argD.1604 ]))) t.c:6 -1
     (nil))

(insn 10 9 11 (set (reg/f:DI 64)
        (symbol_ref:DI ("var") <var_decl 0x7f4b8054f140 var>)) t.c:6 -1
     (nil))

(insn 11 10 0 (set (mem/s:SI (plus:DI (mult:DI (reg:DI 63)
                    (const_int 4 [0x4]))
                (reg/f:DI 64)) [2 varD.1603 S4 A32])
        (reg/v:SI 59 [ vD.1607 ])) t.c:6 -1
     (nil))

It seems to me that the MEM in insn 11 should be mem/s/v.
Comment 9 Jakub Jelinek 2011-10-26 17:13:43 UTC
GCC 4.6.2 is being released.
Comment 10 Jakub Jelinek 2011-11-28 09:12:39 UTC
Created attachment 25929 [details]
gcc47-pr50078.patch

IMHO it is a forwprop bug, which changes a MEM_REF in this case into an ARRAY_REF (with MEM_REF operand), but copies TREE_THIS_VOLATILE from the original MEM_REF not to the ARRAY_REF, but to the inner MEM_REF.
Comment 11 Jakub Jelinek 2011-11-28 21:03:16 UTC
Author: jakub
Date: Mon Nov 28 21:03:11 2011
New Revision: 181786

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181786
Log:
	PR tree-optimization/50078
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
	TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

	* gcc.dg/pr50078.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-forwprop.c
Comment 12 Jakub Jelinek 2011-11-28 21:09:57 UTC
Fixed on the trunk so far.
Comment 13 Jakub Jelinek 2011-12-09 11:32:40 UTC
Author: jakub
Date: Fri Dec  9 11:32:35 2011
New Revision: 182157

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182157
Log:
	Backport from mainline
	2011-12-08  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/51466
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Also copy
	TREE_SIDE_EFFECTS.

	* gcc.c-torture/execute/pr51466.c: New test.

	2011-11-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/50078
	* tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
	TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

	* gcc.dg/pr50078.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/pr51466.c
    branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-ssa-forwprop.c
Comment 14 Mikael Pettersson 2011-12-10 16:24:38 UTC
The original test case now compiles correctly for me with gcc-4.6-20111209 for both i686-linux and armv5tel-linux-gnueabi targets.
Comment 15 Jakub Jelinek 2011-12-12 17:33:55 UTC
Fixed.
Comment 16 xuepeng guo 2012-06-11 09:51:12 UTC
Author: xguo
Date: Mon Jun 11 09:51:05 2012
New Revision: 188383

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188383
Log:
2012-06-11  Terry Guo  <terry.guo@arm.com>

        Backport from mainline
        2011-12-08  Jakub Jelinek  <jakub@redhat.com>

        PR tree-optimization/51466
        * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Also copy
        TREE_SIDE_EFFECTS.

        2011-11-28  Jakub Jelinek  <jakub@redhat.com>

        PR tree-optimization/50078
        * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Copy over
        TREE_THIS_VOLATILE also from the old to new lhs resp. rhs.

Added:
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.c-torture/execute/pr51466.c
    branches/ARM/embedded-4_6-branch/gcc/testsuite/gcc.dg/pr50078.c
Modified:
    branches/ARM/embedded-4_6-branch/gcc/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/testsuite/ChangeLog.arm
    branches/ARM/embedded-4_6-branch/gcc/tree-ssa-forwprop.c