Bug 24804 - [3.4 Regression] Produces wrong code
Summary: [3.4 Regression] Produces wrong code
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 3.4.4
Importance: P3 critical
Target Milestone: 4.0.0
Assignee: Not yet assigned to anyone
Keywords: monitored, wrong-code
Duplicates: 24812
Depends on:
Reported: 2005-11-11 17:15 UTC by Alexander Bottema
Modified: 2006-03-08 23:40 UTC

See Also:
Host: ix86-linux
Target: ix86-linux
Build: ix86-linux
Known to work: 3.4.0 4.1.0 3.3.3
Known to fail: 3.4.5
Last reconfirmed: 2005-11-11 17:48:26


Description Alexander Bottema 2005-11-11 17:15:04 UTC
Compile the following program (minimized from a larger context) with
'g++ -O3 -fno-strict-aliasing foo.cpp' (on a Linux i386 32-bit machine
with gcc 3.4.4):

class DummyType
{
public:
    inline DummyType() { }
    inline ~DummyType() { }
};

class Foo
{
public:
    Foo() : X0(0), X4(0) { }

    int X0, X1, X2, X3, X4;
};

int main()
{
    {
        Foo f;
        DummyType();
    }

    Foo f2;
    while (1) {
        if (f2.X4 != 0) {
            f2.X4 = 0;
        } else {
            break;
        }
    }

    return 0;
}
It will hang if you execute it.

It does not hang if you do _one_ of the following things:

- Compile with -O1 (instead of -O3)
- Remove compiler flag -fno-strict-aliasing
- Replace all occurrences of 'f2.X4' with 'f2.X3'
- Remove the statement 'DummyType();'
- Remove the statement 'Foo f;'
Comment 1 Andrew Pinski 2005-11-11 17:41:15 UTC
Can you give the output of "gcc -v"?
Comment 2 Andrew Pinski 2005-11-11 17:48:26 UTC
Confirmed, only a 3.4 regression.
Comment 3 Andrew Pinski 2005-11-11 20:56:56 UTC
*** Bug 24812 has been marked as a duplicate of this bug. ***
Comment 4 Jim Wilson 2005-11-29 03:39:50 UTC
The failure happens in store_motion in gcse.c.

We have two objects on the stack with disjoint lifetimes whose stack slots overlap.  Because they come from different objects, the mems have different MEM_EXPRs, and some of them also have different alias sets.
(insn 34 3 37 0 (set (mem/s/j:SI (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffffffffffe0])) [0 <variable>.X0+0 S4 A32])
        (const_int 0 [0x0])) 43 {*movsi_1_nointernunit} (nil)
(insn 92 89 99 0 (set (mem/s/j:SI (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffffffffffe0])) [0 <variable>.X4+0 S4 A32])
        (const_int 0 [0x0])) 43 {*movsi_1_nointernunit} (nil)
(insn 116 162 117 2 (set (mem/s/j:SI (plus:SI (reg/f:SI 20 frame)
                (const_int -32 [0xffffffffffffffe0])) [0 f2.X4+0 S4 A128])
        (const_int 0 [0x0])) 43 {*movsi_1_nointernunit} (nil)

This in itself is fairly harmless.  However, a problem occurs when we try to keep track of mems.  We call ldst_entry, which computes a hash code that is identical for the two mems, and then puts them into the same ls_expr structure.  This ls_expr structure only holds one mem rtx, which means the aliasing info is now wrong for the other mem rtx.  Eventually we call true_dependence with a read for the other mem, and it decides that they can't alias because of the differing MEM_EXPRs.
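
The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GCC's code: Mem, LsExpr, LdstTable and may_conflict are hypothetical stand-ins for a mem rtx, the ls_expr record, the table behind ldst_entry, and a much-simplified true_dependence check. The lookup key is just the address, to mimic a hash that ignores the aliasing annotations.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Toy stand-in for a mem rtx (hypothetical, not GCC's representation).
struct Mem {
    long address;      // frame offset, e.g. -32
    const void* expr;  // stands in for MEM_EXPR: which object the mem names
    int alias_set;     // stands in for the alias set
};

// Toy stand-in for ls_expr: one record per key, holding a single mem.
struct LsExpr {
    Mem pattern;         // the only mem the record remembers
    std::size_t stores;  // how many mems were folded into this record
};

// Toy stand-in for the table behind ldst_entry. The key is the address
// alone, so mems that differ only in expr/alias_set share a record.
class LdstTable {
public:
    LsExpr& entry(const Mem& m) {
        auto it = table_.find(m.address);
        if (it == table_.end())
            it = table_.emplace(m.address, LsExpr{m, 0}).first;  // first mem wins
        assert(it->second.pattern.address == m.address);
        ++it->second.stores;
        return it->second;
    }

private:
    std::unordered_map<long, LsExpr> table_;
};

// Much-simplified dependence check: two mems at the same address are
// assumed not to conflict when both name different objects.
inline bool may_conflict(const Mem& a, const Mem& b) {
    if (a.address != b.address) return false;
    if (a.expr && b.expr && a.expr != b.expr) return false;
    return true;
}
```

In this model, two stores to offset -32 that name different objects both land in the same record, and the record keeps only the first object's expr. Checking a read of the second object against the record's pattern then reports no conflict, even though the read and the second store touch the same stack slot.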

It appears that the solution here is to somehow combine the aliasing info when putting multiple mems into a single ls_expr structure.  If we put two MEMs with differing MEM_EXPRs into the same ls_expr structure, then we should create a new mem with a cleared MEM_EXPR field.  Similarly, if we have two MEMs with different alias sets, then we may need to say that they can alias anything.
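
As a rough sketch of the merging idea proposed above (again a toy model with hypothetical names, not a patch against gcse.c): when a second mem is folded into an existing record, any annotation that differs is replaced by its most conservative value. The sketch assumes a null expr means "object unknown" and follows the GCC convention that alias set 0 may alias anything.

```cpp
#include <cassert>

// Toy stand-in for a mem rtx (hypothetical, not GCC's representation).
struct Mem {
    long address;      // frame offset
    const void* expr;  // stands in for MEM_EXPR; nullptr = unknown object
    int alias_set;     // stands in for the alias set; 0 = may alias anything
};

// Fold `incoming` into the stored `pattern`, keeping only the aliasing
// facts that hold for both mems.
inline void merge_alias_info(Mem& pattern, const Mem& incoming) {
    assert(pattern.address == incoming.address);
    if (pattern.expr != incoming.expr)
        pattern.expr = nullptr;  // objects differ: drop expr-based disambiguation
    if (pattern.alias_set != incoming.alias_set)
        pattern.alias_set = 0;   // sets differ: fall back to "aliases anything"
}

// Much-simplified dependence check that honours the conservative values.
inline bool may_conflict(const Mem& a, const Mem& b) {
    if (a.address != b.address) return false;
    if (a.expr && b.expr && a.expr != b.expr) return false;
    if (a.alias_set != 0 && b.alias_set != 0 && a.alias_set != b.alias_set)
        return false;
    return true;
}
```

After merging, a read of either object at that address is reported as a possible conflict with the record's pattern, which is the conservative answer a store-motion pass would need here. The cost is some lost precision for mems that genuinely cannot alias.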

There is a comment that indicates that we are deliberately ignoring the alias sets when computing the hash codes, as this caused problems for profile feedback directed optimization.  I haven't looked at the details here.

The testcase doesn't fail with gcc-4.0 and up, because after tree-ssa opts there isn't anything left for the RTL gcse pass to do.  However, I believe the bug is still there in the code, it is just very much harder to reproduce now.
Comment 5 Andrew Pinski 2005-11-29 04:04:59 UTC
Subject: Re:  [3.4 Regression] Produces wrong code

> ------- Comment #4 from wilson at gcc dot gnu dot org  2005-11-29 03:39 -------
> The testcase doesn't fail with gcc-4.0 and up, because after tree-ssa opts
> there isn't anything left for the RTL gcse pass to do.  However, I believe the
> bug is still there in the code, it is just very much harder to reproduce now.

This might also be related to PR 25130 (it might even be exposing the bug in 4.1).

-- Pinski
Comment 6 Jim Wilson 2005-11-29 05:57:53 UTC
PR 25130 is a gcse problem, and there are some curious similarities.  We have two objects on the stack with the same address, and gcse is emitting new RTL referring to the "wrong" one, which means we have mems with bad MEM_EXPR fields after gcse is finished.  However, the underlying failure is different here.  It seems to be a problem with the load motion logic.  I will put some details into that PR.
Comment 7 Volker Reichelt 2005-11-29 11:26:09 UTC
The command line flags "-O -fgcse" are sufficient to reproduce the bug.
The constructor of DummyType can be omitted.
Comment 8 Gabriel Dos Reis 2005-12-01 08:33:53 UTC
Moved to 3.4.6
Comment 9 Andrew Pinski 2006-03-08 23:40:23 UTC
Fixed in 4.0.0.  3.4.6 has been tagged already and has been released (no announcement has been made but it is up on the ftp server already).