Bug 41630 - [4.3/4.4 Regression] Optimization error on vectors of uint64_t
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 4.4.1
Importance: P3 normal
Target Milestone: 4.3.5
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2009-10-08 13:49 UTC by Emanuele Cesena
Modified: 2010-03-15 15:23 UTC
CC List: 3 users

See Also:
Host: x86_64-gnu-linux
Target:
Build:
Known to work: 4.2.4 4.5.0
Known to fail: 4.3.4 4.4.1
Last reconfirmed: 2009-10-08 14:52:06


Attachments
source (296 bytes, text/plain), 2009-10-08 13:50 UTC, Emanuele Cesena
preprocessed file (3.54 KB, application/octet-stream), 2009-10-08 13:51 UTC, Emanuele Cesena

Description Emanuele Cesena 2009-10-08 13:49:29 UTC
System.
Fedora 11 - Linux 2.6.30.8-64.fc11.x86_64 #1 SMP
gcc-4.4.1, Release: 2.fc11 (Fedora's package)


Problem in short.
Given the definitions:
  typedef uint64_t obj[1];
  obj x0, x1, X[2];
the following code doesn't work:
  X[0][0] = x0[0];
  X[1][0] = x1[0];
while this works:
  *X[0] = *x0;
  *X[1] = *x1;
(As far as I know, these are equivalent.)
The problem occurs only with -O3 and 64-bit code.
It works perfectly at least with gcc34, with -O2, and/or with 32-bit code.
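
The equivalence claimed above follows from the C definition of subscripting: E1[E2] is defined as *((E1)+(E2)), so X[0][0] and *X[0] name the same lvalue. A minimal sketch of the two forms side by side is below; the function names copy_subscript and copy_deref are hypothetical and do not come from the attached gcc-bug.c.

```c
#include <stdint.h>

typedef uint64_t obj[1];

/* Hypothetical helper: copy using subscript syntax (the form reported as failing). */
void copy_subscript(obj X[2], obj x0, obj x1)
{
    X[0][0] = x0[0];
    X[1][0] = x1[0];
}

/* Hypothetical helper: copy using dereference syntax (the form reported as working). */
void copy_deref(obj X[2], obj x0, obj x1)
{
    *X[0] = *x0;
    *X[1] = *x1;
}
```

Both helpers only use in-range subscripts, so a conforming compiler should produce the same stored values for either one.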


Detailed information.

The program gcc-bug.c, compiled as
  gcc -Wall -O3 -o gcc-bug gcc-bug.c
produces the following (wrong) output:
(1) x0 = 12345
(1) x1 = 67890
(2) x0 = 12345
(2) x1 = 4195296
instead of the correct one:
(1) x0 = 12345
(1) x1 = 67890
(2) x0 = 12345
(2) x1 = 67890

Attached are gcc-bug.c and gcc-bug.i, generated with -v -save-temps.
Comment 1 Emanuele Cesena 2009-10-08 13:50:39 UTC
Created attachment 18750
source
Comment 2 Emanuele Cesena 2009-10-08 13:51:51 UTC
Created attachment 18751
preprocessed file
Comment 3 Paolo Carlini 2009-10-08 14:16:53 UTC
Whatever it is, it doesn't happen in mainline.
Comment 4 Richard Biener 2009-10-08 14:52:06 UTC
Simplified testcase, fails at -O1.  Likely an aliasing issue, but I haven't
yet fully investigated (nor ruled out a non-conforming testcase, though
TBAA is out of the question here):

typedef unsigned long obj[1];
extern void abort (void);
static void test_level2(obj X[])
{
    if (*X[0] != 12345
        || *X[1] != 67890)
      abort ();
}
static void test_level1(obj x0, obj x1)
{
    obj X[2];

    X[0][0] = x0[0];
    X[1][0] = x1[0];

    if (*x0 != 12345
        || *x1 != 67890)
      abort ();
    test_level2 (X);
}
int main()
{
    obj X[2];
    *X[0] = 12345;
    *X[1] = 67890;
    test_level1(X[0], X[1]);
    return 0;
}
Comment 5 Richard Biener 2009-10-08 15:03:30 UTC
What we can see after inlining is

<bb 2>:
  X[0][0] ={v} 12345;
  D.1614_1 = (long unsigned int *) &X[1];
  *D.1614_1 ={v} 67890;
  D.1614_2 = (long unsigned int *) &X[1];
  X.0_3 = (long unsigned int *) &X;
  D.1623_5 = *X.0_3;
  X[0][0] ={v} D.1623_5;
  D.1622_6 = *D.1614_2;
  X[1][0] ={v} D.1622_6;
  D.1623_7 = *X.0_3;
  if (D.1623_7 != 12345)
    goto <bb 4>;
...
<bb 6>:
  D.1625_10 = X[0][1];
  if (D.1625_10 != 67890)
    goto <bb 7>;
  else
    goto <bb 8>;

so the final check is reading from X[0][1] but we only ever store to X[1][0].

So the testcase can be simplified to

typedef unsigned long obj[1];
extern void abort (void);
static void test_level2(obj X[])
{
    if (*X[1] != 67890)
      abort ();
}
int main()
{
    obj X[2];
    X[1][0] = 67890;
    test_level2(X);
    return 0;
}

or even to

typedef unsigned long obj[1];
extern void abort (void);
int main()
{
    obj X[2];
    X[1][0] = 67890;
    if (X[0][1] != 67890)
      abort ();
    return 0;
}

which will also fail with 4.2.4 (but still not 4.5.0).  But that also
raises the question of the validity again.
Comment 6 Richard Biener 2009-10-08 15:07:54 UTC
With 4.3 and 4.4 it is SRA that does not avoid generating wrong code; with
4.5 SRA optimizes the code correctly and recognizes that both forms access
the same memory (and thus we optimize the program to return 0).

Workaround: -fno-tree-sra
Comment 7 Richard Biener 2010-03-15 14:56:53 UTC
Joseph, is this a valid testcase?

typedef unsigned long obj[1];
extern void abort (void);
int main()
{
    obj X[2];
    X[1][0] = 67890;
    if (X[0][1] != 67890)
      abort ();
    return 0;
}
Comment 8 jsm-csl@polyomino.org.uk 2010-03-15 15:17:32 UTC
Subject: Re: [4.3/4.4 Regression] Optimization error on vectors of uint64_t

On Mon, 15 Mar 2010, rguenth at gcc dot gnu dot org wrote:

> Joseph, is this a valid testcase?
> 
> typedef unsigned long obj[1];
> extern void abort (void);
> int main()
> {
>     obj X[2];
>     X[1][0] = 67890;
>     if (X[0][1] != 67890)

This access to X[0][1] looks like an out-of-bounds access that is 
undefined behavior like the example in Annex J: "An array subscript is out 
of range, even if an object is apparently accessible with the given 
subscript (as in the lvalue expression a[1][7] given the declaration int 
a[4][5]) (6.5.6).".  (This originates in C90 DR#017; the example was added 
in C90 TC1.)
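
To illustrate the point above: with obj X[2] and obj being a one-element array, the first subscript may be 0 or 1 and the second may only be 0. A minimal sketch of conforming accesses follows; the helper names store_load_nested and store_load_flat are hypothetical, and the flat-array variant is just one possible alternative for code that wants linear indexing, not something proposed in this report.

```c
typedef unsigned long obj[1];

/* Hypothetical helper: store into the second sub-array and read it back
   with the same in-range subscripts (first index < 2, second index < 1). */
unsigned long store_load_nested(unsigned long v)
{
    obj X[2];
    X[1][0] = v;
    return X[1][0];   /* not X[0][1]: index 1 is out of range for obj */
}

/* Hypothetical alternative: when linear indexing is what the code wants,
   a flat array makes index 1 a valid subscript. */
unsigned long store_load_flat(unsigned long v)
{
    unsigned long flat[2];
    flat[1] = v;
    return flat[1];
}
```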

Comment 9 Richard Biener 2010-03-15 15:23:54 UTC
Invalid then.