41630 – [4.3/4.4 Regression] Optimization error on vectors of uint64_t

Summary: [4.3/4.4 Regression] Optimization error on vectors of uint64_t

Status:	RESOLVED INVALID

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	c
Version:	4.4.1

Importance:	P3 normal
Target Milestone:	4.3.5
Assignee:	Not yet assigned to anyone

URL:
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2009-10-08 13:49 UTC by Emanuele Cesena
Modified:	2010-03-15 15:23 UTC
CC List:	3 users

See Also:
Host:	x86_64-gnu-linux
Target:
Build:
Known to work:	4.2.4 4.5.0
Known to fail:	4.3.4 4.4.1
Last reconfirmed:	2009-10-08 14:52:06

Attachments
source (296 bytes, text/plain) 2009-10-08 13:50 UTC, Emanuele Cesena
preprocessed file (3.54 KB, application/octet-stream) 2009-10-08 13:51 UTC, Emanuele Cesena

Description Emanuele Cesena 2009-10-08 13:49:29 UTC

System.
Fedora 11 - Linux 2.6.30.8-64.fc11.x86_64 #1 SMP
gcc-4.4.1, Release: 2.fc11 (Fedora's package)


Problem in short.
Given the definitions:
  typedef uint64_t obj[1];
  obj x0, x1, X[2];
the following code produces a wrong result:
  X[0][0] = x0[0];
  X[1][0] = x1[0];
while this works:
  *X[0] = *x0;
  *X[1] = *x1;
(As far as I know, these are equivalent.)
The problem occurs only with -O3 and 64-bit code.
It works perfectly at least with gcc34, with -O2, and/or with 32-bit code.


Detailed information.

The program gcc-bug.c compiled as
  gcc -Wall -O3 -o gcc-bug gcc-bug.c
produces the following (wrong) output:
(1) x0 = 12345
(1) x1 = 67890
(2) x0 = 12345
(2) x1 = 4195296
instead of the correct one:
(1) x0 = 12345
(1) x1 = 67890
(2) x0 = 12345
(2) x1 = 67890

Attached: gcc-bug.c and gcc-bug.i (generated with -v -save-temps).

Comment 1 Emanuele Cesena 2009-10-08 13:50:39 UTC

Created attachment 18750
source

Comment 2 Emanuele Cesena 2009-10-08 13:51:51 UTC

Created attachment 18751
preprocessed file

Comment 3 Paolo Carlini 2009-10-08 14:16:53 UTC

Whatever it is, it doesn't happen in mainline.

Comment 4 Richard Biener 2009-10-08 14:52:06 UTC

Simplified testcase, fails at -O1.  Likely an aliasing issue, but I haven't
fully investigated yet (nor ruled out a non-conforming testcase, though
TBAA is out of the question here):

typedef unsigned long obj[1];
extern void abort (void);
static void test_level2(obj X[])
{
    if (*X[0] != 12345
        || *X[1] != 67890)
      abort ();
}
static void test_level1(obj x0, obj x1)
{
    obj X[2];

    X[0][0] = x0[0];
    X[1][0] = x1[0];

    if (*x0 != 12345
        || *x1 != 67890)
      abort ();
    test_level2 (X);
}
int main()
{
    obj X[2];
    *X[0] = 12345;
    *X[1] = 67890;
    test_level1(X[0], X[1]);
    return 0;
}

Comment 5 Richard Biener 2009-10-08 15:03:30 UTC

What we can see after inlining is

<bb 2>:
  X[0][0] ={v} 12345;
  D.1614_1 = (long unsigned int *) &X[1];
  *D.1614_1 ={v} 67890;
  D.1614_2 = (long unsigned int *) &X[1];
  X.0_3 = (long unsigned int *) &X;
  D.1623_5 = *X.0_3;
  X[0][0] ={v} D.1623_5;
  D.1622_6 = *D.1614_2;
  X[1][0] ={v} D.1622_6;
  D.1623_7 = *X.0_3;
  if (D.1623_7 != 12345)
    goto <bb 4>;
...
<bb 6>:
  D.1625_10 = X[0][1];
  if (D.1625_10 != 67890)
    goto <bb 7>;
  else
    goto <bb 8>;

so the final check is reading from X[0][1] but we only ever store to X[1][0].

So the testcase can be simplified to

typedef unsigned long obj[1];
extern void abort (void);
static void test_level2(obj X[])
{
    if (*X[1] != 67890)
      abort ();
}
int main()
{
    obj X[2];
    X[1][0] = 67890;
    test_level2(X);
    return 0;
}

or even to

typedef unsigned long obj[1];
extern void abort (void);
int main()
{
    obj X[2];
    X[1][0] = 67890;
    if (X[0][1] != 67890)
      abort ();
    return 0;
}

which will also fail with 4.2.4 (but still not 4.5.0).  But that also
raises the question of the validity again.

Comment 6 Richard Biener 2009-10-08 15:07:54 UTC

With 4.3 and 4.4 it is SRA that generates the wrong code; with 4.5, SRA
optimizes the code correctly and recognizes that both forms access the
same memory (and thus we optimize the program to return 0).

Workaround: -fno-tree-sra

Comment 7 Richard Biener 2010-03-15 14:56:53 UTC

Joseph, is this a valid testcase?

typedef unsigned long obj[1];
extern void abort (void);
int main()
{
    obj X[2];
    X[1][0] = 67890;
    if (X[0][1] != 67890)
      abort ();
    return 0;
}

Comment 8 jsm-csl@polyomino.org.uk 2010-03-15 15:17:32 UTC

Subject: Re: [4.3/4.4 Regression] Optimization error on vectors of uint64_t

On Mon, 15 Mar 2010, rguenth at gcc dot gnu dot org wrote:

> Joseph, is this a valid testcase?
> 
> typedef unsigned long obj[1];
> extern void abort (void);
> int main()
> {
>     obj X[2];
>     X[1][0] = 67890;
>     if (X[0][1] != 67890)

This access to X[0][1] looks like an out-of-bounds access that is 
undefined behavior like the example in Annex J: "An array subscript is out 
of range, even if an object is apparently accessible with the given 
subscript (as in the lvalue expression a[1][7] given the declaration int 
a[4][5]) (6.5.6).".  (This originates in C90 DR#017; the example was added 
in C90 TC1.)

Comment 9 Richard Biener 2010-03-15 15:23:54 UTC

Invalid then.