System. Fedora 11 - Linux 2.6.30.8-64.fc11.x86_64 #1 SMP
gcc-4.4.1, Release: 2.fc11 (Fedora's package)

Problem in short.

definitions:

    typedef uint64_t obj[1];
    obj x0, x1, X[2];

then the following code doesn't work:

    X[0][0] = x0[0];
    X[1][0] = x1[0];

while this works:

    *X[0] = *x0;
    *X[1] = *x1;

(As far as I know these are equivalent.)

The problem occurs only with -O3 and 64-bit code. It works perfectly at least with gcc34, -O2 and/or 32-bit code.

Detailed information.

The program gcc-bug.c, compiled as

    gcc -Wall -O3 -o gcc-bug gcc-bug.c

produces the following (wrong) output:

    (1) x0 = 12345
    (1) x1 = 67890
    (2) x0 = 12345
    (2) x1 = 4195296

instead of the correct one:

    (1) x0 = 12345
    (1) x1 = 67890
    (2) x0 = 12345
    (2) x1 = 67890

Attached are gcc-bug.c and gcc-bug.i, generated with -v -save-temps.
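For reference, the equivalence claimed in the report follows from the C definition of subscripting: E1[E2] is defined as (*((E1)+(E2))), so x0[0] is *x0 and X[0][0] is *X[0]. The following is a minimal sketch of that equivalence only. It is not the attached gcc-bug.c, and the function names copy_subscript, copy_deref and forms_agree are hypothetical.

```c
#include <stdint.h>

typedef uint64_t obj[1];

/* Store through explicit subscripts (the form reported as failing). */
static void copy_subscript(obj X[], obj x0, obj x1)
{
    X[0][0] = x0[0];
    X[1][0] = x1[0];
}

/* Store through dereferences (the form reported as working). */
static void copy_deref(obj X[], obj x0, obj x1)
{
    *X[0] = *x0;
    *X[1] = *x1;
}

/* Returns 1 if both forms stored the expected values, 0 otherwise. */
int forms_agree(void)
{
    obj x0 = { 12345 }, x1 = { 67890 };
    obj A[2], B[2];

    copy_subscript(A, x0, x1);
    copy_deref(B, x0, x1);

    return A[0][0] == 12345 && A[1][0] == 67890
        && B[0][0] == 12345 && B[1][0] == 67890;
}
```

Every access here stays within the bounds of its own one-element array, so a conforming compiler should make forms_agree() return 1 at any optimization level.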
Created attachment 18750: source
Created attachment 18751: preprocessed file
Whatever it is, doesn't happen in mainline.
Simplified testcase, fails at -O1. Likely an aliasing issue, but I didn't yet fully investigate (nor ruled out a non-conforming testcase - though TBAA is out of the question here):

    typedef unsigned long obj[1];
    extern void abort (void);

    static void test_level2(obj X[])
    {
      if (*X[0] != 12345 || *X[1] != 67890)
        abort ();
    }

    static void test_level1(obj x0, obj x1)
    {
      obj X[2];
      X[0][0] = x0[0];
      X[1][0] = x1[0];
      if (*x0 != 12345 || *x1 != 67890)
        abort ();
      test_level2 (X);
    }

    int main()
    {
      obj X[2];
      *X[0] = 12345;
      *X[1] = 67890;
      test_level1(X[0], X[1]);
      return 0;
    }
What we can see after inlining is

    <bb 2>:
      X[0][0] ={v} 12345;
      D.1614_1 = (long unsigned int *) &X[1];
      *D.1614_1 ={v} 67890;
      D.1614_2 = (long unsigned int *) &X[1];
      X.0_3 = (long unsigned int *) &X;
      D.1623_5 = *X.0_3;
      X[0][0] ={v} D.1623_5;
      D.1622_6 = *D.1614_2;
      X[1][0] ={v} D.1622_6;
      D.1623_7 = *X.0_3;
      if (D.1623_7 != 12345)
        goto <bb 4>;
    ...
    <bb 6>:
      D.1625_10 = X[0][1];
      if (D.1625_10 != 67890)
        goto <bb 7>;
      else
        goto <bb 8>;

so the final check is reading from X[0][1] but we only ever store to X[1][0]. So the testcase can be simplified to

    typedef unsigned long obj[1];
    extern void abort (void);

    static void test_level2(obj X[])
    {
      if (*X[1] != 67890)
        abort ();
    }

    int main()
    {
      obj X[2];
      X[1][0] = 67890;
      test_level2(X);
      return 0;
    }

or even to

    typedef unsigned long obj[1];
    extern void abort (void);

    int main()
    {
      obj X[2];
      X[1][0] = 67890;
      if (X[0][1] != 67890)
        abort ();
      return 0;
    }

which will also fail with 4.2.4 (but still not 4.5.0). But that also raises the question of the validity again.
With 4.3 and 4.4 it is SRA that generates the wrong code; with 4.5 SRA optimizes the code correctly and recognizes that both forms access the same memory (and thus we optimize the program to return 0).

Workaround: -fno-tree-sra
Joseph, is this a valid testcase?

    typedef unsigned long obj[1];
    extern void abort (void);

    int main()
    {
      obj X[2];
      X[1][0] = 67890;
      if (X[0][1] != 67890)
        abort ();
      return 0;
    }
Subject: Re: [4.3/4.4 Regression] Optimization error on vectors of uint64_t

On Mon, 15 Mar 2010, rguenth at gcc dot gnu dot org wrote:

> Joseph, is this a valid testcase?
>
> typedef unsigned long obj[1];
> extern void abort (void);
> int main()
> {
>   obj X[2];
>   X[1][0] = 67890;
>   if (X[0][1] != 67890)

This access to X[0][1] looks like an out-of-bounds access that is undefined behavior like the example in Annex J: "An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).". (This originates in C90 DR#017; the example was added in C90 TC1.)
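To make the distinction concrete, here is a minimal sketch of two ways to read the stored value without subscripting past the end of X[0]. It assumes the goal is simply to read back the second element; the function names read_in_bounds and read_via_bytes are hypothetical and do not come from the testcase.

```c
#include <string.h>

typedef unsigned long obj[1];

/* Read the second element through its own in-range subscripts. */
unsigned long read_in_bounds(void)
{
    obj X[2];
    X[1][0] = 67890;
    return X[1][0];            /* not X[0][1] */
}

/* Read the same storage as bytes of the whole object X.
   Assumption: copying the object representation through a character
   pointer to the complete array is acceptable for this purpose. */
unsigned long read_via_bytes(void)
{
    obj X[2];
    unsigned long v;

    X[0][0] = 0;
    X[1][0] = 67890;
    /* Skip the first element (sizeof(obj) bytes) and copy the second. */
    memcpy(&v, (unsigned char *)&X + sizeof(obj), sizeof v);
    return v;
}
```

Both functions return 67890. The first is the plain fix for the testcase; the second only illustrates that the bytes are where one expects, even though X[0][1] is not a permitted way to name them.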
Invalid then.