When the attached source file is compiled with 'gcc -O3 -c', the code that uses it produces wrong results. The problem disappears if it is compiled with 'gcc -O3 -fno-inline -c' or if the variables inside 'generate_point_symmetry' are declared as 'static'.

This is the output of gcc-4.3 -v:

  Using built-in specs.
  Target: x86_64-unknown-linux-gnu
  Configured with: ../gcc-4.3/configure --prefix=/opt/gcc/ --program-suffix=-4.3 --enable-languages=c,fortran,c++ --with-arch=core2 --enable-libgomp
  Thread model: posix
  gcc version 4.3.1 20080419 (prerelease) (GCC)

This is the system:

  Linux corvo 2.6.24.2-corvo-001 #3 SMP PREEMPT x86_64 GNU/Linux

The distribution is Debian GNU/Linux 4.0.
Created attachment 15510 [details] the code
Please provide a self-contained testcase (see http://gcc.gnu.org/bugs.html ) that ideally abort()s on a wrong result.
The code comes from spglib, a library to calculate symmetry groups of crystals, so it is quite complex. The problem is that I didn't write it and I don't understand it well enough to produce a small self-contained test case. I will try to do it, but this may take me some time.
Created attachment 15513 [details] Test case, 1st file (no includes)
Created attachment 15514 [details] Test case 2nd file
I have managed to create a test case.

Correct case:

  xavier@corvo:~$ gcc-4.3 bravais.c mathfunc.c -O3 -fno-inline
  bravais.c: In function ‘main’:
  bravais.c:83: warning: incompatible implicit declaration of built-in function ‘printf’
  xavier@corvo:~$ ./a.out
  0

Wrong case:

  xavier@corvo:~$ gcc-4.3 bravais.c mathfunc.c -O3
  bravais.c: In function ‘main’:
  bravais.c:83: warning: incompatible implicit declaration of built-in function ‘printf’
  xavier@corvo:~$ ./a.out
  -1

Sorry for using two files, but the problem disappears if all functions are in a single file.
Created attachment 15515 [details]
simplified bravais.c

  gcc -c mathfunc.c
  gcc -o t.ok bravais.c mathfunc.o -O
  gcc -o t.fail bravais.c mathfunc.o -O -funroll-loops
  ./t.ok
  ./t.fail
  Aborted
This goes wrong somewhere during RTL optimization.
Even more simplified testcase, with just one CU. Works at -O0/-O/-O2, fails at -O{,2} -funroll-loops or -O3.

extern void abort (void);

void __attribute__ ((noinline))
bar (int m[3][3], int a[3][3], int b[3][3])
{
  int i, j;
  for (i = 0; i < 3; i++)
    for (j = 0; j < 3; j++)
      m[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j] + a[i][2] * b[2][j];
}

static inline void __attribute__ ((always_inline))
foo (int x[][3][3], int g[3][3], int y, int z)
{
  int i, j, k;
  for (i = 0; i < y; i++)
    for (j = 0; j < z - 1; j++)
      {
        k = i * (z - 1) + j + y;
        bar (x[k], g, x[k - y]);
      }
}

int g1[48][3][3] = { { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} } };
int g2[3][3] = { {-1, 0, 0}, {0, -1, 0}, {0, 0, -1} };
int g3[3][3] = { {0, 1, 0}, {1, 0, 0}, {0, 0, 1} };
int g4[3][3] = { {-1, 0, 0}, {0, 1, 0}, {0, 0, -1} };
int g5[3][3] = { {-1, 0, 0}, {0, -1, 0}, {0, 0, 1} };

int
main ()
{
  foo (g1, g2, 1, 2);
  foo (g1, g4, 2, 2);
  foo (g1, g5, 4, 2);
  foo (g1, g3, 8, 2);
  if (g1[1][1][0] != 0)
    abort ();
  return 0;
}
And one with just one inlined fn:

extern void abort (void);

void __attribute__ ((noinline))
bar (int m[3][3], int a[3][3], int b[3][3])
{
  int i, j;
  for (i = 0; i < 3; i++)
    for (j = 0; j < 3; j++)
      m[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j] + a[i][2] * b[2][j];
}

static inline void __attribute__ ((always_inline))
foo (int x[][3][3], int g[3][3], int y, int z)
{
  int i, j, k;
  for (i = 0; i < y; i++)
    for (j = 0; j < z - 1; j++)
      {
        k = i * (z - 1) + j + y;
        bar (x[k], g, x[k - y]);
      }
}

int g[48][3][3] = {
  { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} },
  { {-1, 0, 0}, {0, -1, 0}, {0, 0, -1} },
  { {-1, 0, 0}, {0, 1, 0}, {0, 0, -1} },
  { {1, 0, 0}, {0, -1, 0}, {0, 0, 1} },
  { {-1, 0, 0}, {0, -1, 0}, {0, 0, 1} },
  { {1, 0, 0}, {0, 1, 0}, {0, 0, -1} },
  { {1, 0, 0}, {0, -1, 0}, {0, 0, -1} },
  { {-1, 0, 0}, {0, 1, 0}, {0, 0, 1} }
};
int h[3][3] = { {0, 1, 0}, {1, 0, 0}, {0, 0, 1} };

int
main ()
{
  foo (g, h, 8, 2);
  if (g[1][1][0] != 0)
    abort ();
  return 0;
}
extern void abort (void);
int g[48][3][3];

void __attribute__ ((noinline))
bar (int x[3][3], int y[3][3])
{
  static int i;
  if (x != g[i + 8] || y != g[i++])
    abort ();
}

static inline void __attribute__ ((always_inline))
foo (int x[][3][3])
{
  int i;
  for (i = 0; i < 8; i++)
#ifdef GOOD
    bar (x[i + 8], x[i]);
#else
    {
      int k = i + 8;
      bar (x[k], x[k - 8]);
    }
#endif
}

int
main ()
{
  foo (g);
  return 0;
}

With -DGOOD this doesn't fail at any optimization level; without it, it fails again with -O2 -funroll-loops, -O3 etc.
This is actually a tree optimization issue. In optimized dump without -DGOOD we have:

  bar (&g[0][0] + 288, &g[0][0]);
  bar (&g[0][0] + 324, &g[1][0]);
  bar (&g[0][0] + 360, &g[2][0]);
  bar (&g[0][0] + 396, &g[3][0]);
  bar (&g[1][0], &g[4][0]);
  bar (&g[0][0] + 468, &g[5][0]);
  bar (&g[0][0] + 504, &g[6][0]);
  bar (&g[0][0] + 540, &g[7][0]);

note the bogus first argument for 5th bar call, should have been &g[0][0] + 432 aka &g[12][0].

In *.reassoc2 we have for the 4th and 5th bar calls:

  i_73 = i_56 + 1;
  k_80 = i_73 + 8;
  D.1588_81 = (long unsigned int) k_80;
  D.1589_82 = D.1588_81 * 36;
  D.1590_83 = D.1589_82 + -288;
  D.1591_84 = &g + D.1590_83;
  D.1592_85 = &(*D.1591_84)[0];
  D.1594_86 = &g[0][0] + D.1589_82;
  bar (D.1594_86, D.1592_85);
  i_90 = i_73 + 1;
  k_97 = i_90 + 8;
  D.1588_98 = (long unsigned int) k_97;
  D.1589_99 = D.1588_98 * 36;
  D.1590_100 = D.1589_99 + -288;
  D.1591_101 = &g + D.1590_100;
  D.1592_102 = &(*D.1591_101)[0];
  D.1594_103 = &g[0][0] + D.1589_99;
  bar (D.1594_103, D.1592_102);

which looks correct, but in *.vrp2:

  i_73 = 3;
  k_80 = 11;
  D.1588_81 = 11;
  D.1589_82 = 396;
  D.1590_83 = 108;
  D.1591_84 = &g[3];
  D.1592_85 = &(*D.1591_84)[0];
  D.1594_86 = &g[0][0] + 396;
  bar (D.1594_86, D.1592_85);
  i_90 = 4;
  k_97 = 12;
  D.1588_98 = 12;
  D.1589_99 = 432;
  D.1590_100 = 144;
  D.1591_101 = &g[4];
  D.1592_102 = &(*D.1591_101)[0];
  D.1594_103 = &g[1][0];
  bar (&g[1][0], D.1592_102);

which is wrong. So to me this looks like vrp bug.
Created attachment 15524 [details]
gcc43-pr36008.patch

Fix I'm bootstrapping/regtesting ATM.
Subject: Bug 36008

Author: jakub
Date: Thu Apr 24 16:08:11 2008
New Revision: 134634

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134634
Log:
	PR tree-optimization/36008
	* fold-const.c (try_move_mult_to_index): If s == NULL, divide
	the original op1, rather than delta by step.

	* gcc.c-torture/execute/20080424-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/20080424-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
Subject: Bug 36008

Author: jakub
Date: Thu Apr 24 16:19:22 2008
New Revision: 134636

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134636
Log:
	PR tree-optimization/36008
	* fold-const.c (try_move_mult_to_index): If s == NULL, divide
	the original op1, rather than delta by step.

	* gcc.c-torture/execute/20080424-1.c: New test.

Added:
    branches/gcc-4_3-branch/gcc/testsuite/gcc.c-torture/execute/20080424-1.c
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/fold-const.c
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
Fixed.