The following program should print 12345678. When it is built with "-O2 -m64" or "-O3 -fno-inline -m64" on sparc, it instead prints 0.

I ran this test with the command
"/opt/cfarm/release/4.4.0/bin/gcc -fno-inline -O3 -m64 -g combined.c"
on gcc62 in the GCC compile farm.

----------------------------------------------------------------------
"/opt/cfarm/release/4.4.0/bin/gcc -v" prints:

Using built-in specs.
Target: sparc64-unknown-linux-gnu
Configured with: ../gcc-4.4.0/configure --enable-languages=c,c++,fortran,ada --prefix=/opt/cfarm/release/4.4.0 --enable-__cxa_atexit --enable-threads=posix --disable-nls --with-mpfr=/opt/cfarm/mpfr-2.4.1 --with-gmp=/opt/cfarm/gmp-4.2.4 --with-cpu=v8
Thread model: posix
gcc version 4.4.0 (GCC)
----------------------------------------------------------------------

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stores 32-bit unsigned integer X at P, which need not be
   aligned. */
static void
put_uint32 (uint32_t x, void *p)
{
  memcpy (p, &x, sizeof x);
}

void
store_12345678 (int type, void *number)
{
  switch (type)
    {
    case 1:
      printf ("got here\n");
      put_uint32 (0x12345678, number);
      break;

    case 7:
      put_uint32 (0, number);
      break;

    case 8:
      put_uint32 (0, number);
      break;

    case 9:
      put_uint32 (0, number);
      break;
    }
}

int
main (void)
{
  uint32_t x;

  store_12345678 (1, &x);
  printf ("%x\n", (unsigned int) x);

  return 0;
}
Created attachment 18147
preprocessed test input
Created attachment 18148
test program (before preprocessing)
Confirmed, with gcc-4.3-20090705 it works, with gcc-4.4-20090630 it fails.

Compiling with -S and comparing the .s files it looks like 4.4 completely mis-schedules the code for put_uint32:

put_uint32:
        .register       %g2, #scratch
        .register       %g3, #scratch
        ldub    [%sp+2175], %g1
        ldub    [%sp+2176], %g3
        ldub    [%sp+2177], %g2
        ldub    [%sp+2178], %g4
        st      %o0, [%sp+2175]
        stb     %g4, [%o1+3]
        stb     %g1, [%o1]
        stb     %g3, [%o1+1]
        jmp     %o7+8
         stb    %g2, [%o1+2]

Notice how the store of %o0 to the four bytes at %sp+2175 comes after the corresponding byte loads, so %g1 to %g4 are loaded with garbage, likely zeroes.

In contrast, gcc-4.3 generates the store before the loads:

put_uint32:
        .register       %g2, #scratch
        .register       %g3, #scratch
        st      %o0, [%sp+2175]
        ldub    [%sp+2176], %g3
        ldub    [%sp+2177], %g4
        ldub    [%sp+2178], %g2
        ldub    [%sp+2175], %g1
        stb     %g2, [%o1+3]
        stb     %g1, [%o1]
        stb     %g3, [%o1+1]
        jmp     %o7+8
         stb    %g4, [%o1+2]
A reghunt identified Jakub's (added to cc: list) r142481 (PR38367 fix) as the source of this regression.
Created attachment 18151
gcc44-pr40668.patch

Untested patch that fixes this testcase. I believe my commit was correct, but apparently it can be modified later on without adjusting MEM_OFFSET. I don't have a working SPARC box around ATM, so I can't bootstrap/regtest it there.
(In reply to comment #5)
> Created an attachment (id=18151)
> gcc44-pr40668.patch
>
> Untested patch that fixes this testcase.

Thanks. This fixes the issue in a cross-compiler to sparc64-linux. I'm currently bootstrapping 4.4-20090630 plus this patch on an Ultra5. I'll follow up once that's complete (it will take quite a while).
4.4-20090630 plus this fix bootstrapped fine, fixed the test case, built a working 2.6.31-rc2 Linux kernel, and built a working Erlang VM.
Wow, that's amazingly fast turnaround. Thanks so much guys!
Subject: Bug 40668

Author: jakub
Date: Sat Jul 11 09:23:32 2009
New Revision: 149511

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149511
Log:
        PR target/40668
        * function.c (assign_parm_setup_stack): Adjust
        MEM_OFFSET (data->stack_parm) if promoted_mode is different
        from nominal_mode on big endian.

        * gcc.c-torture/execute/pr40668.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr40668.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/function.c
    trunk/gcc/testsuite/ChangeLog
Subject: Bug 40668

Author: jakub
Date: Sat Jul 11 09:26:23 2009
New Revision: 149512

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=149512
Log:
        PR target/40668
        * function.c (assign_parm_setup_stack): Adjust
        MEM_OFFSET (data->stack_parm) if promoted_mode is different
        from nominal_mode on big endian.

        * gcc.c-torture/execute/pr40668.c: New test.

Added:
    branches/gcc-4_4-branch/gcc/testsuite/gcc.c-torture/execute/pr40668.c
Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/function.c
    branches/gcc-4_4-branch/gcc/testsuite/ChangeLog
By Jakub.