The patch that fixed alias analysis [1] triggered following testsuite failure [2] on alphaev68-linux-gnu: FAIL: gfortran.fortran-torture/execute/save_1.f90 execution, -O2 FAIL: gfortran.fortran-torture/execute/save_1.f90 execution, -O2 -fomit-frame-pointer -finline-functions FAIL: gfortran.fortran-torture/execute/save_1.f90 execution, -O2 -fomit-frame-pointer -finline-functions -funroll-loops FAIL: gfortran.fortran-torture/execute/save_1.f90 execution, -O2 -fbounds-check The testcase fails in the line where i is checked for 26. The value is 0: subroutine foo (b) logical b integer i, j character*24 s save if (b) then i = 26 j = 131 s = 'This is a test string' else if (i .ne. 26 .or. j .ne. 131) call abort if (s .ne. 'This is a test string') call abort end if end subroutine foo This is a runtime failure, but the problem can be seen with a crosscompiler, configured with --target=alpha-linux-gnu. The testcase should be compiled with "-O2 -mcpu=ev67 -mexplicit-relocs". Before _208r.sched1, we have: 10: r77:DI=high(`i.843') 11: r78:SI=0x1a 12: [r77:DI+low(`i.843')]=r78:SI REG_DEADr78:SI REG_DEADr77:DI ... 42: r102:DI=r81:DI+low(`s.845') REG_EQUAL`s.845' 43: r106:DI=[r102:DI+0xf&0xfffffffffffffff8] 44: r105:DI=[r81:DI+low(`s.845')&0xfffffffffffffff8] REG_EQUAL[`s.845'&0xfffffffffffffff8] ... 57: [r81:DI+low(`s.845')&0xfffffffffffffff8]=r105:DI REG_DEADr105:DI REG_DEADr81:DI (The last sequence is part of alpha_expand_block_move, specifically an unaligned access to the head of string "s.845"). In the above sequence, load and store to i.843 and s.845 are lined one after another. The problematic sequence is shown in function "foo" in _.208r.sched1 RTL dump, after scheduler reordered instructions: 17: r82:DI=high(`*$LC0') 16: r81:DI=high(`s.845') 70: r115:QI=0x20 10: r77:DI=high(`i.843') 18: r86:DI=r82:DI+low(`*$LC0') REG_EQUAL`*$LC0' 42: r102:DI=r81:DI+low(`s.845') REG_EQUAL`s.845' 19: r83:DI=[r82:DI+low(`*$LC0')&0xfffffffffffffff8] REG_DEADr82:DI REG_EQUAL[`*$LC0'&0xfffffffffffffff8] 44: r105:DI=[r81:DI+low(`s.845')&0xfffffffffffffff8] REG_EQUAL[`s.845'&0xfffffffffffffff8] 21: r89:DI=[r86:DI+0xf&0xfffffffffffffff8] 20: r84:DI=[r86:DI+0x8&0xfffffffffffffff8] 23: r91:DI=r86:DI&0x7 11: r78:SI=0x1a 43: r106:DI=[r102:DI+0xf&0xfffffffffffffff8] ... 12: [r77:DI+low(`i.843')]=r78:SI REG_DEADr78:SI REG_DEADr77:DI ... 57: [r81:DI+low(`s.845')&0xfffffffffffffff8]=r105:DI REG_DEADr105:DI REG_DEADr81:DI As it happens, i lives inside "aligned" s address range: (gdb) p &i $3 = (PTR TO -> ( integer(kind=4) )) 0x120012090 <i.853> (gdb) p &s $4 = (PTR TO -> ( character*24 )) 0x120012094 <s.855> Due to the way unaligned stores are implemented on alpha, where an aligned address is read first, the value is inserted and the combination is stored back to aligned address, we rewrite location of "i" with previous value, 0 in this case. To illustrate this problem, let's put breakpoints at the store of i (b1) and store of s (b2): 0x0000000120000a8c <+204>: cmoveq t10,0,t9 0x0000000120000a90 <+208>: ldah t10,0(gp) b1 0x0000000120000a94 <+212>: stl a3,-32640(t10) 0x0000000120000a98 <+216>: lda a3,131 0x0000000120000a9c <+220>: or t2,t8,t2 0x0000000120000aa0 <+224>: or t3,t9,t3 0x0000000120000aa4 <+228>: ldah t10,0(gp) 0x0000000120000aa8 <+232>: stb t6,23(t0) 0x0000000120000aac <+236>: insqh t2,t0,t8 0x0000000120000ab0 <+240>: mskqh t5,t0,t5 0x0000000120000ab4 <+244>: stl a3,-32644(t10) 0x0000000120000ab8 <+248>: insqh t3,t0,t1 0x0000000120000abc <+252>: insql t2,t0,t2 0x0000000120000ac0 <+256>: or t5,t8,t5 0x0000000120000ac4 <+260>: stq_u t5,15(t0) 0x0000000120000ac8 <+264>: insql t3,t0,t3 0x0000000120000acc <+268>: or t1,t2,t1 0x0000000120000ad0 <+272>: stq_u t1,8(t0) 0x0000000120000ad4 <+276>: stb a4,16(t0) 0x0000000120000ad8 <+280>: or t4,t3,t4 b2 0x0000000120000adc <+284>: stq_u t4,-32636(t7) 0x0000000120000ae0 <+288>: stb a5,17(t0) Breakpoint 1, 0x0000000120000a94 in foo (b=.TRUE.) at save_1.f90:7 7 i = 26 (gdb) p i $5 = 0 (gdb) s 8 j = 131 (gdb) p i $6 = 26 (gdb) c Continuing. Breakpoint 2, 0x0000000120000adc in foo (b=.TRUE.) at save_1.f90:9 9 s = 'This is a test string' (gdb) p i $7 = 26 (gdb) s 14 end subroutine foo (gdb) p i $8 = 0 (gdb) When compiled with additional -fno-schedule-insns, the test runs OK. [1] http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02225.html [2] http://gcc.gnu.org/ml/gcc-testresults/2012-11/msg02492.html
Reverting r188868 as suggested by Steven [1] solves the problem, too. CC author of the patch. [1] http://gcc.gnu.org/ml/gcc-patches/2012-11/msg02549.html
Confirmed as 4.8 regression.
Mine
I don't understand how this sort of unaligned access that modifies unrelated objects can fit in with any reasonable threaded memory model, but I guess that's beyond the scope of this bug report ;-) Here's a patch that restores negative sizes after alignment adjustments, so that the tests for negative sizes elsewhere kick in. We might want to use absolute values elsewhere to get stricter results, but this should be at least conservatively better. Uros, I'm regstrapping this on x86_64; would you please give it a spin on alpha? TIA, --- a/gcc/alias.c +++ b/gcc/alias.c @@ -2100,14 +2100,20 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) /* Deal with alignment ANDs by adjusting offset and size so as to cover the maximum range, without taking any previously known - alignment into account. */ + alignment into account. Make a size negative after such an + adjustments, so that, if we end up with e.g. two SYMBOL_REFs, we + assume a potential overlap, because they may end up in contiguous + memory locations and the stricter-alignment access may span over + part of both. */ if (GET_CODE (x) == AND && CONST_INT_P (XEXP (x, 1))) { HOST_WIDE_INT sc = INTVAL (XEXP (x, 1)); unsigned HOST_WIDE_INT uc = sc; - if (xsize > 0 && sc < 0 && -uc == (uc & -uc)) + if (sc < 0 && -uc == (uc & -uc)) { - xsize -= sc + 1; + if (xsize > 0) + xsize = -xsize; + xsize += sc + 1; c -= sc + 1; return memrefs_conflict_p (xsize, canon_rtx (XEXP (x, 0)), ysize, y, c); @@ -2117,9 +2123,11 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, HOST_WIDE_INT c) { HOST_WIDE_INT sc = INTVAL (XEXP (y, 1)); unsigned HOST_WIDE_INT uc = sc; - if (ysize > 0 && sc < 0 && -uc == (uc & -uc)) + if (sc < 0 && -uc == (uc & -uc)) { - ysize -= sc + 1; + if (ysize > 0) + ysize = -ysize; + ysize += sc + 1; c += sc + 1; return memrefs_conflict_p (xsize, x, ysize, canon_rtx (XEXP (y, 0)), c);
(In reply to comment #4) > Uros, I'm regstrapping this on x86_64; would you please give it a spin on > alpha? TIA, Thanks, I have started the bootstrap, please expect results in ~10h.
(In reply to comment #4) > Uros, I'm regstrapping this on x86_64; would you please give it a spin on > alpha? TIA, The patch bootstraps and regtests OK on alpha [1]. The save_1.f90 failures are all gone now. [1] http://gcc.gnu.org/ml/gcc-testresults/2012-12/msg00803.html
Alexandre, do you plan to proceed with the patch in Comment #4?
The patch in #c4 is ok.
Author: aoliva Date: Wed Jan 16 04:31:30 2013 New Revision: 195227 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195227 Log: PR rtl-optimization/55547 PR rtl-optimization/53827 PR debug/53671 PR debug/49888 * alias.c (memrefs_conflict_p): Set sizes to negative after AND adjustments. Modified: trunk/gcc/ChangeLog trunk/gcc/alias.c
Fixed, although there's a second patch (now pending review) that makes the handling of negative sizes for aligned addresses stricter and more correct in http://gcc.gnu.org/ml/gcc-patches/2013-01/msg00822.html
Unfortunately the fix caused some guality regressions, on x86_64 -m32 +FAIL: gcc.dg/guality/drap.c -O1 line 21 a == 5 +FAIL: gcc.dg/guality/drap.c -O1 line 22 b == 6 up to -Os, and +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 16 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-1.c -O1 line 18 arg7 == 30 (up to -Os), +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 16 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-2.c -O1 line 18 arg7 == 30 (up to -Os), +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 14 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-3.c -O1 line 16 arg7 == 30 (up to -Os) and finally +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 14 arg7 == 30 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg1 == 1 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg2 == 2 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg3 == 3 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg4 == 4 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg5 == 5 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg6 == 6 +FAIL: gcc.dg/guality/pr36728-4.c -O1 line 16 arg7 == 30 (up to -Os), and similarly for x86_64 -m64 (fewer 36728-*.c regressions, but still some and all drap.c regressions).
Ah, hjl opened PR56006 to track #c11.
Author: aoliva Date: Fri Jan 18 10:57:36 2013 New Revision: 195289 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195289 Log: PR rtl-optimization/55547 PR rtl-optimization/53827 PR debug/53671 PR debug/49888 * alias.c (offset_overlap_p): New, factored out of... (memrefs_conflict_p): ... this. Use absolute sizes. Retain the conservative special case for symbolic constants. Don't adjust zero sizes on alignment. Modified: trunk/gcc/ChangeLog trunk/gcc/alias.c
Author: aoliva Date: Fri Jan 18 11:05:04 2013 New Revision: 195292 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195292 Log: Mention PR debug/56006 in earlier commit for PR rtl-optimization/55547. Modified: trunk/gcc/ChangeLog