This is gcc-4.3.3 in Gentoo, compiled with CHOST=i686-pc-linux-gnu and the configure arguments: --prefix=/usr --bindir=/usr/i686-pc-linux-gnu/gcc-bin/4.3.3 --includedir=/usr/lib/gcc/i686-pc-linux-gnu/4.3.3/include --datadir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.3 --mandir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.3/man --infodir=/usr/share/gcc-data/i686-pc-linux-gnu/4.3.3/info --with-gxx-include-dir=/usr/lib/gcc/i686-pc-linux-gnu/4.3.3/include/g++-v4 --host=i686-pc-linux-gnu --build=i686-pc-linux-gnu --disable-altivec --disable-fixed-point --enable-nls --without-included-gettext --with-system-zlib --disable-checking --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-libgomp --enable-cld --enable-java-awt=gtk --with-arch=i686 --enable-languages=c,c++,java,treelang,fortran --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo 4.3.3-r2 p1.1, pie-10.1.5' Some simple code works with -O1 but not -O2. I'll attach the C file and the disassembly from objdump. I'm no good with assembly, but I find it weird that 'last' is decremented by 8 at one point.
Created attachment 17842 [details] reverse.c (The strange usage of pointers is there since I found the bug in C++ code, but noticed it's present in C as well.)
`gcc -O1 reverse.c -o reverse -Wall && objdump -d reverse` [...] 08048470 <reverse>: 8048470: 55 push %ebp 8048471: 89 e5 mov %esp,%ebp 8048473: 53 push %ebx 8048474: 8b 4d 08 mov 0x8(%ebp),%ecx 8048477: 8b 45 0c mov 0xc(%ebp),%eax 804847a: 39 c8 cmp %ecx,%eax 804847c: 75 1a jne 8048498 <reverse+0x28> 804847e: eb 36 jmp 80484b6 <reverse+0x46> 8048480: 8b 11 mov (%ecx),%edx 8048482: 8b 03 mov (%ebx),%eax 8048484: 89 01 mov %eax,(%ecx) 8048486: 89 13 mov %edx,(%ebx) 8048488: 83 c1 04 add $0x4,%ecx 804848b: 39 d9 cmp %ebx,%ecx 804848d: 74 27 je 80484b6 <reverse+0x46> 804848f: 83 eb 04 sub $0x4,%ebx 8048492: 39 d9 cmp %ebx,%ecx 8048494: 75 ea jne 8048480 <reverse+0x10> 8048496: eb 1e jmp 80484b6 <reverse+0x46> 8048498: 8d 58 fc lea -0x4(%eax),%ebx 804849b: 39 d9 cmp %ebx,%ecx 804849d: 8d 76 00 lea 0x0(%esi),%esi 80484a0: 74 14 je 80484b6 <reverse+0x46> 80484a2: 8b 11 mov (%ecx),%edx 80484a4: 8b 03 mov (%ebx),%eax 80484a6: 89 01 mov %eax,(%ecx) 80484a8: 89 13 mov %edx,(%ebx) 80484aa: 83 c1 04 add $0x4,%ecx 80484ad: 39 cb cmp %ecx,%ebx 80484af: 74 05 je 80484b6 <reverse+0x46> 80484b1: 83 eb 04 sub $0x4,%ebx 80484b4: eb dc jmp 8048492 <reverse+0x22> 80484b6: 5b pop %ebx 80484b7: 5d pop %ebp 80484b8: c3 ret [...] `gcc -O2 reverse.c -o reverse -Wall && objdump -d reverse` [...] 08048470 <reverse>: 8048470: 55 push %ebp 8048471: 89 e5 mov %esp,%ebp 8048473: 56 push %esi 8048474: 8b 4d 08 mov 0x8(%ebp),%ecx 8048477: 53 push %ebx 8048478: 8b 5d 0c mov 0xc(%ebp),%ebx 804847b: 39 cb cmp %ecx,%ebx 804847d: 74 37 je 80484b6 <reverse+0x46> 804847f: 8d 73 fc lea -0x4(%ebx),%esi 8048482: 39 f1 cmp %esi,%ecx 8048484: 74 30 je 80484b6 <reverse+0x46> 8048486: 8b 43 fc mov -0x4(%ebx),%eax 8048489: 8b 11 mov (%ecx),%edx 804848b: 89 01 mov %eax,(%ecx) 804848d: 83 c1 04 add $0x4,%ecx 8048490: 39 ce cmp %ecx,%esi 8048492: 89 53 fc mov %edx,-0x4(%ebx) 8048495: 74 1f je 80484b6 <reverse+0x46> 8048497: 83 eb 08 sub $0x8,%ebx 804849a: eb 07 jmp 80484a3 <reverse+0x33> 804849c: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi 80484a0: 83 eb 04 sub $0x4,%ebx 80484a3: 39 d9 cmp %ebx,%ecx 80484a5: 74 0f je 80484b6 <reverse+0x46> 80484a7: 8b 03 mov (%ebx),%eax 80484a9: 8b 11 mov (%ecx),%edx 80484ab: 89 01 mov %eax,(%ecx) 80484ad: 83 c1 04 add $0x4,%ecx 80484b0: 39 d9 cmp %ebx,%ecx 80484b2: 89 13 mov %edx,(%ebx) 80484b4: 75 ea jne 80484a0 <reverse+0x30> 80484b6: 5b pop %ebx 80484b7: 5e pop %esi 80484b8: 5d pop %ebp 80484b9: c3 ret 80484ba: 8d b6 00 00 00 00 lea 0x0(%esi),%esi [...]
Sorry, but there's one last thing. Here's the output I get with -O1. 1 2 3 4 5 6 7 8 8 7 6 5 4 3 2 1 and with -O2: 1 2 3 4 5 6 7 8 8 7 6 4 5 3 2 1
Testcase that fails at -O1 (the key is that reverse needs to be inlined): extern void abort (void); static void __attribute__((always_inline)) reverse(int *first, int *last) { if (first == last--) return; while (first != last) { int t = *first; *first = *last; *last = t; if (++first == last--) break; } } int main() { int seq[] = { 1, 2, 3, 4, 5, 6, 7, 8 }; reverse(seq, seq + 8); if (seq[3] != 5 || seq[4] != 4) abort (); return 0; } On trunk we optimize it all to obviously wrong ... t_17 = seq[2]; D.1989_19 = seq[5]; seq[2] = D.1989_19; seq[5] = t_17; first_20 = &seq[3]; last_21 = &seq[4]; ivtmp.9_36 = 0; D.1253_2 = seq[3]; if (D.1253_2 != 5) thus we iterate one time less as necessary.
It is number of iteration analysis that gets it wrong (I suppose it might get confused by the two exits of the loop?). Disabling loop header copying makes the code easier to analyze.
(In reply to comment #5) > It is number of iteration analysis that gets it wrong (I suppose it might get > confused by the two exits of the loop?). Sort of; # of iterations analysis assumes that pointers never wrap, and uses this assumption to derive a wrong number of iterations for the first exit (which is not taken). We had a similar problem before (PR 25985), but I somehow persuaded myself that this cannot happen with pointers.
Subject: Re: [4.3/4.4/4.5 Regression] Number of iterations analysis wrong On Fri, 15 May 2009, rakdver at gcc dot gnu dot org wrote: > ------- Comment #6 from rakdver at gcc dot gnu dot org 2009-05-15 00:34 ------- > (In reply to comment #5) > > It is number of iteration analysis that gets it wrong (I suppose it might get > > confused by the two exits of the loop?). > > Sort of; # of iterations analysis assumes that pointers never wrap, and uses > this assumption to derive a wrong number of iterations for the first exit > (which is not taken). We had a similar problem before (PR 25985), but I > somehow persuaded myself that this cannot happen with pointers. Ah - it indeed cannot happen, but you need to assume that the offsets in POINTER_PLUS_EXPRs are signed (even though they are unsigned as they are of type sizetype). At least that should be the only "overflow" present in this testcase, no? Richard.
Subject: Re: [4.3/4.4/4.5 Regression] Number of iterations analysis wrong > > > It is number of iteration analysis that gets it wrong (I suppose it might get > > > confused by the two exits of the loop?). > > > > Sort of; # of iterations analysis assumes that pointers never wrap, and uses > > this assumption to derive a wrong number of iterations for the first exit > > (which is not taken). We had a similar problem before (PR 25985), but I > > somehow persuaded myself that this cannot happen with pointers. > > Ah - it indeed cannot happen, but you need to assume that the offsets > in POINTER_PLUS_EXPRs are signed (even though they are unsigned as > they are of type sizetype). that is not the problem. The problem is that # of iterations analysis has a look at the first test, of form [x,+,4] == [x + 28, +, -4], observes that the variables do not wrap, and concludes that the loop is not infinite. So far OK, but next we infer that this means that the exit is taken, i.e., that 28 % 8 = 0, and the things go down for there. I fixed this error before, but special-cased the pointer ivs in a strike of wishful thinking.
Subject: Bug 40087 Author: rakdver Date: Wed May 20 00:33:54 2009 New Revision: 147727 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147727 Log: PR tree-optimization/40087 * tree-ssa-loop-niter.c (number_of_iterations_ne_max, number_of_iterations_ne): Rename never_infinite argument. (number_of_iterations_lt_to_ne, number_of_iterations_lt, number_of_iterations_le): Handle pointer-type ivs when exit_must_be_taken is false. (number_of_iterations_cond): Do not always assume that exit_must_be_taken if the control variable is a pointer. * gcc.dg/tree-ssa/pr40087.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr40087.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-loop-niter.c
Fixed on trunk sofar.
Let's backport this before 4.4.1 goes out.
Subject: Bug 40087 Author: rakdver Date: Fri May 22 20:43:39 2009 New Revision: 147806 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147806 Log: PR tree-optimization/40087 * tree-ssa-loop-niter.c (number_of_iterations_ne_max, number_of_iterations_ne): Rename never_infinite argument. (number_of_iterations_lt_to_ne, number_of_iterations_lt, number_of_iterations_le): Handle pointer-type ivs when exit_must_be_taken is false. (number_of_iterations_cond): Do not always assume that exit_must_be_taken if the control variable is a pointer. * gcc.dg/tree-ssa/pr40087.c: New test. Added: branches/gcc-4_4-branch/gcc/testsuite/gcc.dg/tree-ssa/pr40087.c Modified: branches/gcc-4_4-branch/gcc/ChangeLog branches/gcc-4_4-branch/gcc/testsuite/ChangeLog branches/gcc-4_4-branch/gcc/tree-ssa-loop-niter.c
And for 4.4.1.
ping 4.3.4...
(In reply to comment #14) > ping 4.3.4... The PR40087 fix depends on changes from the PR39455 fix. Both of them are 4.3 regressions, and I've used both fixes in my 4.3 tree for a while now without issues.
Back to P2. Testing a backport.
Fixed.
Subject: Bug 40087 Author: rguenth Date: Wed Jun 17 19:45:52 2009 New Revision: 148625 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=148625 Log: 2009-06-17 Richard Guenther <rguenther@suse.de> Backport from mainline 2009-03-16 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/39455 * tree-ssa-loop-niter.c (number_of_iterations_lt_to_ne): Fix types mismatches for POINTER_TYPE_P (type). (number_of_iterations_le): Likewise. * gcc.dg/pr39455.c: New test. 2009-05-19 Zdenek Dvorak <ook@ucw.cz> PR tree-optimization/40087 * tree-ssa-loop-niter.c (number_of_iterations_ne_max, number_of_iterations_ne): Rename never_infinite argument. (number_of_iterations_lt_to_ne, number_of_iterations_lt, number_of_iterations_le): Handle pointer-type ivs when exit_must_be_taken is false. (number_of_iterations_cond): Do not always assume that exit_must_be_taken if the control variable is a pointer. * gcc.dg/tree-ssa/pr40087.c: New test. Added: branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr39455.c branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/tree-ssa/pr40087.c Modified: branches/gcc-4_3-branch/gcc/ChangeLog branches/gcc-4_3-branch/gcc/testsuite/ChangeLog branches/gcc-4_3-branch/gcc/tree-ssa-loop-niter.c
Fi-xed!