On Linux/ia64, revision 144413 gave

FAIL: gcc.dg/tree-ssa/loop-31.c scan-tree-dump-times optimized " \+ 2" 1

Revision 144404 is OK. Revision 144405:

http://gcc.gnu.org/ml/gcc-cvs/2009-02/msg00572.html

may be the cause.
Sooo - what is the content of the .optimized dump?
(In reply to comment #1)
> Sooo - what is the content of the .optimized dump?
>

;; Function foo (foo)

Analyzing Edge Insertions.
foo (int len, int v)
{
  long unsigned int D.1255;
  long unsigned int ivtmp.14;

<bb 2>:
  if (len > 0)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 3>:
  ivtmp.14 = (long unsigned int) &a[0];
  D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int) len + 4294967295) * 2;

<bb 4>:
  MEM[index: ivtmp.14] = (short int) (short int) v;
  ivtmp.14 = ivtmp.14 + 2;
  if (ivtmp.14 != D.1255)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 5>:
  return a[0];

}
Note index: again, booooo. I thought I got rid of all those.
Subject: Re:  [4.4 Regression] gcc.dg/tree-ssa/loop-31.c

On Wed, 25 Feb 2009, hjl dot tools at gmail dot com wrote:

> ------- Comment #2 from hjl dot tools at gmail dot com  2009-02-25 14:37 -------
> (In reply to comment #1)
> > Sooo - what is the content of the .optimized dump?
> >
>
> ;; Function foo (foo)
>
> Analyzing Edge Insertions.
> foo (int len, int v)
> {
>   long unsigned int D.1255;
>   long unsigned int ivtmp.14;
>
> <bb 2>:
>   if (len > 0)
>     goto <bb 3>;
>   else
>     goto <bb 5>;
>
> <bb 3>:
>   ivtmp.14 = (long unsigned int) &a[0];
>   D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int) len + 4294967295) * 2;

This likely used to be folded to &a[len], but the addressing-mode is
still what it is supposed to be. Can you attach the dump with the
patch reverted as well?

> <bb 4>:
>   MEM[index: ivtmp.14] = (short int) (short int) v;
>   ivtmp.14 = ivtmp.14 + 2;
>   if (ivtmp.14 != D.1255)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
> <bb 5>:
>   return a[0];
>
> }
Revision 144404 gave:

;; Function foo (foo)

Analyzing Edge Insertions.
foo (int len, int v)
{
  short int * D.1254;
  short int * ivtmp.14;

<bb 2>:
  if (len > 0)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 3>:
  D.1254 = &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1) * 2;
  ivtmp.14 = &a[0];

<bb 4>:
  MEM[base: ivtmp.14] = (short int) (short int) v;
  ivtmp.14 = ivtmp.14 + 2;
  if (ivtmp.14 != D.1254)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 5>:
  return a[0];

}
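For reference, a rough sketch of why the scan probably trips on the new dump. It assumes that scan-tree-dump-times simply counts matches of the pattern in the dump, and it models the regex " \+ 2" as the literal substring " + 2". The strings are the relevant statements from the two dumps above; count_literal and check_scan_counts are hypothetical helper names, not part of the testsuite.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Relevant statements from the revision 144404 .optimized dump. */
const char *const dump_r144404 =
    "D.1254 = &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1) * 2;\n"
    "ivtmp.14 = &a[0];\n"
    "MEM[base: ivtmp.14] = (short int) (short int) v;\n"
    "ivtmp.14 = ivtmp.14 + 2;\n";

/* Relevant statements from the revision 144405 .optimized dump. */
const char *const dump_r144405 =
    "ivtmp.14 = (long unsigned int) &a[0];\n"
    "D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int) len + 4294967295) * 2;\n"
    "MEM[index: ivtmp.14] = (short int) (short int) v;\n"
    "ivtmp.14 = ivtmp.14 + 2;\n";

/* Hypothetical helper: count non-overlapping occurrences of a literal. */
size_t count_literal(const char *text, const char *needle)
{
    size_t n = 0;
    size_t len = strlen(needle);
    if (len == 0)
        return 0;
    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle))
        n++;
    return n;
}

/* The test expects exactly one match; the new dump has an extra "&a + 2". */
void check_scan_counts(void)
{
    assert(count_literal(dump_r144404, " + 2") == 1);
    assert(count_literal(dump_r144405, " + 2") == 2);
}
```

Under these assumptions the induction variable still steps by 2 in both dumps; the second match comes only from the way the loop bound is written.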
Subject: Re:  [4.4 Regression] gcc.dg/tree-ssa/loop-31.c

On Wed, 25 Feb 2009, hjl dot tools at gmail dot com wrote:

> ------- Comment #5 from hjl dot tools at gmail dot com  2009-02-25 14:53 -------
> Revision 144404 gave:

Is the assembly different?
Revision 144405 gave:

        .text
        .align 16
        .global foo#
        .type   foo#, @function
        .proc foo#
foo:
        .prologue
        .body
        cmp4.ge p6, p7 = 0, r32
        (p6) br.cond.spnt .L2
        addl r14 = @ltoffx(a#), r1
        ;;
        ld8.mov r14 = [r14], a#
        adds r16 = -1, r32
        ;;
        addp4 r16 = r16, r0
        addl r15 = @gprel(.LC0), gp
        ;;
        ld8 r15 = [r15]
        ;;
        shladd r15 = r16, 1, r15
.L3:
        st2 [r14] = r33, 2
        ;;
        cmp.ne p6, p7 = r15, r14
        (p6) br.cond.sptk .L3
.L2:
        addl r14 = @ltoffx(a#), r1
        ;;
        ld8.mov r14 = [r14], a#
        ;;
        ld2 r8 = [r14]
        br.ret.sptk.many b0
        ;;
        .endp foo#

Revision 144404 gave:

        .proc foo#
foo:
        .prologue
        .save ar.lc, r2
        mov r2 = ar.lc
        .body
        cmp4.ge p6, p7 = 0, r32
        (p6) br.cond.spnt .L2
        adds r15 = -1, r32
        ;;
        addp4 r15 = r15, r0
        ;;
        adds r15 = 1, r15
        addl r14 = @ltoffx(a#), r1
        ;;
        ld8.mov r14 = [r14], a#
        ;;
        shladd r15 = r15, 1, r14
        ;;
        sub r15 = r15, r14
        ;;
        adds r15 = -2, r15
        ;;
        shr.u r15 = r15, 1
        ;;
        mov ar.lc = r15
.L3:
        st2 [r14] = r33, 2
        ;;
        br.cloop.sptk.few .L3
.L2:
        addl r14 = @ltoffx(a#), r1
        ;;
        ld8.mov r14 = [r14], a#
        ;;
        ld2 r8 = [r14]
        mov ar.lc = r2
        br.ret.sptk.many b0
        ;;
        .endp foo#
gcc.dg/tree-ssa/loop-31.c comes from PR 32283. This testcase only runs on ARM and ia64.
A patch is posted at http://gcc.gnu.org/ml/gcc-patches/2009-02/msg01185.html
The difference between

> D.1254 = &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1)
> * 2;

(original) and

> D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int)
> len + 4294967295) * 2;

(current) is mostly cosmetical; the test in the testcase should be made more
robust, but other than that, there is no regression here.
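A small sketch of that equivalence, for anyone who wants to check it. It assumes long unsigned int is 64 bits and unsigned int is 32 bits, as on Linux/ia64, and models them with uint64_t and uint32_t; the base address is passed in as a plain integer. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bound as written in the revision 144404 dump:
   &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1) * 2 */
uint64_t bound_r144404(uint64_t base, int len)
{
    /* 32-bit wrap-around add, then zero-extension to 64 bits. */
    uint64_t n = (uint64_t)((uint32_t)len + UINT32_C(4294967295));
    return base + (n + 1) * 2;
}

/* Bound as written in the revision 144405 dump:
   ((long unsigned int) &a + 2)
     + (long unsigned int) ((unsigned int) len + 4294967295) * 2 */
uint64_t bound_r144405(uint64_t base, int len)
{
    uint64_t n = (uint64_t)((uint32_t)len + UINT32_C(4294967295));
    return (base + 2) + n * 2;
}

/* (n + 1) * 2 and 2 + n * 2 agree in 64-bit modular arithmetic. */
void check_bounds_agree(uint64_t base, int len)
{
    assert(bound_r144404(base, len) == bound_r144405(base, len));
}
```

For len > 0 both forms reduce to base + 2 * len, i.e. the address one past the last element stored; only the placement of the constant differs.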
(In reply to comment #10)
> The difference between
>
> > D.1254 = &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1)
> > * 2;
>
> (original) and
>
> > D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int)
> > len + 4294967295) * 2;
>
> (current) is mostly cosmetical; the test in the testcase should be made more
> robust, but other than that, there is no regression here.
>

The new loop is

.L3:
        st2 [r14] = r33, 2
        ;;
        cmp.ne p6, p7 = r15, r14
        (p6) br.cond.sptk .L3

The old loop is

        mov ar.lc = r15
.L3:
        st2 [r14] = r33, 2
        ;;
        br.cloop.sptk.few .L3

They are quite different.
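To make the two loop shapes concrete, here is a rough C rendering. It is only a sketch: the function names are invented, the store counter exists purely so the behaviour can be checked, and the trip-count computation mirrors the sub / adds -2 / shr.u sequence from the revision 144404 assembly. Both functions assume len > 0, which the cmp4.ge guard ensures in the real code.

```c
#include <assert.h>
#include <stddef.h>

/* Revision 144405 shape: store, then compare the pointer with the bound. */
size_t fill_compare(short *p, int len, short v)
{
    short *end = p + len;          /* precomputed loop bound (r15) */
    size_t stores = 0;
    do {
        *p++ = v;                  /* st2 [r14] = r33, 2 */
        stores++;
    } while (p != end);            /* cmp.ne + (p6) br.cond */
    return stores;
}

/* Revision 144404 shape: trip count held in a counter, like ar.lc. */
size_t fill_counted(short *p, int len, short v)
{
    /* Byte distance to the bound, minus 2, divided by 2: len - 1. */
    unsigned long lc = ((unsigned long)len * 2 - 2) >> 1;
    size_t stores = 0;
    for (;;) {
        *p++ = v;                  /* st2 [r14] = r33, 2 */
        stores++;
        if (lc == 0)               /* br.cloop falls through at ar.lc == 0 */
            break;
        lc--;                      /* otherwise decrement and branch back */
    }
    return stores;
}
```

Both versions perform len stores; the counted form drops the per-iteration compare but needs the extra setup before the loop.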
Subject: Re:  [4.4 Regression] gcc.dg/tree-ssa/loop-31.c

> ------- Comment #11 from hjl dot tools at gmail dot com  2009-02-25 19:18 -------
> (In reply to comment #10)
> > The difference between
> >
> > > D.1254 = &a[0] + ((long unsigned int) ((unsigned int) len + 4294967295) + 1)
> > > * 2;
> >
> > (original) and
> >
> > > D.1255 = ((long unsigned int) &a + 2) + (long unsigned int) ((unsigned int)
> > > len + 4294967295) * 2;
> >
> > (current) is mostly cosmetical; the test in the testcase should be made more
> > robust, but other than that, there is no regression here.
> >
>
> The new loop is
>
> .L3:
>         st2 [r14] = r33, 2
>         ;;
>         cmp.ne p6, p7 = r15, r14
>         (p6) br.cond.sptk .L3
>
> The old loop is
>
>         mov ar.lc = r15
> .L3:
>         st2 [r14] = r33, 2
>         ;;
>         br.cloop.sptk.few .L3
>
> They are quite different.

Nevertheless, the loop-31.c test is not supposed to test this; it just checks
whether strength reduction is performed correctly. I think we already have
another PR regarding the problem that strength reduction and iv elimination
at the tree level may cause the rtl-level number-of-iterations analysis to
fail (which leads to the code difference).
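As an illustration of the strength reduction in question, a sketch in C. The source of loop-31.c is not quoted in this thread, so foo_indexed is only a guess reconstructed from the dumps, and the array size is arbitrary; foo_reduced shows the pointer-stepping form that the dumps correspond to.

```c
#include <assert.h>

short a[2048];   /* size is an assumption; the thread does not show it */

/* Guessed source form: each store address is a + i * sizeof (short). */
int foo_indexed(int len, int v)
{
    for (int i = 0; i < len; i++)
        a[i] = (short)v;
    return a[0];
}

/* Strength-reduced form: the per-iteration multiply is replaced by a
   pointer that advances one element (2 bytes, the "+ 2" in the dump). */
int foo_reduced(int len, int v)
{
    if (len > 0) {
        short *iv = &a[0];
        short *end = &a[0] + len;
        do {
            *iv = (short)v;
            iv = iv + 1;
        } while (iv != end);
    }
    return a[0];
}
```

Both dumps above already have this pointer-stepping shape; they differ only in how the bound is spelled and in the MEM[base: ...] versus MEM[index: ...] form.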
Thus this is a testsuite issue.
Subject: Bug 39297

Author: sje
Date: Tue Jun 23 18:28:26 2009
New Revision: 148862

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=148862
Log:
2009-06-23  Steve Ellcey  <sje@cup.hp.com>

        PR testsuite/39297
        * gcc.dg/ssa/loop-31.c: Change scan rules.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/tree-ssa/loop-31.c
The test no longer fails, thanks to the checkin that fixes the scan for IA64. See http://gcc.gnu.org/ml/gcc-patches/2009-06/msg01600.html