This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/49955] Fails to do partial basic-block SLP
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 3 Aug 2011 15:17:12 +0000
- Subject: [Bug tree-optimization/49955] Fails to do partial basic-block SLP
- Auto-submitted: auto-generated
- References: <bug-49955-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.08.03 15:12:42
     Ever Confirmed|0                           |1
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-03 15:12:42 UTC ---
The loop that remains in 410.bwaves after fixing PR49957 is the one at the
end of this comment. Loop SLP does not handle it (well, I'm not exactly sure
why); the vectorizer dump says:
t.f:18: note: ==> examining statement: t1_62 = *q_61(D)[D.1645_60];
t.f:18: note: num. args = 4 (not unary/binary/ternary op).
t.f:18: note: vect_is_simple_use: operand *q_61(D)[D.1645_60]
t.f:18: note: not ssa-name.
t.f:18: note: use not simple.
t.f:18: note: no array mode for V2DF[5]
t.f:18: note: the size of the group of strided accesses is not a power of 2
t.f:18: note: not vectorized: relevant stmt not supported: t1_62 = *q_61(D)[D.1645_60];
t.f:18: note: bad operation or unsupported loop bound.
t.f:1: note: vectorized 0 loops in function.
Probably the issue is that we can't handle this kind of "invariant" in the
SLP group? That is, the SLP group should be q(2,...), q(3,...), ..., q(5,...),
which has size 4, and q(1,...) should be treated as invariant.
      subroutine shell(nx,ny,nz,q,dt,cfl,dx,dy,dz)
      implicit none
      integer nx,ny,nz,n,i,j,k
      real*8 cfl,dx,dy,dz,dt
      real*8 gm,Re,Pr,cfll,t1,t2,t3,t4,t5,t6,t7,t8,mu
      real*8 q(5,nx,ny,nz)
C This particular problem is periodic only
      cfll=0.1d0+(n-1.0d0)*cfl/20.0d0
      if (cfll.ge.cfl) cfll=cfl
      t8=0.0d0
      do k=1,nz
         do j=1,ny
            do i=1,nx
               t1=q(1,i,j,k)
               t2=q(2,i,j,k)/t1
               t3=q(3,i,j,k)/t1
               t4=q(4,i,j,k)/t1
               t5=(gm-1.0d0)*(q(5,i,j,k)-0.5d0*t1*(t2*t2+t3*t3+t4*t4))
               t6=dSQRT(gm*t5/t1)
               mu=gm*Pr*(gm*t5/t1)**0.75d0*2.0d0/Re/t1
               t7=((dabs(t2)+t6)/dx+mu/dx**2)**2 +
     1            ((dabs(t3)+t6)/dy+mu/dy**2)**2 +
     2            ((dabs(t4)+t6)/dz+mu/dz**2)**2
               t7=DSQRT(t7)
               t8=max(t8,t7)
            enddo
         enddo
      enddo
      dt=cfll / t8
      return
      end
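The grouping suggested above (q(2,...) through q(5,...) loaded as one group
of four, with q(1,...) kept as a scalar shared by the group) can be sketched
by hand in C. This is only an illustration of the access pattern, not what
the vectorizer emits: the flat layout q[5*i + 0..4], the function names
energy_scalar and energy_grouped, and the reduced t5 expression (no gm, no
sqrt) are all assumptions made for the example.

```c
#include <stddef.h>

/* Assumed flat layout: cell i holds q(1..5,i) at q[5*i + 0..4]. */
enum { GROUP = 5 };

/* Scalar form, mirroring the t1..t5 statements of the test case
   (simplified: the gm factor is dropped). */
double energy_scalar(const double *q, size_t ncells)
{
    double best = 0.0;
    for (size_t i = 0; i < ncells; i++) {
        const double *c = q + GROUP * i;
        double t1 = c[0];
        double t2 = c[1] / t1;
        double t3 = c[2] / t1;
        double t4 = c[3] / t1;
        double t5 = c[4] - 0.5 * t1 * (t2 * t2 + t3 * t3 + t4 * t4);
        if (t5 > best)
            best = t5;
    }
    return best;
}

/* Hand-grouped form: c[0] is loaded once as a scalar that is uniform
   across the group, and c[1..4] are loaded as one contiguous group of
   four doubles (which would be two V2DF vectors on a 128-bit target). */
double energy_grouped(const double *q, size_t ncells)
{
    double best = 0.0;
    for (size_t i = 0; i < ncells; i++) {
        const double *c = q + GROUP * i;
        double t1 = c[0];               /* the "invariant" operand */
        double lane[4];
        for (int l = 0; l < 4; l++)     /* group load of size 4 */
            lane[l] = c[1 + l];
        double sumsq = 0.0;
        for (int l = 0; l < 3; l++) {   /* lanes 0..2 are divided by t1 */
            double v = lane[l] / t1;
            sumsq += v * v;
        }
        double t5 = lane[3] - 0.5 * t1 * sumsq;
        if (t5 > best)
            best = t5;
    }
    return best;
}
```

Both functions read the same five doubles per cell; the second only makes
explicit that the stride-5 group splits into one scalar plus a group whose
size is a power of 2, which is the shape the dump above complains it does
not have.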