[Bug tree-optimization/49955] Fails to do partial basic-block SLP

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Aug 3 15:17:00 GMT 2011


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49955

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.08.03 15:12:42
     Ever Confirmed|0                           |1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-03 15:12:42 UTC ---
The loop that remains after fixing PR49957 in 410.bwaves is the following,
which loop SLP does not handle (well, I'm not exactly sure) because

t.f:18: note: ==> examining statement: t1_62 = *q_61(D)[D.1645_60];

t.f:18: note: num. args = 4 (not unary/binary/ternary op).
t.f:18: note: vect_is_simple_use: operand *q_61(D)[D.1645_60]
t.f:18: note: not ssa-name.
t.f:18: note: use not simple.
t.f:18: note: no array mode for V2DF[5]
t.f:18: note: the size of the group of strided accesses is not a power of 2
t.f:18: note: not vectorized: relevant stmt not supported: t1_62 =
*q_61(D)[D.1645_60];

t.f:18: note: bad operation or unsupported loop bound.
t.f:1: note: vectorized 0 loops in function.

probably the issue that we can't handle this kind of "invariants" in the
SLP group?  Thus, the SLP group should be q(2,..), q(3,...) ... q(5, ...)
which is size 4, q(1,..) should be treated as invariant. 


      subroutine shell(nx,ny,nz,q,dt,cfl,dx,dy,dz)
      implicit none
      integer nx,ny,nz,n,i,j,k
      real*8 cfl,dx,dy,dz,dt
      real*8 gm,Re,Pr,cfll,t1,t2,t3,t4,t5,t6,t7,t8,mu

      real*8 q(5,nx,ny,nz)

C       This particular problem is periodic only


      cfll=0.1d0+(n-1.0d0)*cfl/20.0d0
      if (cfll.ge.cfl) cfll=cfl
      t8=0.0d0

      do k=1,nz
         do j=1,ny
            do i=1,nx
               t1=q(1,i,j,k)
               t2=q(2,i,j,k)/t1
               t3=q(3,i,j,k)/t1
               t4=q(4,i,j,k)/t1
               t5=(gm-1.0d0)*(q(5,i,j,k)-0.5d0*t1*(t2*t2+t3*t3+t4*t4))
               t6=dSQRT(gm*t5/t1)
               mu=gm*Pr*(gm*t5/t1)**0.75d0*2.0d0/Re/t1
               t7=((dabs(t2)+t6)/dx+mu/dx**2)**2 +
     1            ((dabs(t3)+t6)/dy+mu/dy**2)**2 +
     2            ((dabs(t4)+t6)/dz+mu/dz**2)**2
               t7=DSQRT(t7)
               t8=max(t8,t7)
            enddo
         enddo
      enddo
      dt=cfll / t8

      return
      end



More information about the Gcc-bugs mailing list