[PATCH] New early loop unrolling pass

Dominique Dhumieres dominiq@lps.ens.fr
Thu May 1 13:29:00 GMT 2008


> Of course, the patch didn't change.  I don't consider this a show-stopper
> as it obviously just exposes bugs in the vectorizer or its cost model.
> There is plenty of time to address this during stage1/2 or ignore it.
> 
> It would be helpful if you could provide a reduced runtime testcase with
> just one loop that shows this regression.

I am not sure the problem is a bug in the vectorizer; rather, the
early unrolling seems too aggressive and prevents the vectorization of
the unrolled loop, as shown by the following reduced test:

integer, parameter :: n = 1000000
integer  :: i, j, k
real(8)  :: pi, sum1, sum2, theta, phi, sini, cosi, dotp
real(8)  :: a(3), b(9,3), c(3)
pi = acos(-1.0d0)
theta = pi/9.0d0
phi = pi/4.5d0
do k = 1, 9
   b(k,1) = 0.5d0*cos(k*phi)*sin(k*theta)
   b(k,2) = 0.5d0*sin(k*phi)*sin(k*theta)
   b(k,3) = 0.5d0*cos(k*theta)
end do
theta = pi/real(n,kind=8)
sum2 = 0.0
do i = 1, n
    sini = sin(i*theta)
    cosi = cos(i*theta)
    phi = pi/4.5d0
    sum1 = 0.0d0
    do j = 1, 9
	c(1) = 0.5d0*cos(j*phi)*sini
	c(2) = 0.5d0*sin(j*phi)*sini
	c(3) = 0.5d0*cosi
	do k =1, 9
!           a(1) = b(k,1) - c(1)
!           a(2) = b(k,2) - c(2)
!           a(3) = b(k,3) - c(3)
	   a = b(k,:) - c
	   dotp = a(1)*a(1) + a(2)*a(2) + a(3)*a(3)
!           dotp = dot_product(a,a)
	   sum1 = sum1 +dotp
	end do
    end do
    sum2 = sum2 + sum1/81.0d0
end do
print *, 3.0d0*sum2/(4.0d0*pi*real(n,kind=8))
end

[ibook-dhum] bug/timing% gfc -O3 -ffast-math -funroll-loops -ftree-loop-linear -ftree-vectorizer-verbose=2 test_vect.f90
test_vect.f90:24: note: LOOP VECTORIZED.
test_vect.f90:8: note: not vectorized: unsupported data-type complex(kind=8)
test_vect.f90:1: note: vectorized 1 loops in function.

The 'k' loop is vectorized if one implicit loop is left inside (either
"a = b(k,:) - c" or "dotp = dot_product(a,a)"), but when these two implicit
loops are unrolled by hand, it seems that the 'k' loop is now unrolled,
preventing any vectorization:

     do k =1, 9
	a(1) = b(k,1) - c(1)
	a(2) = b(k,2) - c(2)
	a(3) = b(k,3) - c(3)
!         a = b(k,:) - c
	dotp = a(1)*a(1) + a(2)*a(2) + a(3)*a(3)
!         dotp = dot_product(a,a)
	sum1 = sum1 +dotp
     end do

[ibook-dhum] bug/timing% gfc -O3 -ffast-math -funroll-loops -ftree-loop-linear -ftree-vectorizer-verbose=2 test_vect.f90
test_vect.f90:20: note: not vectorized: unsupported data-type complex(kind=8)
test_vect.f90:8: note: not vectorized: unsupported data-type complex(kind=8)
test_vect.f90:1: note: vectorized 0 loops in function.

Am I correct in understanding that the vectorizer operates on loops only?
If so, vectorizable loops should probably not be unrolled (at least not
without care).

Last question: should I continue to use PR34265, or close it and open a
new PR?

Cheers

Dominique


