This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
Re: Vectorizing HIRLAM 4: complicated access patterns examined.
- From: Toon Moene <toon at moene dot indiv dot nluug dot nl>
- To: DORIT at il dot ibm dot com, toon at moene dot indiv dot nluug dot nl
- Cc: fortran at gcc dot gnu dot org, gcc at gcc dot gnu dot org, IRAR at il dot ibm dot com, pop at icps dot u-strasbg dot fr
- Date: Thu, 27 Oct 2005 20:24:49 +0200
- Subject: Re: Vectorizing HIRLAM 4: complicated access patterns examined.
Dorit wrote:
It looks like maybe a 64bit scalar-evolution issue - when I compile on
powerpc-linux with -m64, I also get the
"vect4.f:4: note: not consecutive access"
message.
This problem looks very similar to PR18403, which was resolved a while ago:
When compiling for 32bit, we get the following representation for the loop:
# i_2 = PHI <i_25(11), i_41(14)>;
<L12>:;
D.505_38 = i_2 + -1;
D.506_39 = (*b_14)[D.505_38];
(*a_9)[D.505_38] = D.506_39;
i_41 = i_2 + 1;
if (i_2 == D.489_27) goto <L26>; else goto <L27>;
When compiling for 64bit, there is an extra cast:
# i_2 = PHI <i_27(11), i_45(14)>;
<L12>:;
D.691_41 = (int8) i_2;
D.692_42 = D.691_41 + -1;
D.693_43 = (*b_16)[D.692_42];
(*a_10)[D.692_42] = D.693_43;
i_45 = i_2 + 1;
if (i_2 == D.674_29) goto <L26>; else goto <L27>;
Shouldn't the cast be hoisted out of the loop? The cast of a
loop-invariant variable (i_2) is itself loop-invariant.
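
The effect being asked for can be pictured at the source level. Instead of converting the 32-bit counter on every iteration, the index is carried as a 64-bit variable and only the loop bounds are converted, once, before the loop. The following is a minimal sketch, not code from this thread: the names COPY4 and COPY8 are hypothetical, and it assumes gfortran's default 4-byte INTEGER. It only shows that the two forms compute the same thing; it does not establish how the vectorizer treats either of them.

```fortran
C     Hypothetical sketch: the same copy loop written twice.
C     COPY4 uses a default INTEGER index (assumed 4 bytes), which on
C     a 64-bit target must be widened before it addresses the arrays.
      SUBROUTINE COPY4(A, B, N, ISTART, ISTOP)
      INTEGER N, ISTART, ISTOP, I
      REAL A(N), B(N)
      DO I = ISTART, ISTOP
        A(I) = B(I)
      ENDDO
      END

C     COPY8 carries the index as INTEGER(8); the bounds are widened
C     once in the DO statement, so the loop body needs no conversion.
      SUBROUTINE COPY8(A, B, N, ISTART, ISTOP)
      INTEGER N, ISTART, ISTOP
      INTEGER(8) I8
      REAL A(N), B(N)
      DO I8 = INT(ISTART, 8), INT(ISTOP, 8)
        A(I8) = B(I8)
      ENDDO
      END
```

Both routines assume 1 <= ISTART and ISTOP <= N, and that the index range does not overflow a 4-byte INTEGER; under those assumptions they store the same elements of A.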
Anyway, while we're waiting for Daniel to complete the cvs-svn
transition, we can have some more fun with vectors:
      SUBROUTINE S(N)
      DIMENSION A(N), B(N)
      READ*,ISTART,ISTOP,B
      DO I = ISTART, ISTOP
        A(I) = B(I)
      ENDDO
      PRINT*,A
      END
+ /usr/snp/bin/gfortran -g -S -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 vect4.f
vect4.f:4: note: not vectorized: complicated access pattern.
vect4.f:4: note: vectorized 0 loops in function.
+ /usr/snp/bin/gfortran -g -S -O3 -m32 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 vect4.f
vect4.f:4: note: LOOP VECTORIZED.
vect4.f:4: note: vectorized 1 loops in function.
That matches what you saw on powerpc64: vectorised with -m32, but not in 64-bit mode.
      SUBROUTINE S(N)
      DIMENSION A(N), B(12)
      COMMON /COM/ B
      DO I = 1, 12
        A(I) = B(I)
      ENDDO
      PRINT*,A(1:12)
      END
+ /usr/snp/bin/gfortran -g -S -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 vect5.f
vect5.f:4: note: LOOP VECTORIZED.
vect5.f:4: note: vectorized 1 loops in function.
+ /usr/snp/bin/gfortran -g -S -O3 -m32 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 vect5.f
vect5.f:4: note: LOOP VECTORIZED.
vect5.f:4: note: vectorized 1 loops in function.
Hmmm, this one is also vectorised in 64-bit mode, so obviously it doesn't trigger the same problem.
      SUBROUTINE S(N)
      INTEGER N
      COMMON /COM/ A(100)
      REAL A
      REAL B(N), C(N), D(N)
      DO I = 1, N
        B(I) = D(I)
      ENDDO
      DO I = 1, N
        A(I) = B(I)
      ENDDO
      CALL S1(C(1))
      END
+ /usr/snp/bin/gfortran -g -S -O3 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 -ffixed-form vecta.f
vecta.f:6: note: LOOP VECTORIZED.
vecta.f:9: note: not vectorized: can't determine dependence between (*b_16)[D.928_30] and com.a[D.928_30]
vecta.f:9: note: vectorized 1 loops in function.
+ /usr/snp/bin/gfortran -g -S -O3 -m32 -ftree-vectorize -ftree-vectorizer-verbose=2 -msse2 -ffixed-form vecta.f
vecta.f:6: note: LOOP VECTORIZED.
vecta.f:9: note: LOOP VECTORIZED.
vecta.f:9: note: vectorized 2 loops in function.
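But this one gets both loops vectorised only in 32-bit mode; in 64-bit mode the dependence between B and the COMMON array A in the second loop can't be determined.
I'm now going to try to compile HIRLAM with 32-bit pointers.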
Kind regards,
--
Toon Moene - e-mail: toon@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/