This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug fortran/42118] Slow forall
- From: "ebay.20.tedlap at spamgourmet dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 08 Oct 2013 13:12:03 +0000
- Subject: [Bug fortran/42118] Slow forall
- Auto-submitted: auto-generated
- References: <bug-42118-4 at http dot gcc dot gnu dot org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42118
Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ebay.20.tedlap@spamgourmet.com
--- Comment #6 from Lionel GUEZ <ebay.20.tedlap at spamgourmet dot com> ---
There is also the problem of the order of indices in a forall. I guess this is
closely related to the comparison of do and forall. Consider the following
test program:
program test_forall
  implicit none
  integer, parameter:: n = 1000
  integer i, j, k
  double precision S(n, n, n)
  forall (i = 1: n, j = 1: n, k = 1: n) S(i, j, k) = i * j * k
  print *, "ijk, sum(s) = ", sum(s)
end program test_forall
According to the Fortran standard, the order of indices in the forall header is
of no consequence. So, in the above program, we should be able to write
equivalently:
forall (k = 1: n, j = 1: n, i = 1: n) S(i, j, k) = i * j * k
There is no way for the writer of the program to predict which of the two
versions should be faster. It is interesting to note that, with gfortran, the
forall with kji is much slower, while the opposite is true with the NAG
compiler (version 5.3). I think the two versions should have the same run
time. I have actually tested the two versions of the program with four
compilers:
-- gfortran 4.4.6 with -O3
kji, sum(s) = 1.253753751250046E+017
real 1m32.511s
user 1m22.342s
sys 0m8.368s
ijk, sum(s) = 1.253753751250046E+017
real 0m12.962s
user 0m7.416s
sys 0m5.427s
-- nagfor 5.3 with -O4
kji, sum(s) = 1.2537537512500458E+17
real 0m13.396s
user 0m6.833s
sys 0m6.054s
ijk, sum(s) = 1.2537537512500458E+17
real 2m37.943s
user 2m27.723s
sys 0m7.873s
-- pgf95 11.10 with -fast
kji, sum(s) = 1.2537537512499998E+017
real 0m12.119s
user 0m6.051s
sys 0m5.910s
ijk, sum(s) = 1.2537537512499998E+017
real 0m11.979s
user 0m5.854s
sys 0m5.939s
-- ifort 12.1 with -O3
kji, sum(s) = 1.253753751250000E+017
real 0m5.210s
user 0m3.028s
sys 0m2.150s
ijk, sum(s) = 1.253753751250000E+017
real 0m5.114s
user 0m2.981s
sys 0m2.115s
So we see that PG Fortran and Intel Fortran behave well: the two versions take
about the same time. Also, Intel Fortran is much faster than the other
compilers on this test (about 5 s of real time, against 12 s or more).
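
For illustration, here is a minimal sketch of the two loop orders written as
explicit do loops. Fortran stores arrays in column-major order, so the nest
with i innermost writes to adjacent memory locations, while the nest with k
innermost jumps by n * n elements at each iteration. This is only an
assumption about what a compiler might generate for the two forall headers,
not a description of what gfortran or nagfor actually do. The program name,
the reduced size n = 200 and the use of cpu_time are choices made for this
sketch.

```fortran
program loop_order
  implicit none
  ! Assumed small size (about 64 MB) so that the test runs quickly
  integer, parameter :: n = 200
  integer :: i, j, k
  double precision, allocatable :: s(:, :, :)
  real :: t0, t1

  allocate (s(n, n, n))

  ! First index innermost: consecutive iterations touch adjacent elements
  call cpu_time(t0)
  do k = 1, n
     do j = 1, n
        do i = 1, n
           s(i, j, k) = i * j * k
        end do
     end do
  end do
  call cpu_time(t1)
  print *, "i innermost, sum(s) = ", sum(s), " cpu time (s) = ", t1 - t0

  ! Last index innermost: consecutive iterations are n * n elements apart
  call cpu_time(t0)
  do i = 1, n
     do j = 1, n
        do k = 1, n
           s(i, j, k) = i * j * k
        end do
     end do
  end do
  call cpu_time(t1)
  print *, "k innermost, sum(s) = ", sum(s), " cpu time (s) = ", t1 - t0

  deallocate (s)
end program loop_order
```

Both nests compute the same array, so the two sums must agree. The timings
depend on the compiler and the options: an optimizer that interchanges loops
may well make the two nests equally fast.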
I would also like to comment on the use of forall. Tobias Burnus says that
improving forall in gfortran is not worth the effort. I think forall is
useful. It is an elegant way to write some assignments. There is no notion of
time sequence in a forall, and a forall can only contain assignments while,
as you know, a do construct can contain calls to subroutines, input-output,
recursive computations, anything. So when one reads a program and sees a
forall, it is much quicker to understand what is going on than when one reads
a do loop. Also, the fact that the assignments are independent (comment of
Harald Anlauf) should make it easier for the compiler to produce fast code.
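
As a small illustration of the absence of time sequence, the following sketch
applies the same assignment once in a forall and once in a do loop. In a
forall, all the right-hand sides are evaluated before any element is
assigned, so the result is a shift of the array. In the do loop, each
iteration sees the value stored by the previous one. The program and variable
names are made up for this example.

```fortran
program forall_semantics
  implicit none
  integer, parameter :: n = 5
  integer :: i
  integer :: a(n), b(n)

  a = [(i, i = 1, n)]
  b = a

  ! forall: every a(i - 1) is read before any a(i) is written
  forall (i = 2:n) a(i) = a(i - 1)

  ! do: iteration i reads the value written by iteration i - 1
  do i = 2, n
     b(i) = b(i - 1)
  end do

  print *, "forall: ", a   ! 1 1 2 3 4
  print *, "do:     ", b   ! 1 1 1 1 1
end program forall_semantics
```

The forall result does not depend on any execution order of the index values,
which is the property that should leave the compiler free to choose the
fastest one.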