This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug fortran/24520] New: Temporary constant array descriptors being declared at wrong binding level.
- From: "paul dot richard dot thomas at cea dot fr" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 25 Oct 2005 13:37:22 -0000
- Subject: [Bug fortran/24520] New: Temporary constant array descriptors being declared at wrong binding level.
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
Posted in: http://gcc.gnu.org/ml/fortran/2005-10/msg00443.html
I have been investigating the relatively poor performance of gfortran for
some of the Polyhedron Benchmark Tests (www.polyhedron.com).
I already discussed a couple of days ago how test_fpu.f90 exposed some
weakness in the dependency analysis. I am developing a patch that will do
somewhat more than the "draft patch" discussed there.
As posted on the Wiki (http://gcc.gnu.org/wiki/GFortranResults), two real
offenders are induct.f90 and kepler.f90 (I have confirmed this in an ifc/gfc
comparison that I will post tonight or tomorrow). As mentioned there,
profiling indicates that the intrinsic dot_product is taking >50% of the
time. Subsequently I have confirmed this by the simple expedient of adding
a repeat copy of the section of code that calls dot_product. The difference
is of the same order as the difference between gfc and DF6.0 execution
times.
It turns out that gfc is slow because it builds temporary array
descriptors for the actual arguments of dot_product. Since these arguments
are only of length 13, building the temporaries slows gfc down a lot. This
can be confirmed as follows:
  real, dimension(12) :: x, y
  real :: z
  do i = 1, 10000000
    z = dot_product (x,y)
  end do
end
takes 0.15s under DF6.0 and 45.5s for gfc!
When rewritten as
  real, dimension(:), pointer :: x, y
  real :: z
  allocate (x(12), y(12))
  do i = 1, 10000000
    z = dot_product (x,y)
  end do
end
the time increases slightly for DF6.0, to 0.27s. gfc now comes in with a
creditable 0.39s.
The code within the loop for both versions appears below. Apparently the
allocation of the descriptor structures and the assignments to them cause
the enormous slow-down.
I think that the lesson is that constant array references need to be taken
out of loops or their use should automatically generate a pointer. I rather
like the latter because I suspect it to be more easily implementable.
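The first option, taking the descriptor setup out of the loop, can be sketched in C. This is only an illustration under stated assumptions: the struct is a hypothetical stand-in whose field names follow the dump (the field types and layout are guesses), make_descriptor, dot_in_loop and dot_hoisted are made-up helper names, and dot_product_r4 is a plain stand-in, not the library's _gfortran_dot_product_r4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the rank-1 real(4) descriptor. Field names
   follow the dump; the field types and ordering are assumptions. */
typedef struct {
    float *data;
    ptrdiff_t offset;
    int dtype;
    struct { ptrdiff_t stride, lbound, ubound; } dim[1];
} array1_real4;

/* Hypothetical helper: describe a contiguous array x(1:n). */
static void make_descriptor(array1_real4 *d, float *base, ptrdiff_t n)
{
    assert(base != NULL && n >= 0);
    d->dtype = 281;               /* value copied from the dump */
    d->dim[0].lbound = 1;
    d->dim[0].ubound = n;
    d->dim[0].stride = 1;
    d->data = base;
    d->offset = 0;
}

/* Stand-in for the library routine: walks both arrays through their
   descriptors and accumulates the products. */
static float dot_product_r4(const array1_real4 *a, const array1_real4 *b)
{
    ptrdiff_t n = a->dim[0].ubound - a->dim[0].lbound + 1;
    float sum = 0.0f;
    for (ptrdiff_t k = 0; k < n; k++)
        sum += a->data[k * a->dim[0].stride] * b->data[k * b->dim[0].stride];
    return sum;
}

/* Descriptors declared and filled on every iteration, as in the
   non-pointer dump. */
static float dot_in_loop(float *x, float *y, ptrdiff_t n, int iters)
{
    float z = 0.0f;
    for (int i = 1; i <= iters; i++) {
        array1_real4 parm0, parm1;
        make_descriptor(&parm0, x, n);
        make_descriptor(&parm1, y, n);
        z = dot_product_r4(&parm0, &parm1);
    }
    return z;
}

/* Descriptors declared and filled once, at the enclosing binding level. */
static float dot_hoisted(float *x, float *y, ptrdiff_t n, int iters)
{
    array1_real4 parm0, parm1;
    float z = 0.0f;
    make_descriptor(&parm0, x, n);
    make_descriptor(&parm1, y, n);
    for (int i = 1; i <= iters; i++)
        z = dot_product_r4(&parm0, &parm1);
    return z;
}
```

Both functions return the same value for the same inputs; the only difference is where the descriptors are declared and filled. The sketch says nothing about how much of the measured gap such hoisting would recover.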
Paul Thomas
Non-pointer version:
if (i <= 10000000)
  {
    while (1)
      {
        {
          logical4 D.573;
          {
            struct array1_real4 parm.1;
            struct array1_real4 parm.0;
            parm.0.dtype = 281;
            parm.0.dim[0].lbound = 1;
            parm.0.dim[0].ubound = 12;
            parm.0.dim[0].stride = 1;
            parm.0.data = (void *) (real4[0:] *) &x[0];
            parm.0.offset = 0;
            parm.1.dtype = 281;
            parm.1.dim[0].lbound = 1;
            parm.1.dim[0].ubound = 12;
            parm.1.dim[0].stride = 1;
            parm.1.data = (void *) (real4[0:] *) &y[0];
            parm.1.offset = 0;
            z = _gfortran_dot_product_r4 (&parm.0, &parm.1);
          }
          L.1:;
          D.573 = i == 10000000;
          i = i + 1;
          if (D.573) goto L.2; else (void) 0;
        }
      }
  }
else
  {
    (void) 0;
  }
L.2:;
and for the pointer version:
if (i <= 10000000)
  {
    while (1)
      {
        {
          logical4 D.573;
          z = _gfortran_dot_product_r4 (&x, &y);
          L.1:;
          D.573 = i == 10000000;
          i = i + 1;
          if (D.573) goto L.2; else (void) 0;
        }
      }
  }
else
  {
    (void) 0;
  }
L.2:;
--
Summary: Temporary constant array descriptors being declared at
wrong binding level.
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: fortran
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: paul dot richard dot thomas at cea dot fr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24520