This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug ada/29543] New: Ada produces substantially slower code than FORTRAN for identical inputs - looping over double subscripted arrays
- From: "jeff at thecreems dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 22 Oct 2006 02:00:23 -0000
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
I understand that comparing very small benchmarks like this can be misleading,
but I believe I've looked at this enough to have a sense that it is
demonstrating a basic truth and not a narrow performance issue.

The attached test case contains a FORTRAN and an Ada program that are
equivalent (within their matrix multiply loop). The Ada one runs about 2x
slower, with about 3x the number of machine instructions in the inner loop.
(Note that the Ada version was run with run-time checks disabled.)
I dumped the optimized trees (as the original tree of the Ada version was
difficult to read because of the node types not being known to the pretty
printer). The Ada tree is certainly a mess compared to the FORTRAN version.
The core of the FORTRAN code looks like:

   do I = 1,N
      do J = 1,N
         sum = 0.0
         do R = 1,N
            sum = sum + A(I,R)*B(R,J)
         end do
         C(I,J) = sum
      end do
   end do
With the resulting optimized tree fragment (of the innermost loop) being:

  <L25>:;
    sum = MEM[base: (real4 *) ivtmp.97] * MEM[base: (real4 *) pretmp.81, index: (real4 *) ivtmp.161 + (real4 *) ivtmp.94, step: 4B, offset: 4B] + sum;
    ivtmp.94 = ivtmp.94 + 1;
    ivtmp.97 = ivtmp.97 + ivtmp.157;
    if (ivtmp.94 == (<unnamed type>) D.1273) goto <L29>; else goto <L25>;
While the core of the Ada code looks like:

   for I in A'range(1) loop
      for J in A'range(2) loop
         Sum := 0.0;
         for R in A'range(2) loop
            Sum := Sum + A(I,R)*B(R,J);
         end loop;
         C(I,J) := Sum;
      end loop;
   end loop;
With the resulting optimized tree fragment of the innermost loop being:

  <L15>:;
    D.2370 = (*D.2277)[pretmp.627]{lb: tst_array__L_3__T16b___L sz: pretmp.709 * 4}[(<unnamed type>) r]{lb: tst_array__L_4__T17b___L sz: 4};
  <bb 51>:
    temp.721 = D.2344->LB0;
  <bb 52>:
    temp.720 = D.2344->UB1;
  <bb 53>:
    temp.719 = D.2344->LB1;
  <bb 54>:
    j.73 = (<unnamed type>) j;
    D.2373 = (*D.2298)[(<unnamed type>) r]{lb: temp.721 sz: MAX_EXPR <(temp.720 + 1 - temp.719) * 4, 0> + 3 & -4}[j.73]{lb: temp.719 sz: 4};
  <bb 55>:
    D.2374 = D.2370 * D.2373;
  <bb 56>:
    sum = D.2374 + sum;
  <bb 57>:
    if (r == tst_array__L_4__T17b___U) goto <L17>; else goto <L16>;
  <L16>:;
    r = r + 1;
    goto <bb 50> (<L15>);
Now, I'll be the first to admit that I know very little about the innards of
compiler technology, but that tree looks like a horrible mess. It is no wonder
the resulting assembly is such a mess.
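
As a rough illustration of what the two dumps seem to be doing, here is a
sketch in C. It is only one reading of the tree fragments, not compiler output
and not code from the attached sources. The names bounds_t,
inner_strength_reduced and inner_reloading are hypothetical, as is the
descriptor layout; the row length is expressed in elements rather than bytes,
so the byte rounding ("+ 3 & -4") in the dump is omitted. A C compiler would
normally hoist the bounds loads by itself, so this shows the shape of the work
rather than a way to reproduce the slowdown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bounds descriptor, loosely modelled on the LB0/LB1/UB1
   fields that the Ada dump reads through D.2344. */
typedef struct {
    int lb0, ub0;   /* bounds of the first dimension  */
    int lb1, ub1;   /* bounds of the second dimension */
} bounds_t;

/* FORTRAN-like shape: the addressing has been strength-reduced, so each
   iteration only advances a pointer by a fixed step and bumps a counter.
   'a' points at A(I,1) in column-major storage, 'a_step' is the column
   stride in elements, and 'b' points at the contiguous column B(1,J). */
float inner_strength_reduced(const float *a, ptrdiff_t a_step,
                             const float *b, int n)
{
    float sum = 0.0f;
    for (int r = 0; r < n; ++r) {
        sum += *a * b[r];
        a += a_step;
    }
    return sum;
}

/* Ada-like shape: the bounds are fetched from the descriptor and the row
   length is recomputed on every iteration before B(R,J) is addressed.
   'a_row' points at row I of A in row-major storage, 'b' is the base of B.
   The loop tests for the last index before incrementing, as in the dump. */
float inner_reloading(const float *a_row, const float *b,
                      const bounds_t *bd, int j, int r_first, int r_last)
{
    assert(r_first <= r_last);   /* sketch assumes a non-empty range */
    float sum = 0.0f;
    for (int r = r_first; ; ++r) {
        int lb0 = bd->lb0;                    /* temp.721 in the dump */
        int ub1 = bd->ub1;                    /* temp.720 in the dump */
        int lb1 = bd->lb1;                    /* temp.719 in the dump */
        int len = ub1 + 1 - lb1;
        ptrdiff_t row_len = len > 0 ? len : 0;   /* MAX_EXPR <..., 0> */
        sum += a_row[r - r_first]
             * b[(ptrdiff_t)(r - lb0) * row_len + (j - lb1)];
        if (r == r_last)
            break;
    }
    return sum;
}
```

Both functions compute the same dot product; the second simply does more
address arithmetic per iteration, which is consistent with the larger
instruction count seen in the Ada inner loop.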
I am attaching a tar file that has the complete source for the Ada and the
FORTRAN versions.
--
Summary: Ada produces substantially slower code than FORTRAN for identical inputs - looping over double subscripted arrays
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: ada
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: jeff at thecreems dot com
GCC build triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29543