1.02 g77 link problem
Toon Moene
toon@moene.indiv.nluug.nl
Mon Apr 13 11:14:00 GMT 1998
[ egcs-1.0.2 compiled code for Fortran complex program slower than
egcs-1.0.0 ]
The hotspot of your program is:
do m = m1,m2
z1=z1 + x(m,l)*y(m,k)
end do
with z1, x and y DOUBLE COMPLEX entities.
As I pointed out before, the generated code contains lots of
temporary variables due to the expansion of the complex product.
However, using the -fno-emulate-complex flag on the compile line,
the generated code looks much better (this is for my
m68k-next-nextstep3):
L33:
movel d2,a0
fmoved a0@,fp3
moveq #16,d5
addl d5,d2
fmoved a1@,fp5
addl d5,a1
fmoved a2@,fp4
addl d5,a2
fmoved a3@,fp2
addl d5,a3
fmovex fp3,fp1
fmulx fp4,fp1
fmovex fp5,fp0
fmulx fp2,fp0
fsubx fp0,fp1
fmulx fp2,fp3
fmulx fp5,fp4
faddx fp4,fp3
faddx fp1,fp6
faddx fp3,fp7
dbra d1,L33
i.e., it has one fmoved an@,fpm (double precision load) for each of
the four parts of the two double complex array elements involved,
and nothing more. Everything else is just address updating of the
arrays and, of course, the floating point computations themselves.
These are the results as far as timing and correctness are concerned:
Without -fno-emulate-complex:
0.062477
3.468858957
954.619750977
(75373568.,152576512.)
0.
With -fno-emulate-complex:
0.046838999
4.733003616
457.628112793
(75373568.,152576512.)
0.
So that's at least twice as fast, and the answer is correct.
Why does g77 default to -femulate-complex (and what is it)?
When g77 was first released to the general public (19950217), it
handled (double) complex variables and arrays by just telling the
gcc backend (loosely: the code generator): These are (double)
complex entities (in gcc-speak: SCmode and DCmode), generate code
for them.
Unfortunately, during the two years following the initial release,
it turned out that the backend didn't treat the complex type very
well. At first we hoped that it would only affect complex int,
complex short, etc. [which are of no concern to Fortran code], but
it turned out that the treatment of complex float and complex double
was also broken in enough cases to warrant another approach.
So Craig Burley converted all places in the Fortran frontend that
handed over complex arithmetic to the backend to explicitly spell
out the resulting real arithmetic. So instead of saying: here are
two complex numbers (a,b) and (c,d); multiply them, it said: here
are two sets of two real numbers, to be multiplied as follows:
(ac - bd, ad + bc).
This explicit form has been the default since g77-0.5.20 (19970301).
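To make the difference between the two forms concrete, here is a rough
source-level sketch of the same accumulation written both ways. It is
only an illustration: the frontend does this rewriting on its internal
representation, not on Fortran source. The program name, the array
size N, the one-dimensional arrays and the test data are all made up
for this example, and DIMAG and DCMPLX are extensions that g77 accepts.

```fortran
      PROGRAM CMUL
C     Hypothetical test data: N and the array contents are made up
C     for this sketch and do not come from the original program.
      INTEGER N, M
      PARAMETER (N = 1000)
      DOUBLE COMPLEX X(N), Y(N), Z1
      DOUBLE PRECISION ZR, ZI, A, B, C, D
      DO M = 1, N
         X(M) = DCMPLX(DBLE(M), 0.5D0)
         Y(M) = DCMPLX(1.0D0, -DBLE(M))
      END DO
C     Form 1: complex arithmetic, as in the hotspot loop.
      Z1 = (0.0D0, 0.0D0)
      DO M = 1, N
         Z1 = Z1 + X(M)*Y(M)
      END DO
C     Form 2: the same product spelled out in real arithmetic,
C     (a,b)*(c,d) = (ac - bd, ad + bc).
      ZR = 0.0D0
      ZI = 0.0D0
      DO M = 1, N
         A = DBLE(X(M))
         B = DIMAG(X(M))
         C = DBLE(Y(M))
         D = DIMAG(Y(M))
         ZR = ZR + (A*C - B*D)
         ZI = ZI + (A*D + B*C)
      END DO
      PRINT *, 'complex form: ', Z1
      PRINT *, 'real form:    ', ZR, ZI
      END
```

Printing both sums side by side gives a simple way to compare the
complex form against the spelled-out real form for a given compiler
and set of flags.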
Apparently, when the backend doesn't make an error in dealing with
complex entities directly, it is able to generate more efficient
code - it is not clear to me why this is so.
So you might try to use the flag -fno-emulate-complex, BUT ONLY IF
YOU CAN CHECK THE ANSWER OF YOUR COMPUTATIONS !!!
HTH,
Toon.