This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.
Re: Polyhedron tests on Intel Darwin8/9
Replacing
denominator = sqrt(dot_product(rot_c_vector-rot_q_vector, &
rot_c_vector-rot_q_vector))
with
denominator = sqrt((rot_c_vector(1)-rot_q_vector(1)) * &
(rot_c_vector(1)-rot_q_vector(1)) + &
(rot_c_vector(2)-rot_q_vector(2)) * &
(rot_c_vector(2)-rot_q_vector(2)) + &
(rot_c_vector(3)-rot_q_vector(3)) * &
(rot_c_vector(3)-rot_q_vector(3)))
l12_lower = l12_lower + numerator/denominator
everywhere in induct.f90 decreases the execution time (-O3 -ffast-math
-funroll-loops, Core 2 Duo 2.16 GHz) from
93.207u 0.101s 1:33.32 99.9% 0+0k 0+1io 0pf+0w
to
73.361u 0.061s 1:13.42 100.0% 0+0k 0+0io 0pf+0w
Then, replacing everywhere
numerator = w1gauss(j) * w2gauss(k) * &
dot_product(coil_current_vec,current_vector)
with
numerator = w1gauss(j) * w2gauss(k) * &
(coil_current_vec(1)*current_vector(1) + &
coil_current_vec(2)*current_vector(2) + &
coil_current_vec(3)*current_vector(3))
further decreases the execution time to
65.560u 0.080s 1:05.64 100.0% 0+0k 0+1io 0pf+0w
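For anyone who wants to try the comparison outside induct.f90, here is a minimal sketch of a standalone check. It is not code from the benchmark: the program name, the sample vector values, the loop count and the tolerance are all made up for illustration, and only the two forms of the denominator expression are taken from above. It times both forms with cpu_time and stops with a nonzero status if they disagree beyond rounding.

```fortran
program dot_product_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n_iter = 10000000   ! arbitrary repeat count (assumption)
  real(dp) :: rot_c_vector(3), rot_q_vector(3)
  real(dp) :: sum_intrinsic, sum_expanded
  real(dp) :: t0, t1, t2
  integer :: i

  ! Made-up sample values, not taken from induct.f90
  rot_c_vector = [1.0_dp, 2.0_dp, 3.0_dp]
  rot_q_vector = [0.5_dp, -1.0_dp, 2.0_dp]

  ! Form 1: dot_product intrinsic on array temporaries
  sum_intrinsic = 0.0_dp
  call cpu_time(t0)
  do i = 1, n_iter
     ! Vary one input per iteration to discourage hoisting the
     ! expression out of the loop
     rot_q_vector(1) = 0.5_dp + 1.0e-9_dp * real(i, dp)
     sum_intrinsic = sum_intrinsic + 1.0_dp / &
          sqrt(dot_product(rot_c_vector - rot_q_vector, &
                           rot_c_vector - rot_q_vector))
  end do
  call cpu_time(t1)

  ! Form 2: hand-expanded sum of squares over the three components
  sum_expanded = 0.0_dp
  do i = 1, n_iter
     rot_q_vector(1) = 0.5_dp + 1.0e-9_dp * real(i, dp)
     sum_expanded = sum_expanded + 1.0_dp / sqrt( &
          (rot_c_vector(1)-rot_q_vector(1)) * &
          (rot_c_vector(1)-rot_q_vector(1)) + &
          (rot_c_vector(2)-rot_q_vector(2)) * &
          (rot_c_vector(2)-rot_q_vector(2)) + &
          (rot_c_vector(3)-rot_q_vector(3)) * &
          (rot_c_vector(3)-rot_q_vector(3)))
  end do
  call cpu_time(t2)

  print *, 'dot_product form: sum =', sum_intrinsic, ' cpu s =', t1 - t0
  print *, 'expanded form:    sum =', sum_expanded,  ' cpu s =', t2 - t1

  ! The two forms should agree up to rounding; the tolerance is an assumption
  if (abs(sum_intrinsic - sum_expanded) > 1.0e-8_dp * abs(sum_expanded)) stop 1
end program dot_product_check
```

Building it with the same options as above (gfortran -O3 -ffast-math -funroll-loops) should show whether the gap is reproducible in isolation. A loop this small may well be optimized differently from the real code in induct.f90, so the timings are only indicative. The same pattern can be adapted to the coil_current_vec/current_vector product.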
Although I can understand why the first optimization is missed, I don't
understand why the second is missed if the dot product is inlined.
Dominique