[Bug tree-optimization/31079] 20% difference between ifort/gfortran, missed vectorization
jv244 at cam dot ac dot uk
gcc-bugzilla@gcc.gnu.org
Tue Aug 19 05:45:00 GMT 2008
------- Comment #8 from jv244 at cam dot ac dot uk 2008-08-19 05:43 -------
(In reply to comment #7)
> That is, GCCs inner loop is
>
> .L6:
> addl $1, %eax
> addsd %xmm12, %xmm11
> cmpl $100000000, %eax
> addsd %xmm14, %xmm3
> addsd %xmm15, %xmm2
> addsd %xmm13, %xmm1
> jne .L6
>
> which doesn't necessarily look slower than ICCs.
>
Right... I checked trunk, and it now does something very smart with the testcase
from comment #4: it is now about 8 times faster than ifort (9.1/11.0)
> gfortran -O3 -ftree-vectorize -ffast-math -march=native -S PR31079_4.f90
> ./a.out
0.25201499
> ifort -xT -O2 PR31079_4.f90
> ./a.out
2.040127
I'll see if there is a way to make the testcase somewhat smarter. I checked the
very first program (comment #0), and it is still slower with gfortran (ifort
3.51 vs. gfortran 4.1). For completeness, I attach the Fortran source and
the Intel assembly.
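
For reference, the inner loop quoted from comment #7 contains no loads or
stores: each iteration is four independent addsd instructions, each adding a
register that the loop never modifies to its own accumulator. A minimal C
sketch of a loop of that shape follows. The function name and arguments are
hypothetical and are not taken from the PR testcase; this only illustrates the
dependency structure, not what the Fortran source looks like.

```c
/* Hypothetical stand-in, not the PR testcase: four independent running
   sums, each advanced by a value that is invariant inside the loop. */
void accumulate4(long n, const double inc[4], double acc[4])
{
    /* Keep everything in locals so the loop body touches no memory. */
    double a0 = acc[0], a1 = acc[1], a2 = acc[2], a3 = acc[3];
    const double i0 = inc[0], i1 = inc[1], i2 = inc[2], i3 = inc[3];

    for (long i = 0; i < n; ++i) {
        /* Four separate dependency chains, one scalar add each. */
        a0 += i0;
        a1 += i1;
        a2 += i2;
        a3 += i3;
    }

    acc[0] = a0;
    acc[1] = a1;
    acc[2] = a2;
    acc[3] = a3;
}
```

Since the four chains do not depend on each other, the cost per iteration is
bounded by the latency of a single floating-point add rather than by four adds
in sequence.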
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31079