[Bug tree-optimization/25621] Missed optimization when unrolling the loop (splitting up the sum) (only with -ffast-math)
Joost.VandeVondele at mat dot ethz.ch
gcc-bugzilla@gcc.gnu.org
Fri Mar 29 10:07:00 GMT 2013
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25621
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat
                   |                            |dot ethz.ch
         Depends on|                            |53947
--- Comment #12 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> 2013-03-29 10:07:06 UTC ---
This has become much more of a vectorizer problem. ifort generates code that
is considerably faster for routine S31 of the initial comment (about 1.5 times
as fast in the timings below). Given that this is a common dot product, it
might be good to see why that happens. Both compilers fail to notice that S32
is basically the same code hand-unrolled: the hand-optimized loop is slower
than the default loop with either compiler.
Tested with the code in comment #6 (without inlining):
> gfortran -march=native -ffast-math -O3 -fno-inline PR25621.f90
> ./a.out
default loop 0.56491500000000006
hand optimized loop 0.74488600000000016
> ifort -xHost -O3 -fno-inline PR25621.f90
> ./a.out
default loop 0.377943000000000
hand optimized loop 0.579911000000000
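
For readers without the initial comment at hand, the relationship between the two loops can be sketched as follows. This is only an illustration in C, not the Fortran source of S31 and S32 from the PR: the function names `dot_simple` and `dot_unrolled` are hypothetical, and the two-way split of the sum is an assumption based on the bug title ("splitting up the sum").

```c
#include <stddef.h>

/* Plain dot product with a single accumulator.
   Assumed shape of the "default loop"; not the PR's actual code. */
double dot_simple(const double *a, const double *b, size_t n)
{
    double c = 0.0;
    for (size_t i = 0; i < n; i++)
        c += a[i] * b[i];
    return c;
}

/* Hand-unrolled variant: the sum is split into two independent
   partial sums that are combined at the end.
   Assumed shape of the "hand optimized loop"; not the PR's actual code. */
double dot_unrolled(const double *a, const double *b, size_t n)
{
    double c0 = 0.0, c1 = 0.0;
    size_t i = 0;

    /* Process two elements per iteration. */
    for (; i + 1 < n; i += 2) {
        c0 += a[i] * b[i];
        c1 += a[i + 1] * b[i + 1];
    }

    /* Handle the leftover element when n is odd. */
    if (i < n)
        c0 += a[i] * b[i];

    return c0 + c1;
}
```

The two functions add the products in a different order, so in floating point they are only interchangeable when the compiler is allowed to reassociate the sum, which is presumably why the PR title restricts the missed optimization to -ffast-math.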