[Bug tree-optimization/71414] 2x slower than clang summing small float array, GCC should consider larger vectorization factor for "unrolling" reductions
Joost.VandeVondele at mat dot ethz.ch
gcc-bugzilla@gcc.gnu.org
Tue Jun 7 13:52:00 GMT 2016
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414
Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |Joost.VandeVondele at mat dot ethz.ch
--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
Isn't this a case where -fvariable-expansion-in-unroller is helpful?
> gcc -Ofast t.c -lrt ; ./a.out
285.670206
> gcc -Ofast -funroll-loops -fvariable-expansion-in-unroller t.c -lrt ; ./a.out
151.246083
> gcc -Ofast -funroll-loops t.c -lrt ; ./a.out
277.047507
I think this is related to PR25621.
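
For illustration, here is a rough hand-written sketch of the kind of transformation -fvariable-expansion-in-unroller is meant to enable on a reduction: the single accumulator is split into several independent partial sums, which shortens the dependency chain of floating-point adds. This is not the t.c from this PR and not GCC's actual output; the function names and the expansion factor of 4 are assumptions made for the example. Note that the split changes the order of the float additions, so the compiler may only do it itself when reassociation is permitted (e.g. with -ffast-math, which -Ofast implies).

```c
#include <stddef.h>

/* Baseline: one accumulator, so each add depends on the previous one. */
float sum_single(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-expanded variant (hypothetical, factor 4 chosen for illustration):
   four independent partial sums that are combined after the loop. */
float sum_expanded(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Unrolled body: the four adds do not depend on each other. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Remainder elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += a[i];

    /* Final combine of the partial sums. */
    return (s0 + s1) + (s2 + s3);
}
```

Both functions return the same value for inputs whose sums are exactly representable; in general the results can differ in the last bits because the additions are reassociated.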