[Bug tree-optimization/71414] 2x slower than clang summing small float array, GCC should consider larger vectorization factor for "unrolling" reductions

Joost.VandeVondele at mat dot ethz.ch gcc-bugzilla@gcc.gnu.org
Tue Jun 7 13:52:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71414

Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joost.VandeVondele at mat dot ethz
                   |                            |.ch

--- Comment #6 from Joost VandeVondele <Joost.VandeVondele at mat dot ethz.ch> ---
Isn't this a case where -fvariable-expansion-in-unroller is helpful?

> gcc -Ofast t.c -lrt ; ./a.out
285.670206

> gcc -Ofast -funroll-loops -fvariable-expansion-in-unroller  t.c -lrt ; ./a.out
151.246083

> gcc -Ofast -funroll-loops  t.c -lrt ; ./a.out
277.047507

I think this is related to PR25621.
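
For illustration, here is a rough source-level sketch of what variable expansion amounts to for a float reduction. It is not the t.c from this report (that file is not shown in this comment), and the function names sum_single and sum_expanded as well as the choice of four accumulators are assumptions made for the example. GCC does the real transformation internally while unrolling, and only when reassociating the floating-point adds is permitted, which -Ofast allows.

```c
#include <stddef.h>

/* Single accumulator: each add depends on the result of the previous one,
   so the loop is limited by the latency of the add chain. */
float sum_single(const float *a, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hypothetical hand-expanded form: four independent accumulators that are
   combined once at the end. This reorders the additions, so the result may
   differ in rounding from sum_single for general inputs. */
float sum_expanded(const float *a, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    /* Main loop, unrolled by four with one accumulator per lane. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s0 += a[i];

    return (s0 + s1) + (s2 + s3);
}
```

Plain -funroll-loops keeps the single dependency chain on the accumulator, which would be consistent with the third timing above being close to the first.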

