[Bug tree-optimization/88760] GCC unrolling is suboptimal
rguenther at suse dot de
gcc-bugzilla@gcc.gnu.org
Fri Oct 11 10:29:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #30 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 11 Oct 2019, wilco at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
>
> --- Comment #29 from Wilco <wilco at gcc dot gnu.org> ---
> (In reply to Jiu Fu Guo from comment #28)
> > For these kind of small loops, it would be acceptable to unroll in GIMPLE,
> > because register pressure and instruction cost may not be major concerns;
> > just like "cunroll" and "cunrolli" passes (complete unroll) which also been
> > done at O2.
>
> Absolutely, unrolling is a high-level optimization like vectorization.
To expose ILP? I'd call that low-level though ;)
If it exposes data reuse then I'd call it high-level - and at that level
we already have passes like predictive commoning or unroll-and-jam doing
exactly that. Or vectorization.
We've shown through data that unrolling without a good idea of the CPU
pipeline details is a loss on x86_64. This further hints at it
being low-level.
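
To illustrate the ILP point, here is a minimal sketch. It is not taken from the bug report, and the function names and the unroll factor of 4 are hypothetical. The rolled loop carries one dependency chain through `sum`; the unrolled variant splits it into four independent accumulators so that a core with enough execution resources could overlap the adds. Whether that is a win depends on the pipeline details discussed above.

```c
#include <stddef.h>
#include <stdint.h>

/* Rolled form: every add depends on the previous one through `sum`. */
uint32_t sum_rolled(const uint32_t *a, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Unrolled by 4 (an assumed factor) with independent accumulators:
   the four adds in one iteration do not depend on each other. */
uint32_t sum_unrolled4(const uint32_t *a, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Combine the partial sums, then handle the remaining 0..3 elements. */
    uint32_t sum = s0 + s1 + s2 + s3;
    for (; i < n; i++)
        sum += a[i];
    return sum;
}
```

Unsigned arithmetic is used so that reassociating the adds cannot change the result; with floating point the same transformation would need -ffast-math or similar.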