This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/63503] [AArch64] A57 executes fused multiply-add poorly in some situations


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503

--- Comment #24 from Wilco <wdijkstr at arm dot com> ---
(In reply to Evandro from comment #23)
> (In reply to Wilco from comment #22)
> > Unrolling alone isn't good enough in sum reductions. As I mentioned before,
> > GCC doesn't enable any of the useful loop optimizations by default. So add
> > -fvariable-expansion-in-unroller to get a good speedup with unrolling. Again
> > these are all generic GCC issues.
> 
> Adding -fvariable-expansion-in-unroller when using -funroll-loops results in
> practically the same code being emitted.

Correct, all it does is cut the dependency chain of the accumulates. But that's
enough to get the speedup.
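
To illustrate what "cutting the dependency chain" means, here is a hand-written
sketch in C. It is not GCC's actual output; the function names dot_single and
dot_split are made up for this example. In the first loop every multiply-add
has to wait for the previous one because they all feed the same accumulator.
In the second, two independent accumulators let consecutive multiply-adds
overlap, and the partial sums are combined once at the end. Note that this
changes the order of the floating-point additions, so in general the result
can differ in the last bits.

```c
#include <stddef.h>

/* One accumulator: each multiply-add depends on the result of the
   previous one, so the loop runs at the latency of the accumulate. */
double dot_single(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Two accumulators (manual version of the split): the two chains are
   independent, so their multiply-adds can be in flight at the same time. */
double dot_split(const double *a, const double *b, size_t n)
{
    double sum0 = 0.0, sum1 = 0.0;
    size_t i = 0;

    /* Unrolled by two, one accumulator per copy of the loop body. */
    for (; i + 1 < n; i += 2) {
        sum0 += a[i] * b[i];
        sum1 += a[i + 1] * b[i + 1];
    }

    /* Leftover element when n is odd. */
    if (i < n)
        sum0 += a[i] * b[i];

    /* Combine the partial sums once, outside the loop. */
    return sum0 + sum1;
}
```

The rest of the loop body (loads, multiplies, address updates) is the same in
both versions, which matches the observation that the emitted code looks
practically identical apart from the accumulator registers.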

