[Bug target/97127] FMA3 code transformation leads to slowdown on Skylake
already5chosen at yahoo dot com
gcc-bugzilla@gcc.gnu.org
Fri Sep 25 13:21:27 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97127
--- Comment #15 from Michael_S <already5chosen at yahoo dot com> ---
(In reply to Hongtao.liu from comment #14)
> > Still I don't understand why compiler does not compare the cost of full loop
> > body after combining to the cost before combining and does not come to
> > conclusion that combining increased the cost.
>
> As Richard says, GCC does not model CPU pipelines in such detail.
I don't understand what "details" you have in mind.
The costs of the instructions that you quoted above look fine.
But for a reason I don't understand, the compiler chose the more costly
"combined" code sequence over the "RISCy" sequence, which is less costly
according to its own cost model.