This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: RFA: another patch to fix PR61360


> The "r->x" alternative results in "vector" decoding on amdfam10. This is AMD-speak for microcoded instructions, and AMD optimization manual strongly recommends avoiding them. I have CC'd Ganesh, maybe he can provide more relevant data on the performance impact.

Thanks Uros!

Yes, the AMD Software Optimization Guide (SWOG) recommends precisely what Uros mentions.
<snip from SWOG for BD>
When moving data from a GPR to an XMM register, use separate store and load instructions to move
the data first from the source register to a temporary location in memory and then from memory into
the destination register.
</snip>

This is also listed as an optimization. It holds for all amdfam10 and BD family processors.
I still have to dig through the performance numbers; I will try to get them.
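
To make the two sequences concrete, here is a minimal sketch, assuming GCC-style extended asm on x86-64. The helper names are hypothetical and are not taken from the patch. The first helper uses the direct "r->x" move; the second follows the quoted recommendation and goes through a temporary memory slot. On other targets the sketch falls back to a plain memcpy so that it still compiles. It only illustrates the instruction sequences and makes no claim about their relative cost.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names; not taken from the patch under review. */

#if defined(__GNUC__) && defined(__x86_64__)

/* Direct GPR -> XMM move: the "r->x" form discussed in the thread. */
static inline double gpr_to_xmm_direct(uint64_t bits)
{
    double result;
    __asm__ ("movq %[src], %[dst]"
             : [dst] "=x" (result)
             : [src] "r" (bits));
    return result;
}

/* Store the GPR to a temporary memory slot, then load that slot
   into the XMM register, as the quoted guide text describes. */
static inline double gpr_to_xmm_via_memory(uint64_t bits)
{
    uint64_t slot;
    double result;
    __asm__ ("movq %[src], %[mem]\n\t"
             "movq %[mem], %[dst]"
             : [dst] "=x" (result), [mem] "=m" (slot)
             : [src] "r" (bits));
    return result;
}

#else

/* Assumed portable fallback for other targets: copy the bit pattern
   and let the compiler choose the instructions. */
static inline double gpr_to_xmm_direct(uint64_t bits)
{
    double result;
    memcpy(&result, &bits, sizeof result);
    return result;
}

static inline double gpr_to_xmm_via_memory(uint64_t bits)
{
    return gpr_to_xmm_direct(bits);
}

#endif
```

Both helpers return the same value for the same input bits; only the instruction sequence differs.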

Regards
Ganesh
