[PR 81616] Deferring FMA transformations in tight loops

Wed Jan 10 19:16:00 GMT 2018

On 01/10/2018 11:46 AM, Martin Jambor wrote:
> Hello,
> 
> I would really like to ping the FMA transformation prevention patch that
> I sent here in December, which, after incorporating a suggestion from
> Richi, rebasing and re-testing, I re-post below.  I really think that it
> should make it into GCC 8 in some form, because the performance wins are
> really big.
> 
> I am still open to all sorts of comments, of course, especially to
> suggestions about the form of the param controlling this behavior (or
> how to communicate that we want to do this on Zen in general).  It might
> even be a binary value since not forming FMAs does not seem to harm
> bigger vectors either (as far as we know, I should add).
> 
> For the record, I have not yet received any information from AMD about why
> FMAs on 256-bit vectors do not have this problem; I assume all the people
> who could give an authoritative answer are now looking into covert
> channel mitigation stuff.  But very probably it is just that the
> internal split FMAs can be scheduled so that while one is still waiting
> for its addend, another can already execute.
Both are likely true...

Hoping Richi will look at the patch, but it's certainly in my queue of
things to look at if he doesn't get to it.

jeff