How to get a vector FMA with GCC in a portable way?
Wed Jan 16 09:27:00 GMT 2019
On Tue, 15 Jan 2019, Vincent Lefevre wrote:
> I would like to know how to get a vector FMA with GCC in a portable
> way.
> By "portable way", I mean that the behavior must not depend on the
> compilation options (e.g., if FP contraction is disabled, I still
> want a true FMA) and that the code must not depend on the architecture
> (thus intrinsics should not be used... even when restricting to x86,
> one reason is FMA3 vs FMA4 issues).
> For instance, for addition, one can write "a + b". But for FMA?
In the context of autovectorized code or when using generic vector types?
When the source is supposed to be autovectorized and operates on scalar
variables, using the fma function works (GCC recognizes it as a builtin;
__FP_FAST_FMA is predefined when the fma instruction is available).
For generic vector types I'm afraid GCC does not provide such a facility.
I think it would make a reasonable feature request.