This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH,i386] fma,fma4 and xop flags


On Mon, Aug 13, 2012 at 9:03 PM, Richard Henderson <rth@redhat.com> wrote:

>> +      (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
>> +      (eq_attr "isa" "fma4")
>> +        (symbol_ref "TARGET_FMA4 && !TARGET_FMA")
>
> Why the !TARGET_FMA for fma4?
>
> If both ISAs are enabled, I don't see why we couldn't choose from either.
> If they really should be mutually exclusive, then that should happen elsewhere.
>
> I do see that fma3 is one byte smaller.  So in the instances where we're
> concerned with code size, and we have both isas, and there does happen to
> be output overlap with one of the inputs, then we should use fma3.  But
> we should also not have reload generate an extra move when fma4 is available.

AFAIU, fma3 is better than fma4 on bdver2 (the only CPU that
implements both FMA sets). Because of this, the current description of
bdver2 doesn't even enable fma4 in processor_alias_table.

The change you are referring to adds a preference for the fma3 insn set
for generic code (not FMA4 builtins!), even when fma4 is enabled. So, no
matter which combination and sequence of -mfma, -mfma4 or -mxop the user
passes to the compiler, only fma3 instructions will be generated.
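
To spell out the two quoted "enabled" conditions, here is a minimal C
sketch of the selection logic. It is a model only, not code from the
patch: the enum and the function name alternative_enabled are
hypothetical, and the two bool parameters stand in for TARGET_FMA and
TARGET_FMA4.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the "isa" attribute values in the patch. */
enum fma_isa { ISA_FMA, ISA_FMA4 };

/* Models the quoted conditions (an illustration, not GCC source):
     isa "fma"  is enabled when TARGET_FMA
     isa "fma4" is enabled when TARGET_FMA4 && !TARGET_FMA  */
static bool
alternative_enabled (enum fma_isa isa, bool target_fma, bool target_fma4)
{
  switch (isa)
    {
    case ISA_FMA:
      return target_fma;
    case ISA_FMA4:
      /* fma4 alternatives are switched off as soon as fma3 is available. */
      return target_fma4 && !target_fma;
    }
  return false;
}
```

With both flags set, only the fma alternative is enabled. The fma4
alternative becomes available only when fma4 is on and fma is off.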

This change also allows -march=bdver2 to use PTA_FMA4 in
processor_alias_table, while still generating fma3 instructions only
for generic code.

Uros.

