This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH, i386, testsuite] FMA intrinsics


2011/8/20 Uros Bizjak <ubizjak@gmail.com>:
> Hello!
>
>> This patch adds intrinsics for FMA instruction set along with tests for them.
>> Bootstraps and passes make check (including make check on simulator
>> for new runtime tests).
>
>         * config/i386/fmaintrin.h: New.
>
> It is not included in the patch.
Sorry about that.
>
>         * config.gcc: Add fmaintrin.h.
>         * config/i386/i386.c
>         * <ix86_builtins> (IX86_BUILTIN_VFMADDSS3): New.
>         (IX86_BUILTIN_VFMADDSD3): Likewise.
>         (X86_BUILTIN_VFNMADDSS3): Likewise.
>         (X86_BUILTIN_VFNMADDSD3): Likewise.
>         (X86_BUILTIN_VFMSUBSS3): Likewise.
>         (X86_BUILTIN_VFMSUBSD3): Likewise.
>         (X86_BUILTIN_VFNMSUBSS3): Likewise.
>         (X86_BUILTIN_VFNMSUBSD3): Likewise.
>         (X86_BUILTIN_VFMSUBPS): Likewise.
>         (X86_BUILTIN_VFMSUBPD): Likewise.
>         (X86_BUILTIN_VFMSUBPS256): Likewise.
>         (X86_BUILTIN_VFMSUBPD256): Likewise.
>         (X86_BUILTIN_VFNMADDPS): Likewise.
>         (X86_BUILTIN_VFNMADDPD): Likewise.
>         (X86_BUILTIN_VFNMADDPS256): Likewise.
>         (X86_BUILTIN_VFNMADDPD256): Likewise.
>         (X86_BUILTIN_VFNMSUBPS): Likewise.
>         (X86_BUILTIN_VFNMSUBPD): Likewise.
>         (X86_BUILTIN_VFNMSUBPS256): Likewise.
>         (X86_BUILTIN_VFNMSUBPD256): Likewise.
>         (X86_BUILTIN_VFMSUBADDPS): Likewise.
>         (X86_BUILTIN_VFMSUBADDPD): Likewise.
>         (X86_BUILTIN_VFMSUBADDPS256): Likewise.
>         (X86_BUILTIN_VFMSUBADDPD256): Likewise.
>
> You don't need to add "negated" versions, one FMA builtin per mode is
> enough, please see existing FMA4 descriptions. Just put unary minus
> sign in the intrinsics header for "negated" operand and let GCC do its
> job. Please see existing FMA4 intrinsics header.
>
Actually, I tried that. But in that case, when I compile this FMA4 example
#include <x86intrin.h>
extern __m128 a, b, c;
void foo(){
   a = _mm_nmsub_ps(a, b, c);
}
with -S -O0 -mfma4, the generated asm contains:

        vxorps  %xmm1, %xmm0, %xmm0
        vmovaps -16(%rbp), %xmm1
        vmovaps .LC0(%rip), %xmm2
        vxorps  %xmm2, %xmm1, %xmm1
        vfmaddps        %xmm0, -32(%rbp), %xmm1, %xmm0
So vfmaddps on negated values is generated instead of vfnmsubps.
I think it is bad when the intrinsic for a specific instruction can
generate code that does not use that instruction.
So, to make sure the exact instruction is always generated, I
introduced additional expanders and builtins.
Is that wrong?
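
For reference, the arithmetic identity behind the "unary minus" suggestion can be sketched in plain C. This is only an illustration, not code from the patch or from any GCC header: mul_add and the other function names are hypothetical scalar stand-ins, and the multiply-add is left unfused so the sketch stays portable. It shows that the three negated forms are equivalent to a single multiply-add primitive with negated operands. Whether the compiler then folds the negations into one instruction is a separate question, which is what the -O0 output above is about.

```c
#include <assert.h>

/* Stand-in for the single "fmadd" builtin: computes a*b + c.
   Unfused here for portability; the real instruction rounds only once. */
static double mul_add(double a, double b, double c)
{
    return a * b + c;
}

/* Hypothetical names: the other forms expressed as unary minus
   on the operands of the one primitive. */
static double mul_sub(double a, double b, double c)     /*  a*b - c   */
{
    return mul_add(a, b, -c);
}

static double neg_mul_add(double a, double b, double c) /* -(a*b) + c */
{
    return mul_add(-a, b, c);
}

static double neg_mul_sub(double a, double b, double c) /* -(a*b) - c */
{
    return mul_add(-a, b, -c);
}

/* Returns 1 when each derived form matches the directly written expression. */
static int forms_agree(double a, double b, double c)
{
    assert(mul_add(a, b, c) == a * b + c);
    return mul_sub(a, b, c) == a * b - c
        && neg_mul_add(a, b, c) == -(a * b) + c
        && neg_mul_sub(a, b, c) == -(a * b) - c;
}
```

Since negation is exact in floating point, the same identities hold for a fused multiply-add as well.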
>         * config/i386/sse.md (fmai_fnmadd_<mode>): New.
>         (fmai_fmsub_<mode>): Likewise.
>         (fmai_fnmsub_<mode>): Likewise.
>         (fmai_fmadd_s_<mode>): Likewise.
>         (fmai_vmfmadd_s_<mode>): Likewise.
>         (fmai_vmfmsub_s_<mode>): Likewise.
>         (fmai_vmfnmadd_s_<mode>): Likewise.
>         (fmai_vmfnmsub_s_<mode>): Likewise.
>         (*fmai_fmadd_s_<mode>): Likewise.
>         (*fmai_fmsub_s_<mode>): Likewise.
>         (*fmai_fnmadd_s_<mode>): Likewise.
>         (*fmai_fnmsub_s_<mode>): Likewise.
>         (fmsubadd_<mode>): Likewise.
>
> Also here. All your FMAMODE patterns should be expanded through
> existing "fma4i_fmadd_<mode>" expander (you can rename it to
> "fmai_fmadd..." to make its name more generic). This includes new
> "fmsubadd_<mode>" pattern that should be expanded through existing
> "fmaddsub_<mode>" expander.
>
See my explanation above for why I included the new expanders.
The "_s_" has been removed from the names.
> vec_merge scalar versions also need only one expander, again follow
> existing FMA4 version. Also, there is no need to include "_s_" in the
> name. We know that these are scalar versions.
>
>         * gcc.target/i386/fma-check.h: New.
>         * gcc.target/i386/fma-256-fmaddXX.c: New testcase.
>         * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
>         * gcc.target/i386/fma-256-fmsubXX.c: Likewise.
>         * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
>         * gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
>         * gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
>         * gcc.target/i386/fma-fmaddXX.c: Likewise.
>         * gcc.target/i386/fma-fmaddsubXX.c: Likewise.
>         * gcc.target/i386/fma-fmsubXX.c: Likewise.
>         * gcc.target/i386/fma-fmsubaddXX.c: Likewise.
>         * gcc.target/i386/fma-fnmaddXX.c: Likewise.
>         * gcc.target/i386/fma-fnmsubXX.c: Likewise.
>         * gcc.target/i386/fma-compile.c: Likewise.
>         * gcc.target/i386/i386.exp (check_effective_target_fma): New.
>
> Is there a reason that all runtime tests are compiled with -O0 except
> that there are some existing FMA tests in the testsuite using -O0?
> Usually, these kind of tests are compiled using -O2, so optimizations
> are applied also to the builtins.
Changed to -O2.
>
> Uros.
>

Attachment: patch
Description: Binary data

