This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
2011/8/20 Uros Bizjak <ubizjak@gmail.com>:
> Hello!
>
>> This patch adds intrinsics for FMA instruction set along with tests for them.
>> Bootstraps and passes make check (including make check on simulator
>> for new runtime tests).
>
>        * config/i386/fmaintrin.h: New.
>
> It is not included in the patch.

Sorry about that.

>        * config.gcc: Add fmaintrin.h.
>        * config/i386/i386.c
>        * <ix86_builtins> (IX86_BUILTIN_VFMADDSS3): New.
>        (IX86_BUILTIN_VFMADDSD3): Likewise.
>        (X86_BUILTIN_VFNMADDSS3): Likewise.
>        (X86_BUILTIN_VFNMADDSD3): Likewise.
>        (X86_BUILTIN_VFMSUBSS3): Likewise.
>        (X86_BUILTIN_VFMSUBSD3): Likewise.
>        (X86_BUILTIN_VFNMSUBSS3): Likewise.
>        (X86_BUILTIN_VFNMSUBSD3): Likewise.
>        (X86_BUILTIN_VFMSUBPS): Likewise.
>        (X86_BUILTIN_VFMSUBPD): Likewise.
>        (X86_BUILTIN_VFMSUBPS256): Likewise.
>        (X86_BUILTIN_VFMSUBPD256): Likewise.
>        (X86_BUILTIN_VFNMADDPS): Likewise.
>        (X86_BUILTIN_VFNMADDPD): Likewise.
>        (X86_BUILTIN_VFNMADDPS256): Likewise.
>        (X86_BUILTIN_VFNMADDPD256): Likewise.
>        (X86_BUILTIN_VFNMSUBPS): Likewise.
>        (X86_BUILTIN_VFNMSUBPD): Likewise.
>        (X86_BUILTIN_VFNMSUBPS256): Likewise.
>        (X86_BUILTIN_VFNMSUBPD256): Likewise.
>        (X86_BUILTIN_VFMSUBADDPS): Likewise.
>        (X86_BUILTIN_VFMSUBADDPD): Likewise.
>        (X86_BUILTIN_VFMSUBADDPS256): Likewise.
>        (X86_BUILTIN_VFMSUBADDPD256): Likewise.
>
> You don't need to add "negated" versions, one FMA builtin per mode is
> enough, please see existing FMA4 descriptions. Just put unary minus
> sign in the intrinsics header for "negated" operand and let GCC do its
> job. Please see existing FMA4 intrinsics header.
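
For illustration, the "one builtin plus unary minus" suggestion quoted above can be modelled on scalars. This is only a sketch under stated assumptions: `fmadd` is a hypothetical, unfused stand-in for the single FMA builtin, and the helper names are invented for this example rather than taken from any GCC header.

```c
#include <assert.h>

/* Hypothetical scalar stand-in for the single multiply-add builtin:
   computes a * b + c (a real builtin would fuse the two operations). */
static float fmadd(float a, float b, float c) { return a * b + c; }

/* "Negated" variants written only in terms of fmadd and unary minus. */
static float fmsub(float a, float b, float c)  { return fmadd(a, b, -c); }  /*  a*b - c   */
static float fnmadd(float a, float b, float c) { return fmadd(-a, b, c); }  /* -(a*b) + c */
static float fnmsub(float a, float b, float c) { return fmadd(-a, b, -c); } /* -(a*b) - c */

/* Small self-check with values that are exact in binary floating point. */
static void check_identities(void)
{
    assert(fmadd(2.0f, 3.0f, 4.0f)  ==  10.0f);
    assert(fmsub(2.0f, 3.0f, 4.0f)  ==   2.0f);
    assert(fnmadd(2.0f, 3.0f, 4.0f) ==  -2.0f);
    assert(fnmsub(2.0f, 3.0f, 4.0f) == -10.0f);
}
```

Whether the compiler then folds the negations back into a single negated-FMA instruction depends on the optimization level.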

Actually, I tried that. But in that case, when I compile this (FMA4 example)

    #include <x86intrin.h>
    extern __m128 a,b,c;
    void foo(){
        a = _mm_nmsub_ps(a,b,c);
    }

with -S -O0 -mfma4, the asm has

    vxorps %xmm1, %xmm0, %xmm0
    vmovaps -16(%rbp), %xmm1
    vmovaps .LC0(%rip), %xmm2
    vxorps %xmm2, %xmm1, %xmm1
    vfmaddps %xmm0, -32(%rbp), %xmm1, %xmm0

So vfmaddps of negated values is generated instead of vfnmsubps.
I think it is bad that an intrinsic for an instruction can generate code
without that instruction. So, to make sure that the exact instruction is
always generated, I introduced additional expands and builtins. Is that wrong?

>        * config/i386/sse.md (fmai_fnmadd_<mode>): New.
>        (fmai_fmsub_<mode>): Likewise.
>        (fmai_fnmsub_<mode>): Likewise.
>        (fmai_fmadd_s_<mode>): Likewise.
>        (fmai_vmfmadd_s_<mode>): Likewise.
>        (fmai_vmfmsub_s_<mode>): Likewise.
>        (fmai_vmfnmadd_s_<mode>): Likewise.
>        (fmai_vmfnmsub_s_<mode>): Likewise.
>        (*fmai_fmadd_s_<mode>): Likewise.
>        (*fmai_fmsub_s_<mode>): Likewise.
>        (*fmai_fnmadd_s_<mode>): Likewise.
>        (*fmai_fnmsub_s_<mode>): Likewise.
>        (fmsubadd_<mode>): Likewise.
>
> Also here. All your FMAMODE patterns should be expanded through
> existing "fma4i_fmadd_<mode>" expander (you can rename it to
> "fmai_fmadd..." to make its name more generic). This includes new
> "fmsubadd_<mode>" pattern that should be expanded through existing
> "fmaddsub_<mode>" expander.
>

See the explanation above for why I included the new expands.
The "_s_" is removed.

> vec_merge scalar versions also need only one expander, again follow
> existing FMA4 version. Also, there is no need to include "_s_" in the
> name. We know that these are scalar versions.
>
>        * gcc.target/i386/fma-check.h: New.
>        * gcc.target/i386/fma-256-fmaddXX.c: New testcase.
>        * gcc.target/i386/fma-256-fmaddsubXX.c: Likewise.
>        * gcc.target/i386/fma-256-fmsubXX.c: Likewise.
>        * gcc.target/i386/fma-256-fmsubaddXX.c: Likewise.
>        * gcc.target/i386/fma-256-fnmaddXX.c: Likewise.
>        * gcc.target/i386/fma-256-fnmsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmaddXX.c: Likewise.
>        * gcc.target/i386/fma-fmaddsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmsubXX.c: Likewise.
>        * gcc.target/i386/fma-fmsubaddXX.c: Likewise.
>        * gcc.target/i386/fma-fnmaddXX.c: Likewise.
>        * gcc.target/i386/fma-fnmsubXX.c: Likewise.
>        * gcc.target/i386/fma-compile.c: Likewise.
>        * gcc.target/i386/i386.exp (check_effective_target_fma): New.
>
> Is there a reason that all runtime tests are compiled with -O0 except
> that there are some existing FMA tests in the testsuite using -O0?
> Usually, these kind of tests are compiled using -O2, so optimizations
> are applied also to the builtins.

Changed to -O2.

>
> Uros.
>
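
As a side note on the -O0 assembly quoted in the reply: the vxorps instructions are most likely the unary minus being applied, since XOR with a sign-bit mask negates a floating-point value. The contents of .LC0 are not shown in the message, so treating it as a sign mask is an assumption. A minimal scalar sketch of that trick, assuming `float` is IEEE 754 binary32 and using a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Negate a float by flipping only its sign bit. A vector XOR with a
   sign-mask constant does the same thing in every lane.
   Assumption: float is a 32-bit IEEE 754 value. */
static float negate_by_xor(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret bits without aliasing problems */
    bits ^= UINT32_C(0x80000000);     /* flip the sign bit, leave exponent and mantissa */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Small self-check. */
static void check_negate(void)
{
    assert(negate_by_xor(1.5f) == -1.5f);
    assert(negate_by_xor(-2.0f) == 2.0f);
}
```

This is why the unoptimized code contains explicit XORs followed by a plain vfmaddps: each negation is emitted as a separate operation before the multiply-add.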
Attachment:
patch
Description: Binary data