This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 18 Feb 2014 11:34:05 +0100
- Subject: Re: [PATCH i386 13/8] [AVX-512] Fix argument order for perm and recp intrinsics.
- Authentication-results: sourceware.org; auth=none
- References: <20140213104430 dot GA42503 at msticlxl57 dot ims dot intel dot com> <CAFULd4YAxaOhujnPuvfm90L3Yc8pjxS0Dr8C2jJQR07yoA1Vug at mail dot gmail dot com> <CAFULd4ZrZXTGi_pLhNOZus3JPVK1_99QtJokSFPZ1mFJ-KeWUQ at mail dot gmail dot com> <CAFULd4YsyhC4D5CWcz6Z+kSZxFXBWs+HC31o=Sf2D-8P=zcmiA at mail dot gmail dot com> <20140217122656 dot GB58805 at msticlxl57 dot ims dot intel dot com> <CAFULd4aZimfOkyDin3N8ria+6bz-6-bPbv-pK6KH6rwg7UROyA at mail dot gmail dot com> <20140218100651 dot GA4382 at msticlxl57 dot ims dot intel dot com>
On Tue, Feb 18, 2014 at 11:06 AM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
>> >> >> Please don't change srcp pattern, it should be defined similar to
>> >> >> vrcpss (aka sse_vmrcpv4sf). You need to switch operand order
>> >> >> elsewhere.
>> >> >
>> >> > No, you are correct. Operands should be swapped as in your patch.
>> >>
>> >> Eh, sorry that after some more thinking, I have to again revert this decision.
>> >>
>> >> The srcp pattern should remain as is, and you should swap operands in
>> >> avx512fintrin.h instead:
>> >
>> > In the bottom there's updated patch.
>> >
>> > Added "sse" type. mem operand made second.
>> > Built-ins & tests fixed.
>> >
>> > Testing in progress.
>> >
>> > Is it ok for mainline if pass?
>>
>> No, you got operand order wrong.
>>
>> To correctly calculate "memory" attribute, all "sse" type insns expect
>> the operands in the way sse_vmrcpv4sf2 is defined. You should keep
>> nonimmedate operand as operand_1 and switch operands in builtins and
>> insn mnemonics to fulfill required operand order *in the pattern*.
> Patch updated. It is in the bottom.
> gcc/
> * config/i386/avx512erintrin.h (_mm_rcp28_round_sd): Swap operands.
> (_mm_rcp28_round_ss): Ditto.
> (_mm_rsqrt28_round_sd): Ditto.
> (_mm_rsqrt28_round_ss): Ditto.
> * config/i386/avx512erintrin.h (_mm_rcp14_round_sd): Ditto.
> (_mm_rcp14_round_ss): Ditto.
> (_mm_rsqrt14_round_sd): Ditto.
> (_mm_rsqrt14_round_ss): Ditto.
> * config/i386/sse.md (rsqrt14<mode>): Make memory first operand.
"Put nonimmediate operand as the first input operand." (and in similar
way below).
> (avx512er_exp2<mode><mask_name><round_saeonly_name>): Set type
> attribute to sse.
> (<mask_codefor>avx512er_rcp28<mode><mask_name><round_saeonly_name>):
> Ditto.
> (avx512er_vmrcp28<mode><round_saeonly_name>): Make memory first
> operand, set type attribute.
> (<mask_codefor>avx512er_rsqrt28<mode><mask_name><round_saeonly_name>):
> Set type attribute.
> (avx512er_vmrsqrt28<mode><round_saeonly_name>): Make memory first
> operand, Set type attribute.
>
> gcc/testsuite/
> * gcc.target/i386/avx512er-vrcp28sd-2.c: Distinguish src1 and src2.
> * gcc.target/i386/avx512er-vrcp28ss-2.c: Call correct intrinsic.
> * gcc.target/i386/avx512er-vrsqrt28sd-2.c: Distinguish src1 and src2.
> * gcc.target/i386/avx512er-vrsqrt28ss-2.c: Ditto.
> * gcc.target/i386/avx512f-vrcp14sd-2.c: Fix reference calculation.
> * gcc.target/i386/avx512f-vrcp14ss-2.c: Ditto.
OK with a slight adjustement to vrcp14 patter below.
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -1551,13 +1551,13 @@
> [(set (match_operand:VF_128 0 "register_operand" "=v")
> (vec_merge:VF_128
> (unspec:VF_128
> - [(match_operand:VF_128 1 "register_operand" "v")
> - (match_operand:VF_128 2 "nonimmediate_operand" "vm")]
> + [(match_operand:VF_128 2 "register_operand" "v")
> + (match_operand:VF_128 1 "nonimmediate_operand" "vm")]
> UNSPEC_RSQRT14)
> (match_dup 1)
> (const_int 1)))]
> "TARGET_AVX512F"
> - "vrsqrt14<ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
> + "vrsqrt14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
This pattern should probably read the same as other vmrsqrt patterns
(e.g. sse_vmrsqrtv4sf2 and avx512er_vmrsqrt28...):
(vec_merge:VF_128
(unspec:VF_128
[(match_operand:VF_128 1 "nonimmediate_operand" "vm")]
UNSPEC_RSQRT14)
(match_operand:VF_128 2 "register_operand" "v")
(const_int 1)))]
"TARGET_AVX512F"
"vrsqrt14<ssescalarmodesuffix>\t{%1, %2, %0|%0, %2, %1}"
OK with the change above.
Thanks,
Uros.