[PATCH] [AArch64, NEON] Improve vpmaxX & vpminX intrinsics

Tejas Belagod tejas.belagod@arm.com
Tue Jan 13 17:34:00 GMT 2015


On 09/12/14 08:17, Yangfei (Felix) wrote:
>> On 28 November 2014 at 09:23, Yangfei (Felix) <felix.yang@huawei.com> wrote:
>>> Hi,
>>>    This patch converts vpmaxX & vpminX intrinsics to use builtin functions
>>>    instead of the previous inline assembly syntax.
>>>    Regtested with aarch64-linux-gnu on QEMU.  Also passed the glorious
>>>    testsuite of Christophe Lyon.
>>>    OK for the trunk?
>>
>> Hi Felix,   We know from experience that the advsimd intrinsics tend
>> to be fragile for big endian and in general it is fairly easy to break the big endian
>> case.  For these advsimd improvements that you are working on (that we very
>> much appreciate) it is important to run both little endian and big endian
>> regressions.
>>
>> Thanks
>> /Marcus
>
>
> Okay.  Any plan for the advsimd big-endian improvement?
> I rebased this patch over Alan Lawrance's patch: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00279.html
> No regressions for the aarch64_be-linux-gnu target either.  OK for the trunk?
>
>
> Index: gcc/ChangeLog
> ===================================================================
> --- gcc/ChangeLog       (revision 218464)
> +++ gcc/ChangeLog       (working copy)
> @@ -1,3 +1,18 @@
> +2014-12-09  Felix Yang  <felix.yang@huawei.com>
> +
> +       * config/aarch64/aarch64-simd.md (aarch64_<maxmin_uns>p<mode>): New
> +       pattern.
> +       * config/aarch64/aarch64-simd-builtins.def (smaxp, sminp, umaxp,
> +       uminp, smax_nanp, smin_nanp): New builtins.
> +       * config/aarch64/arm_neon.h (vpmax_s8, vpmax_s16, vpmax_s32,
> +       vpmax_u8, vpmax_u16, vpmax_u32, vpmaxq_s8, vpmaxq_s16, vpmaxq_s32,
> +       vpmaxq_u8, vpmaxq_u16, vpmaxq_u32, vpmax_f32, vpmaxq_f32, vpmaxq_f64,
> +       vpmaxqd_f64, vpmaxs_f32, vpmaxnm_f32, vpmaxnmq_f32, vpmaxnmq_f64,
> +       vpmaxnmqd_f64, vpmaxnms_f32, vpmin_s8, vpmin_s16, vpmin_s32, vpmin_u8,
> +       vpmin_u16, vpmin_u32, vpminq_s8, vpminq_s16, vpminq_s32, vpminq_u8,
> +       vpminq_u16, vpminq_u32, vpmin_f32, vpminq_f32, vpminq_f64, vpminqd_f64,
> +       vpmins_f32, vpminnm_f32, vpminnmq_f32, vpminnmq_f64, vpminnmqd_f64,
> +


>   __extension__ static __inline float32x2_t __attribute__ ((__always_inline__))
> Index: gcc/config/aarch64/aarch64-simd.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-simd.md  (revision 218464)
> +++ gcc/config/aarch64/aarch64-simd.md  (working copy)
> @@ -1017,6 +1017,28 @@
>     DONE;
>   })
>
> +;; Pairwise Integer Max/Min operations.
> +(define_insn "aarch64_<maxmin_uns>p<mode>"
> + [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
> +       (unspec:VDQ_BHSI [(match_operand:VDQ_BHSI 1 "register_operand" "w")
> +                        (match_operand:VDQ_BHSI 2 "register_operand" "w")]
> +                       MAXMINV))]
> + "TARGET_SIMD"
> + "<maxmin_uns_op>p\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
> +  [(set_attr "type" "neon_minmax<q>")]
> +)
> +

Hi Felix,

Sorry for the delay in getting back to you on this.

If you've rolled aarch64_reduc_<maxmin_uns>_internalv2si into the above
pattern, do you still need it? Could all of its call points simply be
pointed at aarch64_<maxmin_uns>p<mode> instead?
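
To illustrate why the two patterns overlap, here is a minimal scalar sketch of
the pairwise semantics for the 2 x 32-bit signed case. It is only a model, not
code from the patch: the function names pairwise_max_s32x2 and
reduce_max_s32x2 are hypothetical, and it assumes the V2SI reduction amounts
to a pairwise max with the same vector passed as both inputs.

```c
#include <stdint.h>

/* Scalar helper: maximum of two signed 32-bit values.  */
static int32_t max_s32(int32_t x, int32_t y)
{
  return x > y ? x : y;
}

/* Hypothetical model of a pairwise max on two 2 x int32 vectors:
   lane 0 of the result is the max of the adjacent pair in A,
   lane 1 is the max of the adjacent pair in B.  */
void pairwise_max_s32x2(int32_t out[2], const int32_t a[2], const int32_t b[2])
{
  /* Compute both lanes before storing, so OUT may alias A or B.  */
  int32_t lane0 = max_s32(a[0], a[1]);
  int32_t lane1 = max_s32(b[0], b[1]);
  out[0] = lane0;
  out[1] = lane1;
}

/* Assumption: a 2-lane max reduction can be modelled as the pairwise
   operation applied to the same vector twice, reading lane 0.  */
int32_t reduce_max_s32x2(const int32_t v[2])
{
  int32_t tmp[2];
  pairwise_max_s32x2(tmp, v, v);
  return tmp[0];
}
```

Under that assumption, the reduction is just a special case of the general
two-operand pattern, which is the motivation for the question above.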

Thanks,
Tejas.



