[AArch64][SVE] Utilize ASRD instruction for division and remainder
Yuliang Wang
Yuliang.Wang@arm.com
Fri Sep 27 10:00:00 GMT 2019
Apologies for the accidental change. I have now added the underscore.
Regards
Yuliang
gcc/ChangeLog:
2019-09-27 Yuliang Wang <yuliang.wang@arm.com>
* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
New pattern for ASRD.
* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
* internal-fn.def (IFN_DIV_POW2): New internal function.
* optabs.def (sdiv_pow2_optab): New optab.
* tree-vect-patterns.c (vect_recog_divmod_pattern):
Modify pattern to support new operation.
* doc/md.texi (sdiv_pow2$var{m3}): Documentation for the above.
* doc/sourcebuild.texi (vect_sdiv_pow2_si):
Document new target selector.
gcc/testsuite/ChangeLog:
2019-09-27 Yuliang Wang <yuliang.wang@arm.com>
* gcc.dg/vect/vect-sdiv-pow2-1.c: New test.
* gcc.target/aarch64/sve/asrdiv_1.c: As above.
* lib/target-support.exp (check_effective_target_vect_sdiv_pow2_si):
Return true for AArch64 with SVE.
-----Original Message-----
From: Yuliang Wang
Sent: 27 September 2019 10:37
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: nd <nd@arm.com>; gcc-patches@gcc.gnu.org
Subject: RE: [AArch64][SVE] Utilize ASRD instruction for division and remainder
Hi Richard,
I have renamed the optabs and associated identifiers as per your suggestion. Thanks.
Regards
Yuliang
gcc/ChangeLog:
2019-09-27 Yuliang Wang <yuliang.wang@arm.com>
* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
New pattern for ASRD.
* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
* internal-fn.def (IFN_DIV_POW2): New internal function.
* optabs.def (sdiv_pow2_optab): New optab.
* tree-vect-patterns.c (vect_recog_divmod_pattern):
Modify pattern to support new operation.
* doc/md.texi (sdiv_pow2$var{m3}): Documentation for the above.
* doc/sourcebuild.texi (vect_sdivpow2_si): Document new target selector.
gcc/testsuite/ChangeLog:
2019-09-27 Yuliang Wang <yuliang.wang@arm.com>
* gcc.dg/vect/vect-sdivpow2-1.c: New test.
* gcc.target/aarch64/sve/asrdiv_1.c: As above.
* lib/target-support.exp (check_effective_target_vect_sdivpow2_si):
Return true for AArch64 with SVE.
-----Original Message-----
From: Richard Sandiford <richard.sandiford@arm.com>
Sent: 24 September 2019 17:12
To: Yuliang Wang <Yuliang.Wang@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
Subject: Re: [AArch64][SVE] Utilize ASRD instruction for division and remainder
Yuliang Wang <Yuliang.Wang@arm.com> writes:
> Hi,
>
> The C snippets below (signed division/modulo by a power-of-2 immediate value):
>
> #define P ...
>
> void foo_div (int *a, int *b, int N)
> {
>   for (int i = 0; i < N; i++)
>     a[i] = b[i] / (1 << P);
> }
> void foo_mod (int *a, int *b, int N)
> {
>   for (int i = 0; i < N; i++)
>     a[i] = b[i] % (1 << P);
> }
>
> Vectorize to the following on AArch64 + SVE:
>
> foo_div:
>         mov     x0, 0
>         mov     w2, N
>         ptrue   p1.b, all
>         whilelo p0.s, wzr, w2
>         .p2align 3,,7
> .L2:
>         ld1w    z1.s, p0/z, [x3, x0, lsl 2]
>         cmplt   p2.s, p1/z, z1.s, #0    //
>         mov     z0.s, p2/z, #7          //
>         add     z0.s, z0.s, z1.s        //
>         asr     z0.s, z0.s, #3          //
>         st1w    z0.s, p0, [x1, x0, lsl 2]
>         incw    x0
>         whilelo p0.s, w0, w2
>         b.any   .L2
>         ret
>
> foo_mod:
> ...
> .L2:
>         ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>         cmplt   p2.s, p1/z, z0.s, #0    //
>         mov     z1.s, p2/z, #-1         //
>         lsr     z1.s, z1.s, #29         //
>         add     z0.s, z0.s, z1.s        //
>         and     z0.s, z0.s, #{2^P-1}    //
>         sub     z0.s, z0.s, z1.s        //
>         st1w    z0.s, p0, [x1, x0, lsl 2]
>         incw    x0
>         whilelo p0.s, w0, w2
>         b.any   .L2
>         ret
>
> This patch utilizes the special-purpose ASRD (arithmetic shift-right for divide by immediate) instruction:
>
> foo_div:
> ...
> .L2:
>         ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>         asrd    z0.s, p1/m, z0.s, #{P}  //
>         st1w    z0.s, p0, [x1, x0, lsl 2]
>         incw    x0
>         whilelo p0.s, w0, w2
>         b.any   .L2
>         ret
>
> foo_mod:
> ...
> .L2:
>         ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>         movprfx z1, z0                  //
>         asrd    z1.s, p1/m, z1.s, #{P}  //
>         lsl     z1.s, z1.s, #{P}        //
>         sub     z0.s, z0.s, z1.s        //
>         st1w    z0.s, p0, [x1, x0, lsl 2]
>         incw    x0
>         whilelo p0.s, w0, w2
>         b.any   .L2
>         ret
>
> Added new tests. Built and regression tested on aarch64-none-elf.
>
> Best Regards,
> Yuliang Wang
>
>
> gcc/ChangeLog:
>
> 2019-09-23 Yuliang Wang <yuliang.wang@arm.com>
>
> * config/aarch64/aarch64-sve.md (asrd<mode>3): New pattern for ASRD.
> * config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
> (ASRDIV): New int iterator.
> * internal-fn.def (IFN_ASHR_DIV): New internal function.
> * optabs.def (ashr_div_optab): New optab.
> * tree-vect-patterns.c (vect_recog_divmod_pattern):
> Modify pattern to support new operation.
> * doc/md.texi (asrd$var{m3}): Documentation for the above.
> * doc/sourcebuild.texi (vect_asrdiv_si): Document new target selector.
This looks good to me. My only real question is about naming:
maybe IFN_DIV_POW2 would be a better name for the internal function and sdiv_pow2_optab/"div_pow2$a3" for the optab? But I'm useless at naming things, so maybe others would prefer your names.
Thanks,
Richard
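
For readers following the thread, the arithmetic behind the quoted sequences can be sketched in scalar C. This is only an illustrative model, not code from the patch: the helper names div_pow2_model and mod_pow2_model are hypothetical, and the sketch assumes that >> on a negative int32_t is an arithmetic shift (implementation-defined in ISO C, but the behaviour GCC provides). The first helper mirrors the bias-then-shift idea (add 2^P - 1 to negative inputs, then shift right), which gives the round-toward-zero quotient that ASRD is described as producing; the second recovers the remainder by subtracting the quotient scaled back up by 2^P, as in the asrd/lsl/sub sequence.

```c
#include <stdint.h>

/* Hypothetical scalar model of signed division by 2^p, rounding toward
   zero.  Assumes 1 <= p <= 30 and that >> on a negative int32_t is an
   arithmetic shift (an assumption: implementation-defined in ISO C).  */
static int32_t
div_pow2_model (int32_t x, unsigned p)
{
  /* Negative inputs get a bias of 2^p - 1 so that the shift rounds
     toward zero instead of toward minus infinity.  */
  int32_t bias = x < 0 ? (int32_t) ((1u << p) - 1) : 0;
  return (x + bias) >> p;
}

/* Hypothetical scalar model of the remainder: x - (x / 2^p) * 2^p.
   Multiplication is used instead of << because left-shifting a negative
   signed value is undefined behaviour in C.  */
static int32_t
mod_pow2_model (int32_t x, unsigned p)
{
  int32_t q = div_pow2_model (x, p);
  return x - q * ((int32_t) 1 << p);
}
```

Under those assumptions, div_pow2_model (-9, 3) and mod_pow2_model (-9, 3) agree with the C expressions -9 / 8 and -9 % 8, and the same holds for non-negative inputs, where the bias is zero.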
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rb11863.patch
Type: application/octet-stream
Size: 12491 bytes
Desc: rb11863.patch
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20190927/f982b6be/attachment.obj>