This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: [AArch64][SVE] Utilize ASRD instruction for division and remainder


Apologies for the accidental change; I have now added the underscore.

Regards
Yuliang


gcc/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.wang@arm.com>

	* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
	New pattern for ASRD.
	* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
	* internal-fn.def (IFN_DIV_POW2): New internal function.
	* optabs.def (sdiv_pow2_optab): New optab.
	* tree-vect-patterns.c (vect_recog_divmod_pattern):
	Modify pattern to support new operation.
	* doc/md.texi (sdiv_pow2$var{m3}): Documentation for the above.
	* doc/sourcebuild.texi (vect_sdiv_pow2_si):
	Document new target selector.

gcc/testsuite/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.wang@arm.com>

	* gcc.dg/vect/vect-sdiv-pow2-1.c: New test.
	* gcc.target/aarch64/sve/asrdiv_1.c: As above.
	* lib/target-support.exp (check_effective_target_vect_sdiv_pow2_si):
	Return true for AArch64 with SVE.


-----Original Message-----
From: Yuliang Wang 
Sent: 27 September 2019 10:37
To: Richard Sandiford <richard.sandiford@arm.com>
Cc: nd <nd@arm.com>; gcc-patches@gcc.gnu.org
Subject: RE: [AArch64][SVE] Utilize ASRD instruction for division and remainder

Hi Richard,

I have renamed the optabs and associated identifiers as per your suggestion. Thanks.

Regards
Yuliang


gcc/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.wang@arm.com>

	* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
	New pattern for ASRD.
	* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
	* internal-fn.def (IFN_DIV_POW2): New internal function.
	* optabs.def (sdiv_pow2_optab): New optab.
	* tree-vect-patterns.c (vect_recog_divmod_pattern):
	Modify pattern to support new operation.
	* doc/md.texi (sdiv_pow2$var{m3}): Documentation for the above.
	* doc/sourcebuild.texi (vect_sdivpow2_si): Document new target selector.

gcc/testsuite/ChangeLog:

2019-09-27  Yuliang Wang  <yuliang.wang@arm.com>

	* gcc.dg/vect/vect-sdivpow2-1.c: New test.
	* gcc.target/aarch64/sve/asrdiv_1.c: As above.
	* lib/target-support.exp (check_effective_target_vect_sdivpow2_si):
	Return true for AArch64 with SVE.


-----Original Message-----
From: Richard Sandiford <richard.sandiford@arm.com> 
Sent: 24 September 2019 17:12
To: Yuliang Wang <Yuliang.Wang@arm.com>
Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
Subject: Re: [AArch64][SVE] Utilize ASRD instruction for division and remainder

Yuliang Wang <Yuliang.Wang@arm.com> writes:
> Hi,
>
> The C snippets below (signed division/modulo by a power-of-2 immediate value):
>
> #define P ...
>
> void foo_div (int *a, int *b, int N)
> {
>     for (int i = 0; i < N; i++)
>         a[i] = b[i] / (1 << P);
> }
> void foo_mod (int *a, int *b, int N)
> {
>     for (int i = 0; i < N; i++)
>         a[i] = b[i] % (1 << P);
> }
>
> Vectorize to the following on AArch64 + SVE:
>
> foo_div:
>     mov     x0, 0
>     mov     w2, N
>     ptrue   p1.b, all
>     whilelo p0.s, wzr, w2
>     .p2align 3,,7
> .L2:
>     ld1w    z1.s, p0/z, [x3, x0, lsl 2]
>     cmplt   p2.s, p1/z, z1.s, #0        //
>     mov     z0.s, p2/z, #7              //
>     add     z0.s, z0.s, z1.s            //
>     asr     z0.s, z0.s, #3              //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> foo_mod:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     cmplt   p2.s, p1/z, z0.s, #0        //
>     mov     z1.s, p2/z, #-1             //
>     lsr     z1.s, z1.s, #29             //
>     add     z0.s, z0.s, z1.s            //
>     and     z0.s, z0.s, #{2^P-1}        //
>     sub     z0.s, z0.s, z1.s            //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> This patch utilizes the special-purpose ASRD (arithmetic shift-right for divide by immediate) instruction:
>
> foo_div:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     asrd    z0.s, p1/m, z0.s, #{P}      //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> foo_mod:
>     ...
> .L2:
>     ld1w    z0.s, p0/z, [x3, x0, lsl 2]
>     movprfx z1, z0                      //
>     asrd    z1.s, p1/m, z1.s, #{P}      //
>     lsl     z1.s, z1.s, #{P}            //
>     sub     z0.s, z0.s, z1.s            //
>     st1w    z0.s, p0, [x1, x0, lsl 2]
>     incw    x0
>     whilelo p0.s, w0, w2
>     b.any   .L2
>     ret
>
> Added new tests. Built and regression tested on aarch64-none-elf.
>
> Best Regards,
> Yuliang Wang
>
>
> gcc/ChangeLog:
>
> 2019-09-23  Yuliang Wang  <yuliang.wang@arm.com>
>
> * config/aarch64/aarch64-sve.md (asrd<mode>3): New pattern for ASRD.
> * config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
> (ASRDIV): New int iterator.
> * internal-fn.def (IFN_ASHR_DIV): New internal function.
> * optabs.def (ashr_div_optab): New optab.
> * tree-vect-patterns.c (vect_recog_divmod_pattern):
> Modify pattern to support new operation.
> * doc/md.texi (asrd$var{m3}): Documentation for the above.
> * doc/sourcebuild.texi (vect_asrdiv_si): Document new target selector.

This looks good to me.  My only real question is about naming:
maybe IFN_DIV_POW2 would be a better name for the internal function and sdiv_pow2_optab/"div_pow2$a3" for the optab?  But I'm useless at naming things, so maybe others would prefer your names.

Thanks,
Richard
 

Attachment: rb11863.patch
Description: rb11863.patch
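
For readers following the thread, the per-lane arithmetic behind the sequences quoted above can be sketched in scalar C. This is only an illustration, not code from the patch: the function names (asrd_model, mod_pow2_mask, mod_pow2_asrd) are made up here, P is fixed at 3 to match the "#7" / "#3" / "#29" immediates in the quoted assembly, and the sketch assumes that right-shifting a negative int32_t is an arithmetic shift (implementation-defined in ISO C, but that is what GCC does).

```c
#include <stdint.h>

#define P 3   /* hypothetical shift amount; must be in the range 1..31 */

/* Scalar model of one 32-bit lane of ASRD, which is also what the
   original cmplt/mov/add/asr sequence computes: add 2^P - 1 to
   negative inputs so that the arithmetic shift rounds toward zero.
   Assumes >> on a negative int32_t is an arithmetic shift.  */
int32_t asrd_model(int32_t x)
{
    int32_t bias = (x < 0) ? (int32_t)((1u << P) - 1) : 0;
    return (x + bias) >> P;
}

/* Original remainder sequence (cmplt/mov #-1/lsr/add/and/sub):
   t is 2^P - 1 for negative x and 0 otherwise.  */
int32_t mod_pow2_mask(int32_t x)
{
    uint32_t t = ((x < 0) ? UINT32_MAX : 0u) >> (32 - P);
    int32_t low = (int32_t)(((uint32_t)x + t) & ((1u << P) - 1));
    return low - (int32_t)t;
}

/* Remainder via ASRD (asrd/lsl/sub): x - (x / 2^P) * 2^P.
   The multiplication stands in for LSL because left-shifting a
   negative signed value is undefined in C.  */
int32_t mod_pow2_asrd(int32_t x)
{
    return x - asrd_model(x) * (int32_t)(1u << P);
}
```

Under those assumptions, asrd_model(x) should agree with x / (1 << P), and both remainder variants with x % (1 << P), for every int32_t x, including negative values where a plain arithmetic shift would round the wrong way.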

