This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][GCC][AArch64] Vectorise __builtin_signbit on aarch64


Hi,

Przemyslaw Wirkus <Przemyslaw.Wirkus@arm.com> writes:
> Hi all,
>
> Vectorise __builtin_signbit (v4sf) using the vector unsigned shift
> right (USHR) instruction.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Assembly output for:
> $ aarch64-elf-gcc -S -O3 signbitv4sf.c -dp
>
> Before patch:
>
> foo:
> 	adrp	x3, in	// 37	[c=4 l=4]  *movdi_aarch64/12
> 	adrp	x2, out	// 40	[c=4 l=4]  *movdi_aarch64/12
> 	add	x3, x3, :lo12:in	// 39	[c=4 l=4]  add_losym_di
> 	add	x2, x2, :lo12:out	// 42	[c=4 l=4]  add_losym_di
> 	mov	x0, 0	// 3	[c=4 l=4]  *movdi_aarch64/3
> 	.p2align 3,,7
> .L2:
> 	ldr	w1, [x3, x0]	// 10	[c=16 l=4]  *zero_extendsidi2_aarch64/1
> 	and	w1, w1, -2147483648	// 11	[c=4 l=4]  andsi3/1
> 	str	w1, [x2, x0]	// 16	[c=4 l=4]  *movsi_aarch64/8
> 	add	x0, x0, 4	// 17	[c=4 l=4]  *adddi3_aarch64/0
> 	cmp	x0, 4096	// 19	[c=4 l=4]  cmpdi/1
> 	bne	.L2		// 20	[c=4 l=4]  condjump
> 	ret		// 50	[c=0 l=4]  *do_return
>
> After patch:
>
> foo:
> 	adrp	x2, in	// 36	[c=4 l=4]  *movdi_aarch64/12
> 	adrp	x1, out	// 39	[c=4 l=4]  *movdi_aarch64/12
> 	add	x2, x2, :lo12:in	// 38	[c=4 l=4]  add_losym_di
> 	add	x1, x1, :lo12:out	// 41	[c=4 l=4]  add_losym_di
> 	mov	x0, 0	// 3	[c=4 l=4]  *movdi_aarch64/3
> 	.p2align 3,,7
> .L2:
> 	ldr	q0, [x2, x0]	// 10	[c=8 l=4]  *aarch64_simd_movv4sf/0
> 	ushr	v0.4s, v0.4s, 31	// 11	[c=12 l=4]  aarch64_simd_lshrv4si
> 	str	q0, [x1, x0]	// 15	[c=4 l=4]  *aarch64_simd_movv4si/2
> 	add	x0, x0, 16	// 16	[c=4 l=4]  *adddi3_aarch64/0
> 	cmp	x0, 4096	// 18	[c=4 l=4]  cmpdi/1
> 	bne	.L2		// 19	[c=4 l=4]  condjump
> 	ret		// 49	[c=0 l=4]  *do_return
>
> Thanks,
> Przemyslaw
>
> gcc/ChangeLog:
>
> 2019-03-20  Przemyslaw Wirkus  <przemyslaw.wirkus@arm.com>
>
> 	* config/aarch64/aarch64-builtins.c
> 	(aarch64_builtin_vectorized_function): Add CASE_CFN_SIGNBIT.
> 	* config/aarch64/aarch64-simd-builtins.def (signbit): Extend to
> 	V4SF mode.
> 	* config/aarch64/aarch64-simd.md (signbitv4sf2): New expand.
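
For readers without the test file to hand, the loop behind the quoted assembly can be sketched roughly as below. The contents of signbitv4sf.c are not shown in the thread, so this is a reconstruction under stated assumptions: the array length of 1024 is inferred from the "cmp x0, 4096" bound with 4-byte elements, the names in, out and foo are taken from the assembly, and __builtin_signbitf is used to make the float argument explicit. The helpers sign_by_shift and check_agreement are hypothetical and only model one USHR lane in scalar code. Note that the scalar "and" in the old output keeps the sign bit in place while the vector "ushr" moves it down to bit 0; both are nonzero exactly when the sign bit is set, which is all that signbit promises.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 4096 bytes / 4-byte elements, inferred from "cmp x0, 4096". */
#define N 1024

float in[N];
int out[N];

/* Hypothetical reconstruction of the loop in signbitv4sf.c. */
void foo(void)
{
    for (int i = 0; i < N; i++)
        out[i] = __builtin_signbitf(in[i]);
}

/* Scalar model of one USHR lane: reinterpret the float as a 32-bit
   integer and shift logically right by 31, leaving only the sign bit. */
static inline uint32_t sign_by_shift(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits >> 31;
}

/* Check that the truth value of out[i] matches the shifted sign bit
   for every element, regardless of which nonzero value foo stored. */
void check_agreement(void)
{
    for (int i = 0; i < N; i++)
        assert((out[i] != 0) == (sign_by_shift(in[i]) == 1u));
}
```

Calling foo() and then check_agreement() after filling in with a mix of positive, negative and signed-zero values should pass whether or not the loop was vectorised, since only the truth value of each result is compared.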

I think it'd be better to add a new IFN_SIGNBIT internal function
that maps to signbit_optab.  That way the compiler will know what
the vector function does and there'll be no need to add a new
built-in function.

Thanks,
Richard

