This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [AArch64] Handle HFAs of float16 types properly
- From: James Greenhalgh <james dot greenhalgh at arm dot com>
- To: <gcc-patches at gcc dot gnu dot org>
- Cc: <nd at arm dot com>, <richard dot earnshaw at arm dot com>, <marcus dot shawcroft at arm dot com>
- Date: Thu, 4 Aug 2016 11:30:52 +0100
- Subject: Re: [AArch64] Handle HFAs of float16 types properly
- References: <1469541302-17088-1-git-send-email-james.greenhalgh@arm.com>
On Tue, Jul 26, 2016 at 02:55:02PM +0100, James Greenhalgh wrote:
>
> Hi,
>
> It looks like we've not been handling structures of 16-bit floating-point
> data correctly for AArch64. For some reason we end up passing them
> packed into integer registers. That is to say, on trunk and GCC 6, for:
>
>   struct x {
>     __fp16 x[4];
>   };
>
>   __fp16
>   foo1 (struct x x)
>   {
>     return x.x[1];
>   }
>
> We generate:
>
>   foo1:
>       sbfx    x0, x0, 16, 16
>       mov     v0.h[0], w0
>       ret
>
> Which is wrong.
>
> This patch fixes that, so now we generate:
>
>   foo1:
>       umov    w0, v1.h[0]
>       sxth    x0, w0
>       mov     v0.h[0], w0
>       ret
>
> Far from optimal (I'll work on that...) but at least getting the data from
> the right register bank!
>
> To do this we need to keep around a reference to the fp16 type after we
> construct it. I've moved this initialisation to a new function
> aarch64_init_fp16_types in aarch64-builtins.c and made the references
> available through aarch64.h.
>
> After that, we want to remove the #if 0 wrapping HFmode support in
> aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the
> REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate.
>
> Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE.
> We can't build complex forms of __fp16. But, were we ever to support the
> _Float16 type we'd need this. Rather than leave the chance it will be
> forgotten about, I've just added it here. If the maintainers would prefer,
> I can change this to a TODO and put a sticky-note somewhere near my desk.
>
> With those simple changes, we fix the argument passing. The rest of the
> patch is an update to the various testcases in aapcs64.exp to fully cover
> various __fp16 cases (both naked, and within an HFA).
>
> Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also
> tested on aarch64_be-none-elf. All tests came back clean.
>
> OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> though it will apply cleanly there if the maintainers support that.
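
To illustrate the rule the quoted description relies on: under the AAPCS64, an aggregate is a candidate for the FP/SIMD registers when it has one to four members that all share the same floating-point base type, and the patch makes half-precision count as such a type. Below is a minimal sketch of that check over an already flattened member list. The names base_type, BT_* and hfa_element_count are hypothetical and exist only for this example; this is not the code in aapcs_vfp_sub_candidate, which walks GCC's type trees.

```c
#include <stddef.h>

/* Hypothetical base-type tags for this sketch; these are not GCC's
   machine modes.  */
enum base_type { BT_INT, BT_HALF, BT_SINGLE, BT_DOUBLE };

/* Return the element count if the flattened member list forms a
   homogeneous floating-point aggregate (1 to 4 members, all of the
   same floating-point base type), or -1 otherwise.  */
static int
hfa_element_count (const enum base_type *members, size_t n)
{
  if (n < 1 || n > 4)
    return -1;

  /* The first member fixes the base type; it must be floating-point.  */
  if (members[0] == BT_INT)
    return -1;

  /* Every remaining member must match the first one.  */
  for (size_t i = 1; i < n; i++)
    if (members[i] != members[0])
      return -1;

  return (int) n;
}
```

With this model, the struct x above (four BT_HALF members) yields 4, while a list mixing BT_HALF and BT_SINGLE yields -1.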
*Ping*
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01720.html
Thanks,
James
>
> gcc/
>
> 2016-07-26 James Greenhalgh <james.greenhalgh@arm.com>
>
> * config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare.
> (aarch64_fp16_ptr_type_node): Likewise.
> * config/aarch64/aarch64-builtins.c
> (aarch64_fp16_ptr_type_node): Define.
> (aarch64_init_fp16_types): New, refactored out of...
> (aarch64_init_builtins): ...here, update to call
> aarch64_init_fp16_types.
> * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle
> HFmode.
> (aapcs_vfp_sub_candidate): Likewise.
>
> gcc/testsuite/
>
> 2016-07-26 James Greenhalgh <james.greenhalgh@arm.com>
>
> * gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision
> registers.
> * gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for
> saving the half-precision registers.
> * gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16
> value is returned in h0.
> * gcc.target/aarch64/aapcs64/test_2.c: Check that __fp16 arguments
> are passed in FP/SIMD registers.
> * gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA
> passing works correctly.
> * gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New.
> (hfa_f16x2_t): Likewise.
> (hfa_f16x3_t): Likewise.
> * gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values
> are promoted to double and passed in a double register.
> * gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values
> are promoted to double and stacked.
> * gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of
> __fp16 data types.
> * gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise.
> * gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of
> __fp16 first get passed in FP/SIMD registers, then stacked.
>
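
The va_arg-1.c and va_arg-2.c entries above depend on a narrow floating-point value being promoted to double when it is passed through "...". As a portable sketch of what that means for the callee, the example below uses float as a stand-in, since __fp16 is only available on some targets. The function sum_promoted is hypothetical and is not part of the aapcs64 testsuite.

```c
#include <stdarg.h>

/* Sum COUNT variadic floating-point arguments.  Each one is read as
   double, because the default argument promotions have already widened
   it by the time it reaches the callee.  */
static double
sum_promoted (int count, ...)
{
  va_list ap;
  double total = 0.0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, double);
  va_end (ap);

  return total;
}
```

A call such as sum_promoted (3, 1.0f, 2.0f, 0.5f) passes three float arguments, yet the callee must fetch each of them with va_arg (ap, double).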