[PATCH][GCC][AArch64] optimize float immediate moves (2 /4) - HF/DF/SF mode.
Tamar Christina
Tamar.Christina@arm.com
Mon Jun 26 10:50:00 GMT 2017
Hi all,
Here's the re-spun patch.
Aside from grouping the split patterns, it now also uses the h register for the fmov in HF mode when available;
otherwise it forces a literal load.
Regression tested on aarch64-none-linux-gnu with no regressions.
OK for trunk?
Thanks,
Tamar
gcc/
2017-06-26 Tamar Christina <tamar.christina@arm.com>
Richard Sandiford <richard.sandiford@linaro.org>
* config/aarch64/aarch64.md (mov<mode>): Generalize.
(*movhf_aarch64, *movsf_aarch64, *movdf_aarch64):
Add integer and movi cases.
(movi-split-hf-df-sf split, fp16): New.
(enabled): Add TARGET_FP_F16INST.
* config/aarch64/iterators.md (GPF_HF): New.
________________________________________
From: Tamar Christina
Sent: Wednesday, June 21, 2017 11:48:33 AM
To: James Greenhalgh
Cc: GCC Patches; nd; Marcus Shawcroft; Richard Earnshaw
Subject: RE: [PATCH][GCC][AArch64] optimize float immediate moves (2 /4) - HF/DF/SF mode.
> > movi\\t%0.4h, #0
> > - mov\\t%0.h[0], %w1
> > + fmov\\t%s0, %w1
>
> Should this not be %h0?
The problem is that H registers are only available in ARMv8.2+.
I'm not sure what to do about ARMv8.1, given your other feedback
pointing out that the bit pattern differs depending on whether the value
is stored in an s or an h register.
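To illustrate the bit-pattern concern, here is a minimal standalone C sketch.
It is not part of the patch, and the helper names float_bits and
half_bits_from_float are made up for illustration. It shows that the same
value, e.g. 1.0, is encoded differently in single and half precision, so a
move that writes the single-precision encoding does not leave the intended
HF value in the low 16 bits. The conversion only handles normal values that
are exactly representable in half precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the IEEE binary32 bit pattern of F.  */
uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Return the IEEE binary16 bit pattern for F.  Illustrative only:
   F must be a normal value exactly representable in half precision.  */
uint16_t
half_bits_from_float (float f)
{
  uint32_t u = float_bits (f);
  uint32_t sign = u >> 31;
  /* Rebias the exponent from 127 (binary32) to 15 (binary16).  */
  int32_t exp = (int32_t) ((u >> 23) & 0xff) - 127 + 15;
  uint32_t mant = u & 0x7fffff;

  assert (exp > 0 && exp < 31);   /* Normal half-precision range only.  */
  assert ((mant & 0x1fff) == 0);  /* No bits lost when dropping 13 bits.  */

  return (uint16_t) ((sign << 15) | ((uint32_t) exp << 10) | (mant >> 13));
}
```

For example, half_bits_from_float (1.0f) yields 0x3c00, while
float_bits (1.0f) is 0x3f800000, whose low 16 bits are 0x0000.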
>
> > umov\\t%w0, %1.h[0]
> > mov\\t%0.h[0], %1.h[0]
> > + fmov\\t%s0, %1
>
> Likewise, and much more important for correctness as it changes the way the
> bit pattern ends up in the register (see table C2-1 in release B.a of the ARM
> Architecture Reference Manual for ARMv8-A), here.
>
> > + * return aarch64_output_scalar_simd_mov_immediate (operands[1],
> > + SImode);
> > ldr\\t%h0, %1
> > str\\t%h1, %0
> > ldrh\\t%w0, %1
> > strh\\t%w1, %0
> > mov\\t%w0, %w1"
> > - [(set_attr "type" "neon_move,neon_from_gp,neon_to_gp,neon_move,\
> > - f_loads,f_stores,load1,store1,mov_reg")
> > - (set_attr "simd" "yes,yes,yes,yes,*,*,*,*,*")]
> > + "&& can_create_pseudo_p ()
> > + && !aarch64_can_const_movi_rtx_p (operands[1], HFmode)
> > + && !aarch64_float_const_representable_p (operands[1])
> > + && aarch64_float_const_rtx_p (operands[1])"
> > + [(const_int 0)]
> > + "{
> > + unsigned HOST_WIDE_INT ival;
> > + if (!aarch64_reinterpret_float_as_int (operands[1], &ival))
> > + FAIL;
> > +
> > + rtx tmp = gen_reg_rtx (SImode);
> > + aarch64_expand_mov_immediate (tmp, GEN_INT (ival));
> > + tmp = simplify_gen_subreg (HImode, tmp, SImode, 0);
> > + emit_move_insn (operands[0], gen_lowpart (HFmode, tmp));
> > + DONE;
> > + }"
> > + [(set_attr "type" "neon_move,f_mcr,neon_to_gp,neon_move,fconsts, \
> > + neon_move,f_loads,f_stores,load1,store1,mov_reg")
> > + (set_attr "simd" "yes,*,yes,yes,*,yes,*,*,*,*,*")]
> > )
>
> Thanks,
> James
-------------- next part --------------
A non-text attachment was scrubbed...
Name: float-mov_modes-2.patch
Type: text/x-patch
Size: 6529 bytes
Desc: float-mov_modes-2.patch
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170626/9de1bfcb/attachment.bin>