This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][AArch64] Support for LDP/STP of Q-registers
- From: Andrew Pinski <pinskia at gmail dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Marcus Shawcroft <marcus dot shawcroft at arm dot com>, Richard Earnshaw <richard dot earnshaw at arm dot com>, James Greenhalgh <james dot greenhalgh at arm dot com>, Siddhesh Poyarekar <siddhesh at sourceware dot org>, sameera dot deshpande at linaro dot org, sellcey at cavium dot com
- Date: Tue, 5 Jun 2018 09:45:06 -0700
- Subject: Re: [PATCH][AArch64] Support for LDP/STP of Q-registers
- References: <5B157981.3010408@foss.arm.com> <5B16BB06.3020709@foss.arm.com>
On Tue, Jun 5, 2018 at 9:32 AM Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
>
> On 04/06/18 18:40, Kyrill Tkachov wrote:
> > Hi all,
> >
> > This patch adds support for generating LDPs and STPs of Q-registers.
> > This allows for more compact code generation and makes better use of the ISA.
> >
> > It's implemented in a straightforward way by allowing 16-byte modes in the
> > sched-fusion machinery and adding appropriate peepholes in aarch64-ldpstp.md
> > as well as the patterns themselves in aarch64-simd.md.
> >
> > I didn't see any non-noise performance effect on SPEC2017 on Cortex-A72 and Cortex-A53.
> >
>
> Adding some folks who know more about other CPUs as well.
> Are you okay with enabling these instructions in AArch64?
>
> If you could give this a spin on some benchmarks you
> care about on your platforms it would be really useful data.
It might be useful to have an aarch64-tuning-flags.def entry for this, even
if all current cores have it on.
I might do some performance analysis on OcteonTX 81xx and 83xx (aka
thunderxt81 and thunderxt83), but it won't be until the end of June as I am
on vacation until then.
Thanks,
Andrew
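
To make the tuning-flag suggestion concrete, here is a minimal, self-contained
sketch of how a per-core flag could gate Q-register LDP/STP. It is not taken
from the patch. The option names ("other_option", "no_ldp_stp_qregs"), the
TUNE_* enumerators, struct cpu_tune and allow_ldp_stp_qregs are all
hypothetical, and the list macro merely stands in for a .def file. The sketch
assumes an opt-out flag, so that every core without the bit set keeps the new
behaviour.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a .def file: one X (...) entry per tuning
   option.  The names are illustrative only, not GCC's.  */
#define TUNING_OPTIONS(X)                    \
  X ("other_option", OTHER_OPTION)           \
  X ("no_ldp_stp_qregs", NO_LDP_STP_QREGS)

/* First expansion: give every option a bit index.  */
enum tune_index
{
#define X(str, name) TUNE_##name##_index,
  TUNING_OPTIONS (X)
#undef X
  TUNE_index_END
};

/* Second expansion: turn every index into a bit mask.  */
enum tune_flags
{
  TUNE_NONE = 0,
#define X(str, name) TUNE_##name = 1u << TUNE_##name##_index,
  TUNING_OPTIONS (X)
#undef X
};

/* Hypothetical per-core tuning record.  */
struct cpu_tune
{
  const char *name;
  unsigned int extra_flags;
};

/* Q-register pairs are allowed unless the core opts out.  */
static bool
allow_ldp_stp_qregs (const struct cpu_tune *tune)
{
  return (tune->extra_flags & TUNE_NO_LDP_STP_QREGS) == 0;
}
```

With an opt-out flag like this, a core that turns out to be slow on
Q-register pairs could set the bit in its own tuning record without affecting
any other core.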
>
> Thanks,
> Kyrill
>
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > Ok for trunk?
> >
> > Thanks,
> > Kyrill
> >
> > 2018-06-04 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> >
> > * config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
> > Allow 16-byte modes.
> > (aarch64_classify_address): Allow 16-byte modes for load_store_pair_p.
> > * config/aarch64/aarch64-ldpstp.md: Add peepholes for LDP STP of
> > 128-bit modes.
> > * config/aarch64/aarch64-simd.md (load_pair<VQ:mode><VQ2:mode>):
> > New pattern.
> > (vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
> > * config/aarch64/iterators.md (VQ2): New mode iterator.
> >
> > 2018-06-04 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
> >
> > * gcc.target/aarch64/ldp_stp_q.c: New test.
> > * gcc.target/aarch64/stp_vec_128_1.c: Likewise.
>
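
For readers unfamiliar with the transformation discussed above, the following
is a small illustrative C function of the kind that could benefit. It is not
one of the tests added by the patch, and the names v4si and copy_two_q are
made up for this sketch. It uses the GCC vector extension to load two adjacent
16-byte values and store them to adjacent locations. On AArch64, a compiler
with Q-register pair support might combine the two loads into a single
"ldp q0, q1, [x1]" and the two stores into a single "stp q0, q1, [x0]",
although the exact output depends on the compiler version, options and tuning.

```c
#include <stdint.h>

/* 128-bit vector of four 32-bit integers (GCC/Clang vector extension).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Copy two adjacent 16-byte vectors from SRC to DST.  The two loads read
   consecutive addresses, as do the two stores, which is the access shape
   that load/store-pair instructions cover.  */
void
copy_two_q (v4si *dst, const v4si *src)
{
  v4si a = src[0];
  v4si b = src[1];
  dst[0] = a;
  dst[1] = b;
}
```

Compiling this for an AArch64 target at -O2 and inspecting the assembly is one
way to check whether a given compiler pairs the accesses.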