This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Simplify vec_merge of vec_duplicate with const_vector
- From: Jeff Law <law at redhat dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 27 Jun 2017 16:29:35 -0600
- Subject: Re: [PATCH] Simplify vec_merge of vec_duplicate with const_vector
- References: <5936670F.4040404@foss.arm.com>
On 06/06/2017 02:25 AM, Kyrill Tkachov wrote:
> Hi all,
>
> I'm trying to improve some of the RTL-level handling of vector lane
> operations on aarch64 and that
> involves dealing with a lot of vec_merge operations. One simplification
> that I noticed is missing
> from simplify-rtx is the combination of vec_merge with vec_duplicate.
> In this particular case:
> (vec_merge (vec_duplicate (X)) (const_vector [A, B]) (const_int N))
>
> which can be replaced with
>
> (vec_concat (X) (B)) if N == 1 (0b01) or
> (vec_concat (A) (X)) if N == 2 (0b10).
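
As an illustration of the rule quoted above (not part of the patch), here is a minimal C sketch that models two-lane vec_merge, vec_duplicate and vec_concat on plain integers and checks both cases. It assumes the documented vec_merge semantics, where a set bit i in the mask takes lane i from the first operand and a clear bit takes it from the second. The type `vec2` and all helper names are hypothetical and exist only for this model; none of this is GCC's internal API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-lane vector model; lane[0] is element 0.  */
typedef struct { int64_t lane[2]; } vec2;

/* Model of vec_duplicate: broadcast one scalar into both lanes.  */
static vec2 model_vec_duplicate (int64_t x)
{
  vec2 v = { { x, x } };
  return v;
}

/* Model of vec_concat on two scalars: first operand becomes lane 0.  */
static vec2 model_vec_concat (int64_t lo, int64_t hi)
{
  vec2 v = { { lo, hi } };
  return v;
}

/* Model of vec_merge: bit i of MASK set selects lane i from A,
   bit i clear selects lane i from B.  */
static vec2 model_vec_merge (vec2 a, vec2 b, unsigned mask)
{
  vec2 r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = ((mask >> i) & 1) ? a.lane[i] : b.lane[i];
  return r;
}

static int model_vec_equal (vec2 p, vec2 q)
{
  return p.lane[0] == q.lane[0] && p.lane[1] == q.lane[1];
}

/* Returns 1 if both rewrites agree with the vec_merge model for the
   scalar X and the constant vector [A, B].  */
int simplification_holds (int64_t x, int64_t a, int64_t b)
{
  vec2 dup = model_vec_duplicate (x);
  vec2 cst = model_vec_concat (a, b);	/* stands in for const_vector [A, B] */
  return model_vec_equal (model_vec_merge (dup, cst, 1),
			  model_vec_concat (x, b))
	 && model_vec_equal (model_vec_merge (dup, cst, 2),
			     model_vec_concat (a, x));
}

/* Spot-check a few values, including the all-zero constant vector
   used in the aarch64 example.  */
void check_simplification (void)
{
  assert (simplification_holds (5, 0, 0));
  assert (simplification_holds (-1, 7, 9));
  assert (simplification_holds (0, 3, 3));
}
```

Calling `check_simplification ()` from a `main` exercises both mask values; the model only covers the two-element case described in the mail.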
>
> For the aarch64 testcase in this patch this simplification allows us to
> try to combine:
> (set (reg:V2DI 77 [ x ])
>     (vec_concat:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64])
>         (const_int 0 [0])))
>
> instead of the more complex:
> (set (reg:V2DI 77 [ x ])
>     (vec_merge:V2DI (vec_duplicate:V2DI (mem:DI (reg:DI 0 x0 [ y ]) [1 *y_3(D)+0 S8 A64]))
>         (const_vector:V2DI [
>                 (const_int 0 [0])
>                 (const_int 0 [0])
>             ])
>         (const_int 1 [0x1])))
>
>
> For the simplified form above we already have an aarch64 pattern:
> *aarch64_combinez<mode> which
> is missing a DI/DFmode version due to an oversight, so this patch
> extends that pattern as well to
> use the VDC mode iterator that includes DI and DFmode (as well as V2HF
> which VD_BHSI was missing).
> The aarch64 hunk is needed to see the benefit of the simplify-rtx.c
> hunk, so I didn't split them
> into separate patches.
>
> Before this patch, for the testcase we'd generate:
> construct_lanedi:
> movi v0.4s, 0
> ldr x0, [x0]
> ins v0.d[0], x0
> ret
>
> construct_lanedf:
> movi v0.2d, 0
> ldr d1, [x0]
> ins v0.d[0], v1.d[0]
> ret
>
> but now we can generate:
> construct_lanedi:
> ldr d0, [x0]
> ret
>
> construct_lanedf:
> ldr d0, [x0]
> ret
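
For readers without the patch at hand, source along the following lines would plausibly produce the functions shown in the assembly above. This is a sketch only: it assumes the GCC vector extension (`vector_size`), and apart from the function names `construct_lanedi` and `construct_lanedf`, which appear in the assembly, the typedef names and the checking helper are hypothetical. It is not claimed to be the contents of construct_lane_zero_1.c.

```c
#include <assert.h>

/* 128-bit vectors of two 64-bit elements, via the GCC vector extension.
   The typedef names are illustrative.  */
typedef long long v2di __attribute__ ((vector_size (16)));
typedef double v2df __attribute__ ((vector_size (16)));

/* Lane 0 is loaded from memory, lane 1 is zero.  */
v2di
construct_lanedi (long long *y)
{
  v2di x = { y[0], 0 };
  return x;
}

v2df
construct_lanedf (double *y)
{
  v2df x = { y[0], 0.0 };
  return x;
}

/* Hypothetical helper: checks the lane contents on any target that
   supports the vector extension.  */
void
check_construct_lane (void)
{
  long long i = 42;
  double d = 1.5;
  v2di vi = construct_lanedi (&i);
  v2df vf = construct_lanedf (&d);
  assert (vi[0] == 42 && vi[1] == 0);
  assert (vf[0] == 1.5 && vf[1] == 0.0);
}
```

The helper only verifies values; whether a given compiler emits the single `ldr d0, [x0]` shown above depends on the target and on the patch being applied.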
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2017-06-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * simplify-rtx.c (simplify_ternary_operation, VEC_MERGE):
> Simplify vec_merge of vec_duplicate and const_vector.
> * config/aarch64/predicates.md (aarch64_simd_or_scalar_imm_zero):
> New predicate.
> * config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Use VDC
> mode iterator. Update predicate on operand 1 to
> handle non-const_vec constants. Delete constraints.
> (*aarch64_combinez_be<mode>): Likewise for operand 2.
>
> 2017-06-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
>
> * gcc.target/aarch64/construct_lane_zero_1.c: New test.
OK for the simplify-rtx parts.
jeff