This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH] [ARM] PR68532: Fix VUZP and VZIP recognition on big endian

From: Charles Baylis <charles dot baylis at linaro dot org>
To: Kyrylo Tkachov <kyrylo dot tkachov at arm dot com>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, Michael Collison <michael dot collison at linaro dot org>
Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Mon, 1 Feb 2016 15:45:35 +0000
Subject: Re: [PATCH] [ARM] PR68532: Fix VUZP and VZIP recognition on big endian
Authentication-results: sourceware.org; auth=none
References: <CADnVucC41t=CvOBz-iKaTxwX8ZKUW6tHh-bncs8T4-pKpZaU1Q at mail dot gmail dot com> <CADnVucCbqb=P_duFawt7NPnSP9k3e2dj8nHUfC5Fm1AYtQ2QJg at mail dot gmail dot com>

ping^2

On 13 January 2016 at 13:37, Charles Baylis <charles.baylis@linaro.org> wrote:
> ping
>
> On 16 December 2015 at 17:44, Charles Baylis <charles.baylis@linaro.org> wrote:
>> Hi
>>
>> This patch addresses incorrect recognition of VEC_PERM_EXPRs as VUZP
>> and VZIP on armeb-* targets. It also fixes the definitions of the
>> vuzpq_* and vzipq_* NEON intrinsics, which pass incorrect lane
>> specifiers to __builtin_shuffle().
>>
>> The problem with arm_neon.h can be seen by temporarily altering
>> arm_expand_vec_perm_const_1() to unconditionally return false. If this
>> is done, the vuzp/vzip tests in the advsimd execution tests will fail.
>> With these patches, this is no longer the case.
>>
>> The problem is caused by the weird mapping of architectural lane order
>> to gcc lane order in big endian. For 64-bit vectors, the order is
>> simply reversed, but 128-bit vectors are treated as two 64-bit vectors
>> where the lane ordering is reversed inside each of them. This is due
>> to the memory ordering defined by the EABI. There is a large comment in
>> gcc/config/arm/arm.c above output_move_neon() which describes this in
>> more detail.
>>
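The lane mapping described above can be sketched in plain C. This is an illustration under stated assumptions, not the code from the patch: be_lane_map() and its parameters are hypothetical names, and the function only models the rule as worded in the message (full reversal for 64-bit vectors, reversal inside each 64-bit half for 128-bit vectors).

```c
#include <assert.h>

/* Hypothetical helper, not the patch's code: map an architectural lane
   number to the lane number GCC uses on a big-endian ARM target.
   nelems is the element count; vec_bytes is 8 (D register) or 16 (Q).  */
static int
be_lane_map (int nelems, int vec_bytes, int lane)
{
  assert (vec_bytes == 8 || vec_bytes == 16);
  assert (lane >= 0 && lane < nelems);

  if (vec_bytes == 8)
    /* 64-bit vector: the lane order is simply reversed.  */
    return nelems - 1 - lane;

  /* 128-bit vector: the two 64-bit halves keep their position and the
     lanes are reversed inside each half.  */
  int half = nelems / 2;
  int base = (lane / half) * half;
  return base + (half - 1 - lane % half);
}
```

For four 32-bit lanes in a 128-bit vector this gives 0->1, 1->0, 2->3, 3->2, whereas a plain reversal would give 0->3, 1->2, 2->1, 3->0.
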
>> The arm_evpc_neon_vuzp() and arm_evpc_neon_vzip() functions do not
>> allow for this lane order, instead treating the lane order as simply
>> reversed in 128-bit vectors. These patches fix this. I have included a
>> test case for vuzp, but I don't have one for vzip.
>>
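To see why the lane order matters for recognising an unzip, the selector can be derived from the half-reversed lane model. The sketch below is an assumption-laden illustration, not the logic of arm_evpc_neon_vuzp(): all three function names are hypothetical, it handles 128-bit vectors only, and it ignores any other big-endian adjustments the real code may make (operand order, for example).

```c
#include <assert.h>

/* Assumed model: architectural lane -> GCC lane on big endian for a
   128-bit vector of nelems lanes (lanes reversed inside each half).  */
static int
q_lane_map (int nelems, int lane)
{
  int half = nelems / 2;
  assert (lane >= 0 && lane < nelems);
  return (lane / half) * half + (half - 1 - lane % half);
}

/* The same mapping applied to an index into the concatenation of two
   vectors: the vector number is kept, the lane within it is mapped.  */
static int
q_pair_lane_map (int nelems, int idx)
{
  return (idx / nelems) * nelems + q_lane_map (nelems, idx % nelems);
}

/* Fill sel[0..nelems-1] with the selector, in GCC lane numbering, that
   picks the architecturally even (odd == 0) or odd (odd == 1) lanes of
   the two-vector concatenation.  */
static void
be_unzip_selector (int nelems, int odd, int *sel)
{
  for (int g = 0; g < nelems; g++)
    {
      /* The half-reversal is its own inverse, so the same function
         converts a GCC lane back to an architectural lane.  */
      int arch_out = q_lane_map (nelems, g);
      int arch_in = 2 * arch_out + odd;
      sel[g] = q_pair_lane_map (nelems, arch_in);
    }
}
```

Under this model the even-lane selector for four 32-bit lanes is {3, 1, 7, 5} and the odd-lane selector is {2, 0, 6, 4}, instead of the little-endian {0, 2, 4, 6} and {1, 3, 5, 7}.
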
>> Tested with make check on arm-unknown-linux-gnueabihf with no regressions.
>> Tested with make check on armeb-unknown-linux-gnueabihf. Some
>> gcc.dg/vect tests fail due to no longer being vectorized. I haven't
>> analysed these, but it is expected since vuzp is not usable for the
>> shuffle patterns for which it was previously used. There are also a
>> few new PASSes.
>>
>>
>> Patch 1 (vuzp):
>>
>> gcc/ChangeLog:
>>
>> 2015-12-15  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * config/arm/arm.c (arm_neon_endian_lane_map): New function.
>>         (arm_neon_vector_pair_endian_lane_map): New function.
>>         (arm_evpc_neon_vuzp): Allow for big endian lane order.
>>         * config/arm/arm_neon.h (vuzpq_s8): Adjust shuffle patterns for big
>>         endian.
>>         (vuzpq_s16): Likewise.
>>         (vuzpq_s32): Likewise.
>>         (vuzpq_f32): Likewise.
>>         (vuzpq_u8): Likewise.
>>         (vuzpq_u16): Likewise.
>>         (vuzpq_u32): Likewise.
>>         (vuzpq_p8): Likewise.
>>         (vuzpq_p16): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2015-12-15  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * gcc.c-torture/execute/pr68532.c: New test.
>>
>>
>> Patch 2 (vzip):
>>
>> gcc/ChangeLog:
>>
>> 2015-12-15  Charles Baylis  <charles.baylis@linaro.org>
>>
>>         * config/arm/arm.c (arm_evpc_neon_vzip): Allow for big endian lane
>>         order.
>>         * config/arm/arm_neon.h (vzipq_s8): Adjust shuffle patterns for big
>>         endian.
>>         (vzipq_s16): Likewise.
>>         (vzipq_s32): Likewise.
>>         (vzipq_f32): Likewise.
>>         (vzipq_u8): Likewise.
>>         (vzipq_u16): Likewise.
>>         (vzipq_u32): Likewise.
>>         (vzipq_p8): Likewise.
>>         (vzipq_p16): Likewise.
