This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [AArch64_be] Fix vtbl[34] and vtbx4

From: James Greenhalgh <james dot greenhalgh at arm dot com>
To: Christophe Lyon <christophe dot lyon at linaro dot org>
Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Date: Mon, 12 Oct 2015 14:30:23 +0100
Subject: Re: [AArch64_be] Fix vtbl[34] and vtbx4
Authentication-results: sourceware.org; auth=none
References: <CAKdteObt_dP63aqn3eH6mHiK5zXP+Y_rL+DfN55D=WfK_4cVGw at mail dot gmail dot com> <20151007150941 dot GA31205 at arm dot com> <CAKdteOYBU7y-z0J5d9ijU+O=DZPkLTPjjiRyhD8ywHoa4K5QPw at mail dot gmail dot com> <20151008091230 dot GA13098 at arm dot com> <CAKdteOawhToG=aw7sYYvHva4EiW46a2EDWiA9hW8GtbAqmNRkQ at mail dot gmail dot com>

On Fri, Oct 09, 2015 at 05:16:05PM +0100, Christophe Lyon wrote:
> On 8 October 2015 at 11:12, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> > On Wed, Oct 07, 2015 at 09:07:30PM +0100, Christophe Lyon wrote:
> >> On 7 October 2015 at 17:09, James Greenhalgh <james.greenhalgh@arm.com> wrote:
> >> > On Tue, Sep 15, 2015 at 05:25:25PM +0100, Christophe Lyon wrote:
> >> >
> >> > Why do we want this for vtbx4 rather than putting out a VTBX instruction
> >> > directly (as in the inline asm versions you replace)?
> >> >
> >> I just followed the pattern used for vtbx3.
> >>
> >> > This sequence does make sense for vtbx3.
> >> In fact, I don't see why vtbx3 and vtbx4 should be different?
> >
> > The difference between TBL and TBX is in their handling of a request to
> > select an out-of-range value. For TBL this returns zero, for TBX this
> > returns the value which was already in the destination register.
> >
> > Because the byte-vectors used by the TBX instruction in AArch64 are 128-bit
> > (so two of them together allow selecting elements in the range 0..31), and
> > vtbx3 needs to emulate the AArch32 behaviour of picking elements from 3x64-bit
> > vectors (allowing elements in the range 0..23), we need to manually check for
> > values which would have been out-of-range on AArch32, but are not out
> > of range for AArch64 and handle them appropriately. For vtbx4 on the other
> > hand, 2x128-bit registers give the range 0..31 and 4x64-bit registers give
> > the range 0..31, so we don't need the special masked handling.
> >
> > You can find the suggested instruction sequences for the Neon intrinsics
> > in this document:
> >
> >   http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf
> >
> 
> Hi James,
> 
> Please find attached an updated version which hopefully addresses your comments.
> Tested on aarch64-none-elf and aarch64_be-none-elf using the Foundation Model.
> 
> OK?

Looks good to me,

Thanks,
James
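
The TBL/TBX distinction quoted above can be modelled one lane at a time in
portable C. This is only a scalar sketch for illustration, not the arm_neon.h
implementation: the helper names (tbl_lane, tbx_lane, vtbx3_lane, vtbx4_lane)
and the flat 32-byte "regs" array standing in for two 128-bit registers are
assumptions made here, and the contents of the 8 padding bytes in the vtbx3
case are deliberately left unspecified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One lane of TBL: an out-of-range index yields zero. */
static inline uint8_t tbl_lane(const uint8_t *table, size_t table_len,
                               uint8_t idx)
{
    return idx < table_len ? table[idx] : 0;
}

/* One lane of TBX: an out-of-range index keeps the destination value. */
static inline uint8_t tbx_lane(uint8_t dest, const uint8_t *table,
                               size_t table_len, uint8_t idx)
{
    return idx < table_len ? table[idx] : dest;
}

/*
 * vtbx3: the AArch32 table is 3x64-bit (valid indices 0..23), but it sits
 * in 2x128-bit registers (32 bytes, the last 8 being padding). Indices
 * 24..31 are in range for the hardware lookup, so they must be masked
 * explicitly to keep the destination value.
 */
static inline uint8_t vtbx3_lane(uint8_t dest, const uint8_t regs[32],
                                 uint8_t idx)
{
    uint8_t looked_up = tbl_lane(regs, 32, idx); /* may read padding */
    return idx < 24 ? looked_up : dest;          /* manual range check */
}

/*
 * vtbx4: 4x64-bit and 2x128-bit both cover indices 0..31, so the TBX
 * behaviour already matches and no extra masking is needed.
 */
static inline uint8_t vtbx4_lane(uint8_t dest, const uint8_t regs[32],
                                 uint8_t idx)
{
    return tbx_lane(dest, regs, 32, idx);
}
```

With a 32-byte table, index 24 returns the destination lane from vtbx3_lane
but a table byte from vtbx4_lane, which is the asymmetry described in the
quoted explanation.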
