This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
- From: Paul Brook <paul at codesourcery dot com>
- To: Julian Brown <julian at codesourcery dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, <gcc-patches at gcc dot gnu dot org>, Ramana Radhakrishnan <ramrad01 at arm dot com>, Richard Earnshaw <rearnsha at arm dot com>
- Date: Fri, 1 Mar 2013 14:35:05 +0000
- Subject: Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
- References: <20130227172947.31fa279c@octopus> <CAFiYyc3JOi-7jVtWXgvdKeQG-WrV5=kvSPrgSzMOoKM0aw2oSQ@mail.gmail.com> <20130301120229.60d35679@octopus>
> > Do I understand correctly that the "only" issue is memory vs. register
> > element ordering? Thus a fixup could be as simple as extra shuffles
> > inserted after vector memory loads and before vector memory stores?
> > (with the hope of RTL optimizers optimizing those)?
> It's not even necessary to use explicit shuffles -- NEON has perfectly
> good instructions for loading/storing vectors in the "right" order, in
> the form of vld1 & vst1. I'm afraid the solution to this problem might
> have been staring us in the face for years, which is simply to forbid
> vldr/vstr/vldm/vstm (the instructions which lead to weird element
> permutations in BE mode) for loading/storing NEON vectors altogether.
> That way the vectorizer gets what it wants, the intrinsics can continue
> to use __builtin_shuffle exactly as they are doing, and we get to
> remove all the bits which fiddle vector element numbering in BE mode in
> the ARM backend.
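The permutation mentioned above can be illustrated with a small host-side model. This is only a sketch under simplifying assumptions: a D register is modelled as a uint64_t with 32-bit lane i at bits [32*i, 32*i+31], vld1.32 is modelled as an element-by-element load, and vldr in BE mode is modelled as a single 64-bit big-endian load. The helper names (model_vld1_32, model_vldr, lane32) are hypothetical and are not GCC or ARM APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from memory. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 64-bit big-endian value from memory. */
static uint64_t load_be64(const uint8_t *p)
{
    return ((uint64_t)load_be32(p) << 32) | (uint64_t)load_be32(p + 4);
}

/* Assumed model of "vld1.32 {d0}, [p]": memory element i lands in lane i. */
static uint64_t model_vld1_32(const uint8_t *p)
{
    return (uint64_t)load_be32(p) | ((uint64_t)load_be32(p + 4) << 32);
}

/* Assumed model of "vldr d0, [p]" in BE mode: the doubleword is treated
   as one 64-bit big-endian quantity, so the first element in memory
   ends up in the most significant lane. */
static uint64_t model_vldr(const uint8_t *p)
{
    return load_be64(p);
}

/* Extract 32-bit lane i (0 or 1) from the modelled D register. */
static uint32_t lane32(uint64_t d, int i)
{
    return (uint32_t)(d >> (32 * i));
}

/* Self-check: two 32-bit elements, 1 then 2, stored big-endian. */
static void check_lane_order(void)
{
    const uint8_t mem[8] = { 0, 0, 0, 1,  0, 0, 0, 2 };

    /* vld1.32 keeps memory order: lane 0 holds element 0. */
    assert(lane32(model_vld1_32(mem), 0) == 1);
    assert(lane32(model_vld1_32(mem), 1) == 2);

    /* vldr swaps the two elements within the D register. */
    assert(lane32(model_vldr(mem), 0) == 2);
    assert(lane32(model_vldr(mem), 1) == 1);
}
```

Under this model a Q register filled by vldm (two consecutive D registers) would see memory elements 0,1,2,3 arrive in lanes as 1,0,3,2, which is the kind of permutation the vectorizer does not expect.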
> I can't exactly remember why we didn't do that to start with. I think
> the problem was ABI-related, or to do with transferring NEON vectors
> to/from ARM registers when it was necessary to do that... I'm planning
> to do some archaeology to try to see if I can figure out a definitive
The ABI defined vector types (uint32x4_t etc) are defined to be in vldm/vstm