This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
- From: Paul Brook <paul at codesourcery dot com>
- To: Julian Brown <julian at codesourcery dot com>
- Cc: Richard Biener <richard dot guenther at gmail dot com>, <gcc-patches at gcc dot gnu dot org>, Ramana Radhakrishnan <ramrad01 at arm dot com>, Richard Earnshaw <rearnsha at arm dot com>
- Date: Mon, 4 Mar 2013 13:08:57 +0000
- Subject: Re: [PATCH, ARM, RFC] Fix vect.exp failures for NEON in big-endian mode
- References: <20130227172947.31fa279c@octopus> <201303011435.08546.paul@codesourcery.com> <20130304115601.6eb5b407@octopus>
> > > I can't exactly remember why we didn't do that to start with. I
> > > think the problem was ABI-related, or to do with transferring NEON
> > > vectors to/from ARM registers when it was necessary to do that...
> > > I'm planning to do some archaeology to try to see if I can figure
> > > out a definitive answer.
> >
> > The ABI-defined vector types (uint32x4_t etc.) are defined to be in
> > vldm/vstm order.
>
> There's no conflict with the ABI-defined vector order -- the ABI
> (looking at AAPCS, IHI 0042D) describes "containerized" vectors which
> should be used to pass and return vector quantities at ABI boundaries,
> but I couldn't find any further restrictions. Internally to a function,
> we are still free to use vld1/vst1 vector ordering. Using
> "containerized"/opaque transfers, the bit pattern of a vector in one
> function (using vld1/vst1 ordering internally) will of course remain
> unchanged when it is passed to another function that uses the same
> ordering internally.
Ah, ok. If you make the ABI-defined types distinct from the GCC generic
vector types (as used by the vectorizer), then in principle that should work.
I agree that current GCC probably does not have the infrastructure to do that,
and some of the vector code plays a bit fast and loose with type
conversions/subregs.
Remember that it's not just function arguments; it's any interface shared
between functions, i.e. including structures and global variables.
> Actually making that work (especially efficiently) with GCC is a
> slightly different matter. Let's call vldm/vstm-ordered vectors
> "containerized" format, and vld1/vst1-ordered vectors "array" format. We
> need to introduce the concept of marshalling vector arguments from
> array format to containerized format when passing them to a function,
> and unmarshalling those vector arguments back the other way on function
> entry. AFAICT, GCC does not have suitable infrastructure for
> implementing such functionality at present: consider that e.g. vectors
> passed by value on the stack should use containerized format, which
> means the called function cannot simply dereference the stack pointer
> to read the vector:
IIRC I/we tried to do something very similar (possibly the other way around)
by abusing the unaligned load mechanism. I don't remember why that failed.
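For reference, the difference between the two formats can be modelled in
portable C. This is only a sketch under stated assumptions, not what GCC or
the hardware literally does: memory is treated as big-endian, a Q register is
modelled as four 32-bit lanes (lane 0 least significant), and the names
qreg_model, load_array_format, load_containerized_format and
swap_lanes_within_d are all hypothetical. For 32-bit elements the marshalling
step then amounts to swapping the two lanes within each 64-bit half, which is
the operation vrev64.32 performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit Q register: four 32-bit lanes,
   lane 0 holding the least significant bits. */
typedef struct { uint32_t lane[4]; } qreg_model;

/* Read one 32-bit value from big-endian memory. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* "Array" format (vld1.32-style): element i in memory lands in lane i. */
static qreg_model load_array_format(const uint8_t mem[16])
{
    qreg_model q;
    for (int i = 0; i < 4; i++)
        q.lane[i] = load_be32(mem + 4 * i);
    return q;
}

/* "Containerized" format (vldm-style): each 64-bit half is loaded as one
   big-endian doubleword, so the word at the lower address ends up in the
   upper lane of that half. */
static qreg_model load_containerized_format(const uint8_t mem[16])
{
    qreg_model q;
    for (int d = 0; d < 2; d++) {
        q.lane[2 * d + 1] = load_be32(mem + 8 * d);
        q.lane[2 * d]     = load_be32(mem + 8 * d + 4);
    }
    return q;
}

/* Marshal between the two formats for 32-bit elements: swap the lanes
   within each 64-bit half.  The same swap converts in both directions. */
static qreg_model swap_lanes_within_d(qreg_model q)
{
    qreg_model r = { { q.lane[1], q.lane[0], q.lane[3], q.lane[2] } };
    return r;
}

/* Self-check: marshalling an array-format load gives the same lanes as a
   containerized-format load of the same memory. */
static void check_marshalling(const uint8_t mem[16])
{
    qreg_model a = swap_lanes_within_d(load_array_format(mem));
    qreg_model c = load_containerized_format(mem);
    for (int i = 0; i < 4; i++)
        assert(a.lane[i] == c.lane[i]);
}
```

In this model a callee that wants array format cannot use a by-value stack
argument as loaded; it has to apply the swap first. The same goes for vectors
reached through structures or global variables.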
Paul