[PATCH, ARM] Misaligned access support for ARM Neon

Mon Jun 7 19:09:00 GMT 2010

On Fri, 04 Jun 2010 17:26:17 +0100
Richard Earnshaw <rearnsha@arm.com> wrote:

> 
> On Fri, 2010-06-04 at 13:50 +0100, Julian Brown wrote:
> > On Tue, 18 May 2010 01:31:08 +0100
> > Julian Brown <julian@codesourcery.com> wrote:
> > 
> > > Hi,
> > > 
> > > On Mon, 21 Dec 2009 12:20:12 +0000
> > > Paul Brook <paul@codesourcery.com> wrote:
> > > 
> > > > On Friday 18 December 2009, Julian Brown wrote:
> > > > > This is a version of the patch which doesn't attempt to
> > > > > resolve the discrepancy between vector copies and vectorizing
> > > > > loads/stores (thus is only intended to work in little-endian
> > > > > mode, leaving big-endian mode as an open problem). So,
> > > > > vldr/vstr etc. will still be used for aligned accesses, and
> > > > > any issues with adding semantics to movmisalign<mode> are
> > > > > sidestepped.
> > > > 
> > > > I don't think this is correct. The original patch contained two
> > > > hooks: [snip]
> > > > - Add movmisalign. Either ignore the fact that packed structures
> > > > > break, or add yet another hook for "misaligned vectors must be
> > > > at least {-this-} aligned". This will not work for big-endian
> > > > vectors, and will go away once we implement array load support.
> > > 
> > > This is a new version of the patch, which adds movmisalign
> > > patterns for little-endian NEON, and uses a new (since the last
> > > version of the patch was posted) target hook
> > > (TARGET_SUPPORT_VECTOR_MISALIGNMENT) to describe the alignments
> > > supported by NEON.
> > 
> > Ping (ARM maintainers)?
> > 
> > Julian
> 
> I've no particular objection to this patch, but I can't help feeling
> it's not really addressing the fundamental problem.
> 
> I think the problem we're really trying to fix is GCC's built-in
> assumption about the mapping of vectors to registers (i.e. the order of
> the lanes -- Joseph alludes to this in one of his posts on the thread)
> and that fundamentally most of this is trying to paper over that
> built-in assumption (it's a bit like trying to make big-endian look
> like little-endian, or perhaps more accurately
> WORDS_BIG_ENDIAN+LITTLE_ENDIAN look like a pure big or little-endian
> machine).

This patch actually doesn't try to do anything to address the mapping
between the memory representation of vectors and element numberings --
it just allows misaligned accesses to be used (using element
load/store instructions), albeit only in little-endian mode. I believe
this makes the autovectoriser much more useful for real-world code (i.e.
able to trigger in many more cases, and/or able to produce better
output).
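
As a minimal sketch of the sort of loop this affects (the function name
add_i32 is hypothetical and the code is not taken from the patch): the
pointers below are only known to be element-aligned, so a caller may
pass addresses that are not aligned to the 16-byte vector size, and the
three pointers may each be misaligned by a different amount. Peeling
the loop can then align at most one of them, which is where misaligned
vector loads/stores help.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: element-wise add of two int32 arrays.
   Nothing here tells the compiler that dst, a or b is aligned to the
   vector size, only to the natural alignment of int32_t.  */
void add_i32(int32_t *restrict dst, const int32_t *restrict a,
             const int32_t *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Calling this as add_i32 (d + 1, a + 1, b + 3, n) gives the two inputs
different misalignments relative to a 16-byte boundary.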

Yes, there's still an assumption that elements from increasing memory
locations go in increasing lane numbers (which is only true in
little-endian mode for NEON at present), but I don't think this patch
makes things any worse. Fixing big-endian mode is another problem for
another day :-).
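
To make the lane-numbering assumption concrete, here is a small sketch
using GCC's generic vector extension (the names v4si_t and
lane_after_load are made up for illustration, and this is not code from
the patch). At the C level, subscripting a generic vector numbers the
elements in memory order, so element i of the vector is the i-th value
copied from memory. The open question for big-endian NEON is how that
numbering relates to the hardware lane numbers, which this sketch does
not show.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical typedef: four 32-bit lanes via GCC's vector extension. */
typedef int32_t v4si_t __attribute__((vector_size(16)));

/* Copy 16 bytes from a possibly misaligned address into a vector and
   return element i (0..3).  memcpy is used so that the access is valid
   whatever the alignment of p.  */
int32_t lane_after_load(const void *p, int i)
{
    v4si_t v;
    memcpy(&v, p, sizeof v);
    return v[i];
}
```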

Cheers,

Julian