This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[Patch 0/4, AArch64] Conform vector implementation to ABI.
- From: Tejas Belagod <tbelagod at arm dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 21 Nov 2013 13:43:08 +0000
- Subject: [Patch 0/4, AArch64] Conform vector implementation to ABI.
Hi,
This patch series fixes AArch64's autovectorization and GCC front-end vector
extension programming models for ABI conformance. The ABI states:
"Elements in a short vector are numbered such that the lowest numbered element
(element 0) occupies the lowest numbered bit (bit zero) in the vector and
successive elements take on progressively increasing bit positions in the
vector. When a short vector is transferred between registers and memory it is
treated as an opaque object. That is a short vector is stored in memory as if it
were stored with a single STR of the entire register; a short vector is loaded
from memory using the corresponding LDR instruction. On a little-endian system
this means that element 0 will always contain the lowest addressed element of a
short vector; on a big-endian system element 0 will contain the highest-addressed
element of a short vector."
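The numbering rule quoted above can be sketched in portable C. This is only a
model written for illustration, not code from the patch series: the names
dreg_t, store_array, ldr_d and element are hypothetical, the D register is
modelled as a plain 64-bit integer, and the byte order is passed in as a flag
rather than taken from the host.

```c
#include <assert.h>
#include <stdint.h>

/* Model a 64-bit D register as an integer; bit 0 is the least significant bit. */
typedef uint64_t dreg_t;

/* Lay out a 4 x 16-bit array in memory using the target's byte order. */
static void store_array(uint8_t mem[8], const uint16_t a[4], int big_endian)
{
    for (int k = 0; k < 4; k++) {
        uint8_t lo = (uint8_t)(a[k] & 0xff);
        uint8_t hi = (uint8_t)(a[k] >> 8);
        mem[2 * k]     = big_endian ? hi : lo;
        mem[2 * k + 1] = big_endian ? lo : hi;
    }
}

/* Model "LDR Dn, [mem]": all 8 bytes are read as one opaque 64-bit value. */
static dreg_t ldr_d(const uint8_t mem[8], int big_endian)
{
    dreg_t r = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        r |= (dreg_t)mem[i] << shift;
    }
    return r;
}

/* ABI numbering: element n occupies bits [16n+15 : 16n] of the register. */
static uint16_t element(dreg_t r, int n)
{
    assert(n >= 0 && n < 4);
    return (uint16_t)(r >> (16 * n));
}
```

With the array {0x0102, 0x0304, 0x0506, 0x0708}, the little-endian model gives
element 0 == 0x0102 (the lowest-addressed array entry), while the big-endian
model gives element 0 == 0x0708 (the highest-addressed entry), which matches
the last sentence of the ABI text.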
To conform to the ABI, this patch fixes vector mode loads to be LDR D/Q and
stores to be STR D/Q. This means that the order of the elements in a register is
reversed for big-endian when loaded from memory. This incidentally mirrors the
way GCC implements its vectors in RTL, so it becomes easy to interpret standard
pattern names and RTL lane numbers during expansion, as the order is the same as
in the NEON register. The data-flow seems to fall out quite easily. For example,
the widening standard patterns in RTL expect the high and low parts to be
reversed for big-endian, and because we use LDR Q, the low and high parts of
vectors are already reversed - we don't have to jump through hoops to fix this
up. In contrast, the narrowing operations need the reversing, as will be evident
in the patches that follow. Similarly, the reduc_* patterns expect the scalar
result in the LSB of the RTL register, which is the (n-1)th lane in big-endian,
and that is the same lane when we conform to the ABI.
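As a rough illustration of the lane reversal described above, the mapping from
a memory-order lane index to the architectural lane number can be written as a
one-line helper. The name arch_lane and its parameters are hypothetical and are
not taken from the patches; the sketch only encodes the reading that lane i in
memory order lands in architectural lane n-1-i on big-endian.

```c
#include <assert.h>

/* Hypothetical helper: map a memory-order lane index to the architectural
   lane number of an nunits-element vector loaded with LDR D/Q.  On
   little-endian the two numberings agree; on big-endian they are reversed. */
static int arch_lane(int nunits, int mem_lane, int big_endian)
{
    assert(nunits > 0 && mem_lane >= 0 && mem_lane < nunits);
    return big_endian ? nunits - 1 - mem_lane : mem_lane;
}
```

Under this assumption, for a 4-element vector on big-endian, memory-order lane
3 (the (n-1)th lane) maps to architectural lane 0, i.e. the least significant
bits of the register, which is where the reduc_* result is expected.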
In a series of 4 patches we fix ABI conformance and fix some bugs that fall out
from that.
This set, however, does not fix up the model for NEON intrinsics that maps
vld1_* to mov<mode> and its associated lane accesses - this is coming soon.
Thanks,
Tejas Belagod.
ARM.