This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[Patch 0/4, AArch64] Conform vector implementation to ABI.



Hi,

This patch series fixes the AArch64 autovectorization and GCC front-end vector extension programming models so that they conform to the ABI. The ABI states:

"Elements in a short vector are numbered such that the lowest numbered element (element 0) occupies the lowest numbered bit (bit zero) in the vector and successive elements take on progressively increasing bit positions in the vector. When a short vector is transferred between registers and memory it is treated as an opaque object. That is a short vector is stored in memory as if it were stored with a single STR of the entire register; a short vector is loaded from memory using the corresponding LDR instruction. On a little-endian system this means that element 0 will always contain the lowest addressed element of a short vector; on a big-endian system element 0 will contain the highest-addressed element of a short vector."

To conform to the ABI, this patch changes vector mode loads to LDR D/Q and vector mode stores to STR D/Q. This means that on big-endian the order of the elements in a register is reversed relative to their order in memory. This incidentally mirrors the way GCC represents vectors in RTL, so standard pattern names and RTL lane numbers become easy to interpret during expansion, because the order is the same as in the NEON register. The data-flow seems to fall out quite easily. For example, the widening standard patterns in RTL expect the high and low parts to be reversed for big-endian, and because we use LDR Q, the low and high parts of vectors are already reversed, so we don't have to jump through hoops to fix this up. In contrast, the narrowing operations do need the reversal, as will be evident in the patches that follow. Similarly, the reduc_* patterns expect the scalar result in the LSB of the RTL register, which is the (n-1)th lane on big-endian, and that is the same lane once we conform to the ABI.
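
As a rough illustration of the element ordering described above, here is a small byte-level model in C. It is only a sketch and not code from the patches: the helper names (store_array, ldr_q, read_lane, lane_after_load) are made up for this example, and the "register" is simply an array in which byte k holds bits 8k+7..8k. It assumes a 128-bit vector of four 32-bit elements.

```c
#include <stdint.h>

enum { VEC_BYTES = 16, LANES = 4, LANE_BYTES = 4 };

/* Write four 32-bit array elements to simulated memory, element 0 at the
   lowest address, using the byte order of the chosen endianness.  */
static void store_array(uint8_t mem[VEC_BYTES], const uint32_t vals[LANES],
                        int big_endian)
{
    for (int j = 0; j < LANES; j++)
        for (int b = 0; b < LANE_BYTES; b++) {
            /* b == 0 is the least significant byte of the element.  */
            int pos = big_endian ? LANE_BYTES - 1 - b : b;
            mem[j * LANE_BYTES + pos] = (uint8_t)(vals[j] >> (8 * b));
        }
}

/* Model of a whole-register load (LDR Q): the 16 bytes are read as one
   128-bit value.  reg[k] holds bits 8k+7..8k of the register.  */
static void ldr_q(uint8_t reg[VEC_BYTES], const uint8_t mem[VEC_BYTES],
                  int big_endian)
{
    for (int k = 0; k < VEC_BYTES; k++)
        reg[k] = mem[big_endian ? VEC_BYTES - 1 - k : k];
}

/* Read lane `lane`, where lane 0 occupies the lowest-numbered bits.  */
static uint32_t read_lane(const uint8_t reg[VEC_BYTES], int lane)
{
    uint32_t v = 0;
    for (int b = 0; b < LANE_BYTES; b++)
        v |= (uint32_t)reg[lane * LANE_BYTES + b] << (8 * b);
    return v;
}

/* Convenience wrapper: store the array, load it as a whole register,
   and return the requested lane.  */
static uint32_t lane_after_load(const uint32_t vals[LANES], int lane,
                                int big_endian)
{
    uint8_t mem[VEC_BYTES], reg[VEC_BYTES];
    store_array(mem, vals, big_endian);
    ldr_q(reg, mem, big_endian);
    return read_lane(reg, lane);
}
```

In this model, little-endian lane i holds memory element i, while big-endian lane i holds memory element (LANES - 1 - i), with each element's value intact. That matches the ABI wording that element 0 contains the highest-addressed element on a big-endian system.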

In a series of 4 patches we fix ABI conformance and some of the bugs that fall out of that change.

This set, however, does not fix up the model for NEON intrinsics, which maps vld1_* to mov<mode>, or its associated lane accesses; that is coming soon.

Thanks,
Tejas Belagod.
ARM.

