Auto-vectorizer and (mis-)alignment support assumptions

Richard Biener richard.guenther@gmail.com
Thu Sep 12 09:25:00 GMT 2013


On Thu, Sep 12, 2013 at 10:40 AM, Frederic Riss <frederic.riss@gmail.com> wrote:
> Hello,
>
> I have coded SIMD support for my target, but I'm hitting some issues
> related to alignment. I can't find any documentation or comment
> describing the assumptions that the vectorizer makes about target
> support. If it exists, please just point me at it (I'm based on GCC 4.7
> for now).
>
> On my target, vectors need to be aligned on their full size. For
> example, V2SI vectors need to be aligned on 8-byte boundaries.
>
> Now take this simple code:
>
> unsigned long foo (unsigned long *input)
> {
>         unsigned long i, res;
>         unsigned long data[80];
>
>         memcpy (data, input, sizeof(data));
>         for (i=16; i<80; i++)
>                 data[i] = data[i-3];
>
>         return data[79];
> }
>
> In this incarnation, whatever target hooks I implement to tell the
> auto-vectorizer that my V2SI vectors need 8 byte alignment (and that
> it doesn't support misalignment), it decides that the loop is
> vectorizable and generates:
>
>   vector(2) long unsigned int * vect_pdata.20;
>   vector(2) long unsigned int vect_var_.19;
>   vector(2) long unsigned int * vect_pdata.18;
>   vector(2) long unsigned int * vect_pdata.15;
> [...]
>   vect_var_.19_35 = MEM[(long unsigned int[80] *)vect_pdata.15_33];
>   MEM[(long unsigned int[80] *)vect_pdata.20_37] = vect_var_.19_35;
>   vect_pdata.15_34 = vect_pdata.15_33 + 8;
>   vect_pdata.20_38 = vect_pdata.20_37 + 8;
>
> Where the array references start with the following definition:
>
>   vect_pdata.18_32 = &MEM[(void *)&data + 52B];
>   vect_pdata.23_36 = &MEM[(void *)&data + 64B];
>
> As you see, the two vector pointers can't possibly both be aligned on
> an 8-byte boundary: their offsets (52 and 64) differ by 12 bytes.
>
> Interestingly, if I put the data array as a global variable, then it
> refuses to vectorize because the alignment of the accesses is unknown.
> I suppose that when the array is local it can be forced to whatever
> alignment fits. This doesn't however guarantee that all accesses can
> be forced to the right alignment, because their relative offsets are
> fixed.
>
> Thus I'm wondering if the vectorizer makes the assumption that a
> target must be able to load/store vectors aligned on any element size
> boundary. If it's the case, how can a target determine that a specific
> access might not be fully aligned (and handle it specifically)?
>
> If this assumption is not made, please point me at the code that
> checks the relative alignment of vectorized accesses, so that I can
> find out what my targeting lacks.

You may simply be hitting a bug in the vectorizer.  The vectorizer assumes
it can re-align local decls, and for the above it should use known-misalignment
accesses.  You can check with -fdump-rtl-expand-details-alias what the
MEMs think they are aligned to (and with -fdump-tree-vect-alias what the
vectorizer thinks it created).
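
As an illustration of why re-aligning the local array cannot fix both
accesses at once, here is a minimal sketch of the arithmetic. The byte
offsets (52 and 64) and the 8-byte vector alignment come from the quoted
dump and description; the function names are hypothetical and are not
part of GCC.

```c
#include <stddef.h>
#include <stdbool.h>

/* Misalignment, in bytes, of the address (base + offset) with respect
 * to a required vector alignment `valign` (8 for V2SI in the post). */
size_t misalignment(size_t base, size_t offset, size_t valign)
{
    return (base + offset) % valign;
}

/* Two accesses at fixed byte offsets from the same base can both be
 * aligned for some choice of base only if the offsets are congruent
 * modulo the vector alignment.  Re-aligning the base shifts both
 * addresses by the same amount, so their difference never changes. */
bool can_both_be_aligned(size_t off_a, size_t off_b, size_t valign)
{
    return off_a % valign == off_b % valign;
}
```

With the values from the dump, can_both_be_aligned(52, 64, 8) is false:
a base that aligns the load at offset 52 leaves the store at offset 64
misaligned by 4 bytes, and the reverse holds for a base that aligns the
store. So at least one of the two accesses has to be handled as a
misaligned access, whatever alignment the array itself is given.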

Richard.



> Many thanks,
> Fred


