Vectorize stores with unknown stride
Alan Lawrence
alan.lawrence@arm.com
Thu May 7 17:13:00 GMT 2015
NP, and sorry for the spurious comments, hadn't spotted you were using nunits. I
like the testcase, thanks :).
A.
Michael Matz wrote:
> On Thu, 7 May 2015, Alan Lawrence wrote:
>
>> Also update comment? (5 identical cases)
>>
>> Also update comment?
>
> Obviously a good idea, thanks :) (s/loads/accesses/ for the commit)
>
>>> @@ -5013,7 +5025,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator
>>> *gsi, gimple *vec_stmt,
>>> tree dataref_ptr = NULL_TREE;
>>> tree dataref_offset = NULL_TREE;
>>> gimple ptr_incr = NULL;
>>> - int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>>> + unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>> Unrelated? (though I don't object to it!)
>
> No, the patch introduces for a first time in this function a loop over
> unsigned i with nunits as bound, so the compare would warn if I weren't to
> change either the type of i (with a cascade of more such changes) or
> nuinits (with only this one place). The whole file could use an overhaul
> of signedness (after all nunits in all functions will always be
> non-negative), but as a separate patch.
>
>>> + if (STMT_VINFO_STRIDED_P (stmt_info))
>>> + ;
>>> + else
>> This is not a long chain of ifs, so is there a reason not to have
>> if (!STMT_VINFO_STRIDED_P (stmt_info))
>> ?
>
> Err, right. Probably I had some code in there initially; as now it looks
> a bit mannered :)
>
>>> -#define STMT_VINFO_STRIDE_LOAD_P(S) (S)->stride_load_p
>>> +#define STMT_VINFO_STRIDED_P(S) (S)->strided_p
>> Spacing (?)
>
> No, it's using correct tabs, indentation by the diff '+' prefix deceives
> us. (Your mailer must be playing games, I've sent it out with TABs
> intact).
>
>>> + float src[] = {1, 2, 3, 4, 5, 6, 7, 8};
>>> + float dest[8];
>>> + check_vect ();
>>> + sumit (dest, src, src, 1, 8);
>>> + for (i = 0; i < 8; i++)
>>> + if (2*src[i] != dest[i])
>>> + abort ();
>>> + return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>>> +/* { dg-final { cleanup-tree-dump "vect" } } */
>> While I appreciate the vectorizer doesn't know the invariant stride is going
>> to be '1' when the loop executes...IMVHO I'd feel more reassured if we passed
>> in stride>1 at runtime ;). In fact I'd feel more reassured still, if we did
>> the thing with a few different values of stride...
>
> Sensible. I'll use the test case below (hey! even stride zero! :) )
>
> Regstrapping again with the changes as per above. Committing tomorrow.
> Many thanks for the review.
>
>
> Ciao,
> Michael.
> /* { dg-require-effective-target vect_float } */
>
> #include <stdarg.h>
> #include "tree-vect.h"
>
> void __attribute__((noinline))
> sumit (float * __restrict dest,
> float * __restrict src, float * __restrict src2,
> int stride, int n)
> {
> int i;
> for (i = 0; i < n; i++)
> dest[i*stride] = src[i] + src2[i];
> }
>
> int main()
> {
> int i, stride;
> float src[] = {1, 2, 3, 4, 5, 6, 7, 8};
> float dest[64];
> check_vect ();
> for (stride = 0; stride < 8; stride++)
> {
> sumit (dest, src, src, stride, 8);
> if (!stride && dest[0] != 16)
> abort();
> else if (stride)
> for (i = 0; i < 8; i++)
> if (2*src[i] != dest[i*stride])
> abort ();
> }
> return 0;
> }
>
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> /* { dg-final { cleanup-tree-dump "vect" } } */
>
More information about the Gcc-patches
mailing list