Vectorize stores with unknown stride

Alan Lawrence alan.lawrence@arm.com
Thu May 7 17:13:00 GMT 2015


NP, and sorry for the spurious comments, hadn't spotted you were using nunits. I 
like the testcase, thanks :).

A.

Michael Matz wrote:
> On Thu, 7 May 2015, Alan Lawrence wrote:
> 
>> Also update comment? (5 identical cases)
>>
>> Also update comment?
> 
> Obviously a good idea, thanks :)  (s/loads/accesses/ for the commit)
> 
>>> @@ -5013,7 +5025,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator
>>> *gsi, gimple *vec_stmt,
>>>    tree dataref_ptr = NULL_TREE;
>>>    tree dataref_offset = NULL_TREE;
>>>    gimple ptr_incr = NULL;
>>> -  int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>>> +  unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
>> Unrelated? (though I don't object to it!)
> 
> No, the patch introduces for a first time in this function a loop over 
> unsigned i with nunits as bound, so the compare would warn if I weren't to 
> change either the type of i (with a cascade of more such changes) or 
> nuinits (with only this one place).  The whole file could use an overhaul 
> of signedness (after all nunits in all functions will always be 
> non-negative), but as a separate patch.
> 
>>> +  if (STMT_VINFO_STRIDED_P (stmt_info))
>>> +    ;
>>> +  else
>> This is not a long chain of ifs, so is there a reason not to have
>> if (!STMT_VINFO_STRIDED_P (stmt_info))
>> ?
> 
> Err, right.  Probably I had some code in there initially; as now it looks 
> a bit mannered :)
> 
>>> -#define STMT_VINFO_STRIDE_LOAD_P(S)       (S)->stride_load_p
>>> +#define STMT_VINFO_STRIDED_P(S)                   (S)->strided_p
>> Spacing (?)
> 
> No, it's using correct tabs, indentation by the diff '+' prefix deceives 
> us.  (Your mailer must be playing games, I've sent it out with TABs 
> intact).
> 
>>> +  float src[] = {1, 2, 3, 4, 5, 6, 7, 8};
>>> +  float dest[8];
>>> +  check_vect ();
>>> +  sumit (dest, src, src, 1, 8);
>>> +  for (i = 0; i < 8; i++)
>>> +    if (2*src[i] != dest[i])
>>> +      abort ();
>>> +  return 0;
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>>> +/* { dg-final { cleanup-tree-dump "vect" } } */
>> While I appreciate the vectorizer doesn't know the invariant stride is going
>> to be '1' when the loop executes...IMVHO I'd feel more reassured if we passed
>> in stride>1 at runtime ;). In fact I'd feel more reassured still, if we did
>> the thing with a few different values of stride...
> 
> Sensible.  I'll use the test case below (hey!  even stride zero! :) )
> 
> Regstrapping again with the changes as per above.  Committing tomorrow.  
> Many thanks for the review.
> 
> 
> Ciao,
> Michael.
> /* { dg-require-effective-target vect_float } */
> 
> #include <stdarg.h>
> #include "tree-vect.h"
> 
> void __attribute__((noinline))
> sumit (float * __restrict dest,
>        float * __restrict src, float * __restrict src2,
>        int stride, int n)
> {
>   int i;
>   for (i = 0; i < n; i++)
>     dest[i*stride] = src[i] + src2[i];
> }
> 
> int main()
> {
>   int i, stride;
>   float src[] = {1, 2, 3, 4, 5, 6, 7, 8};
>   float dest[64];
>   check_vect ();
>   for (stride = 0; stride < 8; stride++)
>     {
>       sumit (dest, src, src, stride, 8);
>       if (!stride && dest[0] != 16)
> 	abort();
>       else if (stride)
> 	for (i = 0; i < 8; i++)
> 	  if (2*src[i] != dest[i*stride])
> 	    abort ();
>     }
>   return 0;
> }
> 
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> /* { dg-final { cleanup-tree-dump "vect" } } */
> 



More information about the Gcc-patches mailing list