This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: How to avoid auto-vectorization for this loop (rolls at most 3 times)


>> It seems the auto-vectorizer cannot recognize that this loop will
>> roll at most 3 times, and it generates quite messy code.
>>
>> int a[1024], b[1024];
>> void foo (int n)
>> {
>>   int i;
>>   for (i = (n/4)*4; i< n; i++)
>>     a[i] =  a[i] +  b[i];
>> }
>>
>> How can we correctly estimate the number of iterations for this case
>> and use this info for the vectorizer?

> Does it recognise it if you rewrite the loop as follows:
>
> for (i = n&~0x3; i< n; i++)
>     a[i] =  a[i] +  b[i];

No.

But it does work for the following case:

 for (i = n-3; i< n; i++)
     a[i] =  a[i] +  b[i];

It seems the estimate fails when the iteration count is unknown but small.
In any case, this mostly affects compilation time and code size, and has
limited impact on performance.
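
To make the "at most 3 times" claim concrete, here is a small sketch that counts the iterations of both loop forms. The helper names (tail_trip_count, tail_trip_count_masked, check_tail_bound) are made up for illustration and are not part of the original test case. I assume n >= 0, as in the original loop; for such n both forms start at the same index and run n % 4 times.

```c
#include <assert.h>

/* Hypothetical helper: iteration count of the loop from the original post. */
int tail_trip_count(int n)
{
    int count = 0;
    for (int i = (n / 4) * 4; i < n; i++)
        count++;
    return count;
}

/* Hypothetical helper: iteration count of the suggested masked form. */
int tail_trip_count_masked(int n)
{
    int count = 0;
    for (int i = n & ~0x3; i < n; i++)
        count++;
    return count;
}

/* For n >= 0 both loops run n % 4 times, i.e. never more than 3 times. */
void check_tail_bound(int n)
{
    assert(n >= 0);
    assert(tail_trip_count(n) == n % 4);
    assert(tail_trip_count_masked(n) == n % 4);
    assert(tail_trip_count(n) <= 3);
}
```

This bound is the information the vectorizer would need in order to decide that vectorizing the loop is not worthwhile.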

For the loop

for (i = n&~0x3; i< n; i++)
    a[i] =  a[i] +  b[i];

the attached foo-O3-no-tree-vectorize.s is what we expect from the optimizer.
The code in foo-O3.s is much worse.
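
Until the iteration estimate is fixed, two possible source-level workarounds are sketched below. Both are assumptions on my side and untested against the vectorizer: foo_bounded makes the small trip count explicit as (n & 3), which may or may not be picked up by the iteration analysis, and foo_novec uses GCC's optimize function attribute to switch off the tree vectorizer for one function only (the GCC manual recommends this attribute mainly for debugging). The names foo_bounded, foo_novec and NO_VECTORIZE are made up for illustration. Building the whole file with -O3 -fno-tree-vectorize also avoids the problem, at the cost of disabling vectorization everywhere.

```c
#include <assert.h>

int a[1024], b[1024];

/* GCC-only attribute; expands to nothing on other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_VECTORIZE
#endif

/* Variant 1: count the remainder explicitly so the bound 0..3 is visible. */
void foo_bounded(int n)
{
    assert(n >= 0 && n <= 1024);   /* keep indices inside a[] and b[] */
    int start = n & ~0x3;          /* first index of the tail */
    int rem = n & 0x3;             /* number of tail elements, 0..3 */
    for (int k = 0; k < rem; k++)
        a[start + k] = a[start + k] + b[start + k];
}

/* Variant 2: same loop as in the test case, vectorizer disabled per function. */
NO_VECTORIZE
void foo_novec(int n)
{
    assert(n >= 0 && n <= 1024);
    for (int i = n & ~0x3; i < n; i++)
        a[i] = a[i] + b[i];
}
```

Both variants compute the same result as the original foo for 0 <= n <= 1024.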

Thanks,

Changpeng


 

Attachment: foo-O3-no-tree-vectorize.s

Attachment: foo-O3.s

