This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


# RE: How to avoid auto-vectorization for this loop (rolls at most 3 times)

• From: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>
• To: Ian Bolton <Ian dot Bolton at arm dot com>
• Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
• Date: Thu, 9 Sep 2010 18:09:23 -0500
• Subject: RE: How to avoid auto-vectorization for this loop (rolls at most 3 times)
• References: <D4C76825A6780047854A11E93CDE84D05B04FB5C@SAUSEXMBP01.amd.com>,<680044E4997F5343A2C58032DDD099161733F9@ZIPPY.Emea.Arm.com>

```
>> It seems the auto-vectorizer could not recognize that this loop will
>> roll at most 3 times.
>> And it will generate quite messy code.
>>
>> int a[1024], b[1024];
>> void foo (int n)
>> {
>>   int i;
>>   for (i = (n/4)*4; i< n; i++)
>>     a[i] =  a[i] +  b[i];
>> }
>>
>> How can we correctly estimate the number of iterations for this case
>> and use this info for the vectorizer?

>Does it recognise it if you rewrite the loop as follows:

>for (i = n&~0x3; i< n; i++)
>    a[i] =  a[i] +  b[i];

No, it does not.

But it is OK for the following case:

for (i = n-3; i < n; i++)
    a[i] = a[i] + b[i];

It seems the trip-count estimation fails in the "unknown but small" case,
while a known constant count (exactly 3 above) is handled. Anyway, this
mostly affects compilation time and code size, and has limited impact on
performance.

For the loop

for (i = n&~0x3; i < n; i++)
    a[i] = a[i] + b[i];

the attached foo-O3-no-tree-vectorize.s is what we expect from the optimizer.

Thanks,

Changpeng

```
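As a rough illustration of the "unknown but small" case discussed in the message, the sketch below puts the original loop next to a rewrite whose trip count is written as `n & 3`, so the bound of 3 is visible directly in the loop condition. The names `tail_trip_count` and `foo_bounded` are hypothetical and do not come from the thread, and the code assumes `0 <= n <= 1024`. Whether a given GCC version uses this bound when deciding to vectorize is not verified here; the sketch only shows that both forms do the same work.

```c
#include <assert.h>

#define N 1024

int a[N], b[N];

/* Hypothetical helper: iteration count of the original loop
   for (i = (n/4)*4; i < n; i++). For n >= 0 this equals n % 4,
   so it is never larger than 3. */
int tail_trip_count(int n)
{
    assert(n >= 0 && n <= N);
    return n - (n / 4) * 4;
}

/* Original form from the thread: the start index depends on n,
   so the trip count is small but not a compile-time constant. */
void foo(int n)
{
    int i;
    assert(n >= 0 && n <= N);
    for (i = (n / 4) * 4; i < n; i++)
        a[i] = a[i] + b[i];
}

/* Hypothetical rewrite: same elements are updated, but the loop
   counter runs from 0 to (n & 3), which is in 0..3 for n >= 0. */
void foo_bounded(int n)
{
    int start, rem, k;
    assert(n >= 0 && n <= N);
    start = (n / 4) * 4;
    rem = n & 3;
    for (k = 0; k < rem; k++)
        a[start + k] = a[start + k] + b[start + k];
}
```

Note that the `n-3` variant from the message is not equivalent to either function: it always runs exactly three iterations, whereas these run between zero and three.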

Attachment: foo-O3-no-tree-vectorize.s
Description: foo-O3-no-tree-vectorize.s

Attachment: foo-O3.s
Description: foo-O3.s
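The first attachment name suggests a build with `-O3 -fno-tree-vectorize`. If only this one function should skip the vectorizer, one possible approach is GCC's `optimize` function attribute, sketched below. This is an assumption about how one might work around the issue, not something proposed in the thread. The macro `NO_VECTORIZE` and the function name `foo_tail` are hypothetical, and the GCC manual describes the `optimize` attribute as intended for debugging rather than production code.

```c
#include <assert.h>

#define N 1024

int a[N], b[N];

/* Hypothetical macro: expands to the GCC attribute only when the
   compiler is GCC; other compilers get an empty definition. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_VECTORIZE __attribute__((optimize("no-tree-vectorize")))
#else
#define NO_VECTORIZE
#endif

/* Same loop as in the message (start rounded down to a multiple
   of 4), with the tree vectorizer disabled for this function only. */
NO_VECTORIZE
void foo_tail(int n)
{
    int i;
    assert(n >= 0 && n <= N);
    for (i = n & ~0x3; i < n; i++)
        a[i] = a[i] + b[i];
}
```

The rest of the translation unit can then still be built with plain `-O3`.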
