Bug 31946

Summary: missed vectorization due to too strict peeling-for-alignment policy
Product: gcc Reporter: Dorit Naishlos <dorit>
Component: tree-optimization Assignee: Not yet assigned to anyone <unassigned>
Status: UNCONFIRMED ---    
Severity: normal CC: fang, gcc-bugs, ramana.radhakrishnan
Priority: P3    
Version: 4.3.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed:
Bug Depends on:    
Bug Blocks: 53947    

Description Dorit Naishlos 2007-05-16 09:50:57 UTC
The vectorizer is too restrictive in how it decides how many iterations to peel from a loop in order to align the memory references in it. It considers only the first (potentially) misaligned store it encounters in the loop. For this reason the testcases vect-multitypes-1.c, vect-multitypes-4.c and vect-iv-4.c don't get vectorized. For example (using a vector size of 16 bytes), in vect-multitypes-1.c we have:

short sa[N], sb[N];
int ia[N], ib[N];  
for (i = 0; i < n; i++) {
      ia[i+3] = ib[i];
      sa[i+3] = sb[i];
}

The current peeling-for-alignment scheme considers only the 'ia[i+3]' access for peeling, and therefore examines only the option of a peeling factor = (4-3)%4 = 1 (a 16-byte vector holds 4 ints). This does not align the access 'sa[i+3]', for which we need to peel 5 iterations. As a result the loop doesn't get vectorized (because we currently can't handle misaligned stores unless we align them by peeling). However, if we had considered the 'sa[i+3]' access for peeling as well, we would have examined the option of a peeling factor = (8-3)%8 = 5 (a 16-byte vector holds 8 shorts), which aligns both accesses (after 5 peeled iterations the stores start at ia[8] and sa[8], byte offsets 32 and 16) and would allow us to vectorize the loop. So the vectorizer needs to be extended to consider more than one peeling factor.
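
To make the arithmetic above concrete, here is a minimal sketch in C of what "considering more peeling factors" could look like. It is not the vectorizer's code: `struct access`, `peel_for`, `aligned_after` and `common_peel` are hypothetical names made up for illustration, and the sketch assumes that every array base is aligned to the 16-byte vector size and that each store has the form a[i+offset].

```c
#include <assert.h>
#include <stddef.h>

#define VECTOR_SIZE 16u   /* bytes per vector, as in the example above */

/* Hypothetical description of one store of the form a[i + offset].
   Assumption: the base of the array is VECTOR_SIZE-aligned. */
struct access {
    unsigned elem_size;   /* size of one element in bytes */
    unsigned offset;      /* constant added to the loop index */
};

/* Number of iterations to peel so that this access alone becomes aligned. */
static unsigned peel_for(struct access a)
{
    unsigned vf = VECTOR_SIZE / a.elem_size;   /* elements per vector */
    assert(vf > 0);
    return (vf - a.offset % vf) % vf;
}

/* Nonzero if the access is vector-aligned after peeling 'peel' iterations. */
static int aligned_after(struct access a, unsigned peel)
{
    return ((a.offset + peel) * a.elem_size) % VECTOR_SIZE == 0;
}

/* Try the peeling factor suggested by each access in turn, instead of only
   the first one. Return the first factor that aligns all accesses, or -1
   if none of the candidates does. */
static int common_peel(const struct access *acc, size_t n)
{
    for (size_t c = 0; c < n; c++) {
        unsigned peel = peel_for(acc[c]);
        size_t k;
        for (k = 0; k < n; k++)
            if (!aligned_after(acc[k], peel))
                break;
        if (k == n)
            return (int) peel;
    }
    return -1;
}
```

With the two stores from vect-multitypes-1.c described as {4, 3} for 'ia[i+3]' and {2, 3} for 'sa[i+3]', the candidate taken from 'ia' is 1 and is rejected because it leaves 'sa' misaligned, while the candidate taken from 'sa' is 5 and is accepted. A real implementation would also have to weigh the cost of the peeled iterations and handle accesses whose misalignment is unknown at compile time, which this sketch ignores.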