Bug 31946 - missed vectorization due to too strict peeling-for-alignment policy
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: vectorizer
Reported: 2007-05-16 09:50 UTC by Dorit Naishlos
Modified: 2021-07-21 02:39 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Dorit Naishlos 2007-05-16 09:50:57 UTC
The vectorizer is too restrictive in how it decides the number of iterations to peel from a loop in order to align a memory reference in that loop. It considers only the first (potentially) misaligned store it encounters in the loop. For this reason the testcases vect-multitypes-1.c, vect-multitypes-4.c and vect-iv-4.c don't get vectorized. For example (assuming a vector size of 16 bytes), in vect-multitypes-1.c we have:

short sa[N], sb[N];
int ia[N], ib[N];
for (i = 0; i < n; i++) {
    ia[i+3] = ib[i];
    sa[i+3] = sb[i];
}

The current peeling-for-alignment scheme considers only the 'ia[i+3]' access for peeling, and therefore examines only a peeling factor of (4-3)%4 = 1 (a 16-byte vector holds 4 ints). Peeling one iteration does not align the 'sa[i+3]' access, for which we need to peel 5 iterations. As a result the loop doesn't get vectorized, because we currently can't handle misaligned stores unless we align them by peeling. However, if we had also considered the 'sa[i+3]' access for peeling, we would have examined a peeling factor of (8-3)%8 = 5 (a 16-byte vector holds 8 shorts), which would align both accesses and would allow us to vectorize the loop. So the vectorizer needs to be extended to consider several candidate peeling factors, not just one.
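
To make the arithmetic above concrete, here is a minimal C sketch of the proposed policy: try the peeling factor suggested by each access, and keep the first one that aligns every access. This is only an illustration, not the vectorizer's actual code. The names struct access, peel_factor, aligned_after and common_peel_factor are hypothetical, and the sketch assumes that every array base is itself aligned to the 16-byte vector size.

```c
#include <stddef.h>

#define VECTOR_SIZE 16u  /* vector size in bytes, as in the report */

/* One access of the form base[i + offset].
   Hypothetical type for this sketch; the array base is assumed
   to be aligned to VECTOR_SIZE. */
struct access {
    unsigned elem_size;  /* element size in bytes */
    unsigned offset;     /* constant index offset */
};

/* Number of iterations to peel so that this access alone starts
   on a vector boundary, e.g. (4-3)%4 = 1 for 4-byte elements and
   offset 3. */
unsigned peel_factor(struct access a)
{
    unsigned nelems = VECTOR_SIZE / a.elem_size;  /* elements per vector */
    return (nelems - a.offset % nelems) % nelems;
}

/* Nonzero if the access starts on a vector boundary once `peel`
   iterations have been peeled off. */
int aligned_after(struct access a, unsigned peel)
{
    return ((a.offset + peel) * a.elem_size) % VECTOR_SIZE == 0;
}

/* Try the peeling factor of each access in turn and return the
   first one that aligns all n accesses, or -1 if none does. */
int common_peel_factor(const struct access *accs, size_t n)
{
    for (size_t c = 0; c < n; c++) {
        unsigned peel = peel_factor(accs[c]);
        size_t k;
        for (k = 0; k < n; k++)
            if (!aligned_after(accs[k], peel))
                break;
        if (k == n)
            return (int)peel;
    }
    return -1;
}
```

For the loop in this report, describing 'ia[i+3]' as {4, 3} and 'sa[i+3]' as {2, 3}, the first candidate (1, from the int access) is rejected because it leaves the short access misaligned, and the second candidate (5, from the short access) is accepted because it aligns both.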