[Bug target/25413] wrong alignment or incorrect address computation in vectorized code on Pentium 4 SSE

Thu Dec 15 12:42:00 GMT 2005

------- Comment #2 from dorit at il dot ibm dot com  2005-12-15 12:41 -------
The problem is that the vectorizer applies loop-peeling in order to align the
data reference *(m->c+i), and peeling only works correctly if the data is
naturally aligned (aligned on its type size). The vectorizer currently assumes
this blindly, but on the Pentium 4 doubles are not necessarily 64-bit aligned.
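
To illustrate why peeling depends on natural alignment, here is a minimal
sketch. It is not the vectorizer's actual code: peel_count is a hypothetical
helper, and the 16-byte vector size used in the examples is an assumption
(SSE). It searches for a number of scalar iterations after which the address
becomes vector aligned, and reports failure when no such number exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Returns the number of scalar iterations to peel so that
   addr + n * elem_size is a multiple of vec_size, or -1 if
   no number of whole-element steps can reach that alignment. */
static int peel_count(uintptr_t addr, size_t elem_size, size_t vec_size)
{
    assert(elem_size != 0 && vec_size % elem_size == 0);

    size_t vf = vec_size / elem_size;   /* elements per vector */
    for (size_t n = 0; n < vf; n++)
        if ((addr + n * elem_size) % vec_size == 0)
            return (int)n;

    /* addr is not a multiple of elem_size: stepping by whole
       elements never lands on a vec_size boundary. */
    return -1;
}
```

With 8-byte doubles and 16-byte vectors, an address ending in 0x8 needs one
peeled iteration, while an address ending in 0x4 (only 4-byte aligned, which
is possible on the Pentium 4) can never be brought to a 16-byte boundary by
peeling.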

Coincidentally, Devang and I discussed this issue last week, and Devang
committed a patch to the apple-ppc branch that works around the problem
(http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=108214). Devang's patch,
however, will not fix this PR: it disables vectorization when the vectorizer
is able to compute the misalignment and finds that it is not a multiple of
the type size. In this testcase the misalignment is unknown at compile time.

To fix this problem we need to disable loop-peeling in the vectorizer if we
can't prove that the data is naturally aligned. Alternatively, if we can't
prove it either way, we can peel the loop but control the number of iterations
it will execute using a runtime test (i.e. have the prolog loop run for the
entire loop count if at runtime we discover that the data is not naturally
aligned).
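
The runtime-test alternative could look roughly like the sketch below. This
is only an illustration under stated assumptions, not a proposed patch:
prolog_iters, scale and VEC_BYTES are hypothetical names, 16-byte vectors are
assumed, and the "main loop" is plain scalar code standing in for the
vectorized body.

```c
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16   /* assumption: 16-byte (SSE) vectors */

/* Hypothetical helper: how many iterations the scalar prolog loop
   should run for an array of n doubles starting at address addr. */
static size_t prolog_iters(uintptr_t addr, size_t n)
{
    /* Runtime test: if the data is not naturally aligned, peeling
       cannot align it, so the prolog takes the entire loop count. */
    if (addr % sizeof(double) != 0)
        return n;

    size_t vf = VEC_BYTES / sizeof(double);
    size_t peel = ((VEC_BYTES - addr % VEC_BYTES) / sizeof(double)) % vf;
    return peel < n ? peel : n;
}

/* Multiply n doubles by k using prolog / main / epilog loops. */
static void scale(double *p, size_t n, double k)
{
    size_t i = 0;
    size_t pro = prolog_iters((uintptr_t)p, n);

    for (; i < pro; i++)            /* scalar prolog (peeled iterations) */
        p[i] *= k;

    /* Main loop: p + i is VEC_BYTES aligned whenever this loop runs.
       Written as scalar code here in place of the vectorized body. */
    for (; i + 1 < n; i += 2) {
        p[i] *= k;
        p[i + 1] *= k;
    }

    for (; i < n; i++)              /* scalar epilog */
        p[i] *= k;
}
```

When the data turns out not to be naturally aligned, pro equals n, so the
main loop and epilog execute zero iterations and the whole computation runs
in the scalar prolog.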

-- 

dorit at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dorit at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413