Bug 38011 - vectorizer ignores alignment, useless versioning
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.3.2
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
 
Reported: 2008-11-04 14:58 UTC by David Monniaux
Modified: 2013-03-27 12:35 UTC

See Also:
Host: i486-linux-gnu
Target: i486-linux-gnu
Build: i486-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Description David Monniaux 2008-11-04 14:58:30 UTC
When compiling:

void assignMultiplyVec(double* restrict __attribute__ ((aligned (16))) a, const double * restrict __attribute__ ((aligned (16))) b, double coef, unsigned count) {
  for(unsigned i=0; i<count; i++) {
    a[i] = b[i]*coef;
  }
}

Using: gcc -std=c99 -march=core2 -mtune=core2 -O3 -mfpmath=sse -ftree-vectorize -ftree-vectorizer-verbose=9

The logs show:
essai_restrict_ref.c:2: note: Alignment of access forced using versioning.
essai_restrict_ref.c:2: note: Versioning for alignment will be applied.
essai_restrict_ref.c:2: note: Vectorizing an unaligned access.
and indeed the assembly code contains a runtime test for whether the operands are 16-byte aligned.

This versioning is superfluous, since variable attributes guarantee 16-byte alignment.
Comment 1 Richard Biener 2012-07-13 08:56:00 UTC
Link to vectorizer missed-optimization meta-bug.
Comment 2 Richard Biener 2013-03-27 12:35:44 UTC
Not true - you aligned the pointer, not the data it points to.  There isn't
a good way to do that with an aligned attribute; the closest you can get
is with

typedef double aligned_double __attribute__((aligned (16)));
void assignMultiplyVec(aligned_double* restrict a, aligned_double* restrict b,
                       double coef, unsigned count)
{
  for(unsigned i=0; i<count; i++) {
      a[i] = b[i]*coef;
  }
}

but that has the issue that a[1] is not actually 16-byte aligned, even
though the type still claims it is (the issue is that an array has no gaps
between its elements according to the C standard, but here the alignment of
the element type is bigger than its size ...).
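
The size/alignment mismatch described above can be checked directly. The
following is a minimal sketch, assuming GCC (or a compatible compiler) on a
target where double is 8 bytes; the helper names element_stride and
claimed_alignment are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same typedef as in the comment above: a double whose type claims
   16-byte alignment. */
typedef double aligned_double __attribute__((aligned (16)));

/* Hypothetical helper: distance in bytes between a[i] and a[i+1] when
   indexing through an aligned_double pointer. The attribute does not
   change the size of the type. */
static size_t element_stride(void)
{
    return sizeof(aligned_double);
}

/* Hypothetical helper: alignment the type claims for every element
   (__alignof__ is a GNU extension). */
static size_t claimed_alignment(void)
{
    return __alignof__(aligned_double);
}
```

With an 8-byte stride and a claimed alignment of 16, every second element
reached through such a pointer sits at an address that is only 8-byte
aligned, which is why the typedef approach is not a sound way to express the
guarantee.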

So instead we now have an assume_aligned builtin which you can use like

void assignMultiplyVec(double* restrict a_, const double * restrict b_,
                       double coef, unsigned count)
{
  double* restrict a = __builtin_assume_aligned (a_, 16);
  const double* restrict b = __builtin_assume_aligned (b_, 16);
  for(unsigned i=0; i<count; i++) {
    a[i] = b[i]*coef;
  }
}

which does not have this issue.
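
For completeness, here is a sketch of how a caller might satisfy the promise
made by __builtin_assume_aligned. It assumes a C11 library that provides
aligned_alloc; the function names assign_multiply_vec and scaled_sum are
hypothetical, and the assert is an extra debug-time check, not something the
builtin requires. Passing a pointer that is not actually 16-byte aligned
would be undefined behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same loop as above; the builtin tells the compiler that both pointers
   are 16-byte aligned. */
static void assign_multiply_vec(double *restrict a_, const double *restrict b_,
                                double coef, unsigned count)
{
    /* Debug-time check that the caller kept the alignment promise. */
    assert((uintptr_t)a_ % 16 == 0 && (uintptr_t)b_ % 16 == 0);

    double *restrict a = __builtin_assume_aligned(a_, 16);
    const double *restrict b = __builtin_assume_aligned(b_, 16);
    for (unsigned i = 0; i < count; i++) {
        a[i] = b[i] * coef;
    }
}

/* Hypothetical driver: fills b with 0, 1, ..., count-1, scales it into a,
   and returns the sum of a. Returns -1.0 if allocation fails. */
static double scaled_sum(unsigned count, double coef)
{
    /* aligned_alloc wants a size that is a multiple of the alignment. */
    size_t bytes = ((size_t)count * sizeof(double) + 15) / 16 * 16;
    if (bytes == 0) {
        bytes = 16;
    }

    double *a = aligned_alloc(16, bytes);
    double *b = aligned_alloc(16, bytes);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1.0;
    }

    for (unsigned i = 0; i < count; i++) {
        b[i] = (double)i;
    }
    assign_multiply_vec(a, b, coef, count);

    double sum = 0.0;
    for (unsigned i = 0; i < count; i++) {
        sum += a[i];
    }
    free(a);
    free(b);
    return sum;
}
```

Whether the vectorizer then drops the alignment versioning depends on the
GCC version and flags in use; the vectorizer's diagnostic output is the place
to confirm it.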