[Bug tree-optimization/39075] alignment for "unsigned short a[10000]" vs "extern unsigned short a[10000]"
dann at godzilla dot ics dot uci dot edu
gcc-bugzilla@gcc.gnu.org
Mon Feb 2 14:50:00 GMT 2009
------- Comment #1 from dann at godzilla dot ics dot uci dot edu 2009-02-02 14:50 -------
This code:
unsigned short a[10000];
void test()
{
  int i;
  for (i = 0; i < 10000; ++i) a[i] = 5;
}
will be vectorized with -O3 -march=core2 to this:
.L2:
        movdqa  %xmm0, a(%eax)
        addl    $16, %eax
        cmpl    $20000, %eax
        jne     .L2
but this one:
extern unsigned short a[10000];
void test()
{
  int i;
  for (i = 0; i < 10000; ++i) a[i] = 5;
}
gets a lot of extra code before the loop, because the vectorizer thinks it
needs to peel iterations to align the access:
test.c:7: note: Alignment of access forced using peeling.
Intel's compiler does not generate the extra peeling code.
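
For readers unfamiliar with the term, peeling for alignment can be sketched
roughly as below. This is only an illustration in plain C, not the code GCC
emits: the helper names peel_count and fill_peeled are made up for this
sketch, and a 16-byte vector (8 unsigned shorts) is assumed to match the
movdqa store shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many leading elements must be handled by scalar
   code before &p[i] is aligned to `align` bytes. `align` must be a power of
   two, and p is assumed to be at least 2-byte aligned. */
size_t peel_count(const unsigned short *p, size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t misalign = (size_t)((uintptr_t)p & (align - 1));
    if (misalign == 0)
        return 0;                       /* already aligned, nothing to peel */
    size_t peel = (align - misalign) / sizeof *p;
    return peel < n ? peel : n;         /* never peel past the end */
}

/* Hypothetical illustration of the loop structure after peeling. */
void fill_peeled(unsigned short *p, size_t n, unsigned short v)
{
    size_t i = 0;
    size_t peel = peel_count(p, n, 16);

    for (; i < peel; ++i)               /* prologue: scalar, until aligned */
        p[i] = v;
    for (; i + 8 <= n; i += 8)          /* main loop: 16 bytes per iteration */
        for (size_t j = 0; j < 8; ++j)  /* stands in for one aligned store */
            p[i + j] = v;
    for (; i < n; ++i)                  /* epilogue: leftover elements */
        p[i] = v;
}
```

When the array is defined in the same file, the compiler can see (or choose)
its alignment, so the prologue is unnecessary. The extra code in the extern
case corresponds to the prologue and the run-time computation of its trip
count.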
--
dann at godzilla dot ics dot uci dot edu changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|alignment for "unsigned     |alignment for "unsigned
                   |short a[10000               |short a[10000]" vs "extern
                   |                            |unsigned short a[10000]"
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075