This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/39075] alignment for "unsigned short a[10000]" vs "extern unsigned short a[10000]"
- From: "dann at godzilla dot ics dot uci dot edu" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 2 Feb 2009 14:50:01 -0000
- Subject: [Bug tree-optimization/39075] alignment for "unsigned short a[10000]" vs "extern unsigned short a[10000]"
- References: <bug-39075-1008@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #1 from dann at godzilla dot ics dot uci dot edu 2009-02-02 14:50 -------
This code:
    unsigned short a[10000];
    void test()
    {
      int i;
      for (i = 0; i < 10000; ++i) a[i] = 5;
    }
will be vectorized with -O3 -march=core2 to this:
    .L2:
            movdqa  %xmm0, a(%eax)
            addl    $16, %eax
            cmpl    $20000, %eax
            jne     .L2
but this one:
    extern unsigned short a[10000];
    void test()
    {
      int i;
      for (i = 0; i < 10000; ++i) a[i] = 5;
    }
will get a lot of extra code before the loop, because the vectorizer thinks it
needs to peel iterations to align the access:
test.c:7: note: Alignment of access forced using peeling.
Intel's compiler does not generate the extra peeling code.
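
To make the vectorizer note concrete, here is a minimal C sketch of what
peeling for alignment amounts to. It illustrates the general technique and is
not GCC's generated code. The names peel_count and fill_peeled are made up for
this example, the 16-byte width is taken from the movdqa store in the
vectorized loop, and the pointer is assumed to be at least 2-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_BYTES 16                                    /* one movdqa store */
#define LANES (VEC_BYTES / sizeof(unsigned short))      /* 8 elements       */

/* Hypothetical helper: how many leading elements must be stored one at a
   time before p + count sits on a 16-byte boundary (capped at n).
   Assumes p is at least 2-byte aligned. */
size_t peel_count(const unsigned short *p, size_t n)
{
    size_t misalign = (uintptr_t)p % VEC_BYTES;         /* bytes past boundary */
    size_t peel = misalign ? (VEC_BYTES - misalign) / sizeof *p : 0;
    return peel < n ? peel : n;
}

/* Hypothetical illustration of a peeled loop: scalar prologue, aligned
   main loop, scalar epilogue. */
void fill_peeled(unsigned short *p, size_t n, unsigned short value)
{
    size_t i = 0;
    size_t peel = peel_count(p, n);

    /* Prologue: the "extra code before the loop". */
    for (; i < peel; ++i)
        p[i] = value;

    /* Main loop: each group of 8 elements starts at an aligned address,
       so a vectorizer could emit one aligned 16-byte store per group. */
    for (; i + LANES <= n; i += LANES) {
        assert((uintptr_t)(p + i) % VEC_BYTES == 0);
        for (size_t j = 0; j < LANES; ++j)
            p[i + j] = value;
    }

    /* Epilogue: elements left over after the last full group. */
    for (; i < n; ++i)
        p[i] = value;
}
```

When the array is defined in the same file, the compiler can see (or choose)
its alignment, so the prologue is presumably known to be empty at compile
time. For the extern declaration the equivalent of peel_count has to be
computed at run time, which would account for the extra code before the loop.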
--
dann at godzilla dot ics dot uci dot edu changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|alignment for "unsigned     |alignment for "unsigned
                   |short a[10000               |short a[10000]" vs "extern
                   |                            |unsigned short a[10000]"
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075