[Bug tree-optimization/39075] alignment for "unsigned short a[10000]" vs "extern unsigned short a[10000]"

dann at godzilla dot ics dot uci dot edu gcc-bugzilla@gcc.gnu.org
Mon Feb 2 14:50:00 GMT 2009



------- Comment #1 from dann at godzilla dot ics dot uci dot edu  2009-02-02 14:50 -------
This code:
unsigned short a[10000];
void test()
{
  int i;
  for (i = 0; i < 10000; ++i) a[i] = 5;
}

is vectorized with -O3 -march=core2 into this loop:

.L2:
        movdqa  %xmm0, a(%eax)
        addl    $16, %eax
        cmpl    $20000, %eax
        jne     .L2


but this one:

extern unsigned short a[10000];

void test()
{
  int i;
  for (i = 0; i < 10000; ++i) a[i] = 5;
}

gets a lot of extra code before the loop, because the vectorizer decides it
has to peel the loop for alignment:
test.c:7: note: Alignment of access forced using peeling.
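
For readers unfamiliar with the term, the shape of "peeling for alignment" can be sketched in plain C. This is only an illustration, not the code GCC emits: the function name fill_peeled and its parameters are hypothetical, and the inner 8-element loop merely stands in for a single aligned 16-byte store such as movdqa.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the bug report: stores `value` into
   p[0..n-1] using the prologue / main body / epilogue structure that
   peeling for alignment produces. */
void fill_peeled(unsigned short *p, size_t n, unsigned short value)
{
    size_t i = 0;

    /* Peeled prologue: scalar stores until p + i sits on a 16-byte
       boundary (or the elements run out). */
    while (i < n && ((uintptr_t)(p + i) % 16) != 0)
        p[i++] = value;

    /* Main body: each outer iteration covers 8 unsigned shorts
       (16 bytes), which stands in for one aligned vector store. */
    for (; i + 8 <= n; i += 8)
        for (size_t k = 0; k < 8; ++k)
            p[i + k] = value;

    /* Epilogue: leftover elements that do not fill a whole vector. */
    for (; i < n; ++i)
        p[i] = value;
}
```

The prologue and epilogue are the extra code; when the array is known to start on a 16-byte boundary and the trip count is a multiple of 8, as with the 10000-element array defined in the same file, only the main body remains.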

Intel's compiler does not generate the extra peeling code.
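
One way to give the compiler the alignment information explicitly is to put GCC's aligned attribute on both the declaration and the definition. The sketch below is an assumption about a possible workaround, not something confirmed in this report: the declaration would normally live in a shared header and the definition in another file (they are shown together here so the example is self-contained), the function is renamed fill_a, and a_is_16_byte_aligned is a hypothetical helper added only for checking. Whether a particular GCC release then drops the peeling should be verified against the vectorizer's notes.

```c
#include <stdint.h>

/* Declaration as it might appear in a shared header: the attribute
   states that the array starts on a 16-byte boundary. */
extern unsigned short a[10000] __attribute__((aligned(16)));

/* Definition, normally in a different translation unit. It carries
   the same attribute so the stated alignment is actually honoured. */
unsigned short a[10000] __attribute__((aligned(16)));

/* Same loop as test() in the report. */
void fill_a(void)
{
    int i;
    for (i = 0; i < 10000; ++i)
        a[i] = 5;
}

/* Hypothetical helper: returns 1 if the array start is 16-byte aligned. */
int a_is_16_byte_aligned(void)
{
    return ((uintptr_t)a % 16) == 0;
}
```

The attribute must appear on the definition as well; stating an alignment on the declaration that the definition does not guarantee would make aligned stores such as movdqa fault at run time.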


-- 

dann at godzilla dot ics dot uci dot edu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|alignment for "unsigned     |alignment for "unsigned
                   |short a[10000               |short a[10000]" vs "extern
                   |                            |unsigned short a[10000]"


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39075