This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/55590] New: SRA still produces unnecessarily unaligned memory accesses


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55590

             Bug #: 55590
           Summary: SRA still produces unnecessarily unaligned memory
                    accesses
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: jamborm@gcc.gnu.org
        ReportedBy: jamborm@gcc.gnu.org


SRA can still produce unaligned memory accesses that should be
aligned when it bases its new scalar accesses on a MEM_REF buried
below COMPONENT_REFs or ARRAY_REFs.

Testcase 1:

/* { dg-do compile } */
/* { dg-options "-O2 -mavx" } */

#include <immintrin.h>

struct S
{
  __m128 a, b;
};

struct T
{
  int a;
  struct S s;
};

void foo (struct T *p, __m128 v)
{
  struct S s;

  s = p->s;
  s.b = _mm_add_ps(s.b, v);
  p->s = s;
}

/* { dg-final { scan-assembler-not "vmovups" } } */

on x86_64 compiles to

        vmovups 32(%rdi), %xmm1
        vaddps  %xmm0, %xmm1, %xmm0
        vmovups %xmm0, 32(%rdi)

even though it should really be

        vaddps  32(%rdi), %xmm0, %xmm0
        vmovaps %xmm0, 32(%rdi)
        ret
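
For reference, here is a quick sketch of the alignment reasoning (not
part of the bug's testcase): struct T inherits the 16-byte alignment
of its __m128 members, and s.b ends up at byte offset 32, so
32(%rdi) is 16-byte aligned for any pointer to a valid struct T and
the aligned form above is legal.

#include <stddef.h>
#include <immintrin.h>

struct S { __m128 a, b; };

struct T
{
  int a;
  struct S s;
};

/* struct T is 16-byte aligned because of its __m128 members...  */
_Static_assert (_Alignof (struct T) == 16,
                "struct T is 16-byte aligned");
/* ...and s.b sits at offset 16 + 16 = 32, a multiple of 16, so an
   aligned load/store (vmovaps or a memory operand) is valid there.  */
_Static_assert (offsetof (struct T, s.b) == 32,
                "p->s.b is at a 16-byte-aligned offset");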



Testcase 2 (which shows why this needs to be fixed differently from
the recent IPA-SRA patch, because of the variable array index):

/* { dg-do compile } */
/* { dg-options "-O2 -mavx" } */

#include <immintrin.h>

struct S
{
  __m128 a, b;
};

struct T
{
  int a;
  struct S s[8];
};

void foo (struct T *p, int i, __m128 v)
{
  struct S s;

  s = p->s[i];
  s.b = _mm_add_ps(s.b, v);
  p->s[i] = s;
}

/* { dg-final { scan-assembler-not "vmovups" } } */

Compiles to

        movslq  %esi, %rsi
        salq    $5, %rsi
        leaq    16(%rdi,%rsi), %rax
        vmovups 16(%rax), %xmm1
        vaddps  %xmm0, %xmm1, %xmm0
        vmovups %xmm0, 16(%rax)
        ret

when it should produce

        movslq  %esi, %rsi
        salq    $5, %rsi
        leaq    16(%rdi,%rsi), %rax
        vaddps  16(%rax), %xmm0, %xmm0
        vmovaps %xmm0, 16(%rax)
        ret
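
The alignment reasoning still holds with the variable index, which is
why a fix based only on a constant offset is not enough: the array
p->s[] starts at a 16-byte-aligned offset and its 32-byte element
stride is a multiple of 16, so &p->s[i].b is 16-byte aligned for
every i even though the offset is not a compile-time constant.  A
small sketch of that, again not part of the testcase:

#include <stddef.h>
#include <immintrin.h>

struct S { __m128 a, b; };

struct T
{
  int a;
  struct S s[8];
};

/* The array base is 16-byte aligned and the 32-byte stride is a
   multiple of 16, so indexing never disturbs the alignment of b.  */
_Static_assert (offsetof (struct T, s) % 16 == 0
                && sizeof (struct S) % 16 == 0
                && offsetof (struct S, b) % 16 == 0,
                "p->s[i].b is 16-byte aligned for every i");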

I'm testing a patch.

