This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug target/81635] nvptx SLP test cases regressions


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81635

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
I.

The test-case slp.c (minus dg-final checks) looks like this:
...
/* { dg-options "-O2 -ftree-slp-vectorize" } */
int p[1000] __attribute__((aligned(8)));
int p2[1000] __attribute__((aligned(8)));

void __attribute__((noinline, noclone))
foo ()
{
  unsigned int a, b;

  unsigned int i;
  for (i = 0; i < 1000; i += 2)
    {
      a = p[i];
      b = p[i+1];

      p2[i] = a;
      p2[i+1] = b;
    }
}
...

Changing the type of the loop iteration variable 'i' from 'unsigned int' to
'int' makes the slp.c test pass again.
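
For reference, the passing variant described above differs from the test-case
only in the declaration of 'i'. A sketch of it, with everything else carried
over unchanged from slp.c as quoted (the dg-final checks are still omitted):

```c
/* { dg-options "-O2 -ftree-slp-vectorize" } */
int p[1000] __attribute__((aligned(8)));
int p2[1000] __attribute__((aligned(8)));

void __attribute__((noinline, noclone))
foo ()
{
  unsigned int a, b;

  /* Only change relative to slp.c: 'int' instead of 'unsigned int'.  */
  int i;
  for (i = 0; i < 1000; i += 2)
    {
      a = p[i];
      b = p[i+1];

      p2[i] = a;
      p2[i+1] = b;
    }
}
```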


II.

With int, both data references have the same 'offset from base address', and
their 'constant offset from base address' values are 0 and 4:
...
Creating dr for p[i_13]
analyze_innermost: success.
        base_address: &p
        offset from base address: (ssizetype) ((sizetype) i_13 * 4)
        constant offset from base address: 0
        step: 0
        base alignment: 8
        base misalignment: 0
        offset alignment: 8
        step alignment: 128
        base_object: p[i_13]
Creating dr for p[_1]
analyze_innermost: success.
        base_address: &p
        offset from base address: (ssizetype) ((sizetype) i_13 * 4)
        constant offset from base address: 4
        step: 0
        base alignment: 8
        base misalignment: 0
        offset alignment: 8
        step alignment: 128
        base_object: p[_1]
...

resulting in:
...
gcc/testsuite/gcc.target/nvptx/slp.c:13:3: note: Detected interleaving load
p[i_13] and p[_1]
...


III.

With unsigned int, the two data references have different 'offset from base
address' values, and both have a 'constant offset from base address' of 0
(note that _1 == i_13 + 1):
...
Creating dr for p[i_13]
analyze_innermost: success.
        base_address: &p
        offset from base address: (ssizetype) ((sizetype) i_13 * 4)
        constant offset from base address: 0
        step: 0
        base alignment: 8
        base misalignment: 0
        offset alignment: 8
        step alignment: 128
        base_object: p[i_13]
Creating dr for p[_1]
analyze_innermost: success.
        base_address: &p
        offset from base address: (ssizetype) ((sizetype) _1 * 4)
        constant offset from base address: 0
        step: 0
        base alignment: 8
        base misalignment: 0
        offset alignment: 4
        step alignment: 128
        base_object: p[_1]
...

resulting in:
...
gcc/testsuite/gcc.target/nvptx/slp.c:13:3: note: not consecutive access b_6 =
p[_1];
gcc/testsuite/gcc.target/nvptx/slp.c:13:3: note: not consecutive access a_5 =
p[i_13];
...


IV.

On x86_64 -m32, the test-case is not vectorized (reason: 'unrolling required in
basic block SLP'), but the interleaving load is recognized, both with int and
unsigned int.

On x86_64 -m64, we have:
- for int, detected interleaving load, but test-case not vectorized (reason:
  'unrolling required in basic block SLP')
- for unsigned int, the interleaving load is not detected, just as in III.
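
The comment does not say why the two types behave differently. One plausible
reading, offered here as an assumption and not as the confirmed cause, is that
unsigned 32-bit addition wraps. With a 64-bit sizetype, '(sizetype) (i + 1) * 4'
is then not always equal to '(sizetype) i * 4 + 4', so the '+ 4' cannot be split
off as a constant offset. With a 32-bit sizetype the two expressions always
agree. The sketch below illustrates only the arithmetic; the function names are
hypothetical and none of this is GCC code:

```c
#include <stdint.h>

/* 64-bit sizetype: add 1 in 32-bit unsigned arithmetic (wraps modulo 2^32),
   then widen and scale by the element size.  */
uint64_t
offset_add_then_widen (uint32_t i)
{
  uint32_t next = i + 1u;
  return (uint64_t) next * 4u;
}

/* 64-bit sizetype: widen first, then scale and add one element.  This is the
   form that exposes a constant offset of 4.  */
uint64_t
offset_widen_then_add (uint32_t i)
{
  return (uint64_t) i * 4u + 4u;
}

/* 32-bit sizetype: everything wraps modulo 2^32, so both forms agree.  */
uint32_t
offset32_add_then_widen (uint32_t i)
{
  return (uint32_t) (i + 1u) * 4u;
}

uint32_t
offset32_widen_then_add (uint32_t i)
{
  return i * 4u + 4u;
}
```

For i == UINT32_MAX the two 64-bit forms differ (0 versus 17179869184), while
the two 32-bit forms both yield 0.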
