This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: _mm{,256}_i{32,64}gather_{ps,pd,epi32,epi64} intrinsics semantics


On Wed, Nov 02, 2011 at 01:10:14PM +0400, Kirill Yukhin wrote:
> Actually I did not get the point.
> If we have no src/masking, the destination must be unchanged until the
> gather writes to it (at least partially).
> If we have all 1's in the mask, src must not be changed at all.
> So, nullification in the intrinsics is just useless.
> Having such snippet:
> (1)       vmovdqa k(%rax,%rax), %ymm1
> (2)       vmovaps %ymm0, %ymm6
> (3)       vmovaps %ymm0, %ymm2
> (4)       vmovdqa k+32(%rax,%rax), %ymm3
> (5)       vgatherdps      %ymm6, vf1(,%ymm1,4), %ymm2
> 
> Looks pretty strange. Which value does ymm0 have? If it is all zeroes, then
> (1)-(5) is dead code, which may just be removed.
> If it contains all 1s then (2) is useless.

%ymm0 is all ones (this is code from the auto-vectorizer).
(2) is not useless: %ymm6 contains the mask, and the gather insn
clobbers its mask operand.  For auto-vectorization (3) is useless; it is
there only because the current gather insn patterns always use the
previous value of the destination register.
If the vgatherdps above doesn't segfault, the whole register will
be overwritten, and if it does segfault, nothing anywhere says that
the scalar code was supposed to be vectorized through vgatherdps or
what the destination register should contain.

My question was about the intrinsics.  If the user writes something like
the proglet below, can he have any expectations about the content of the
destination register of the vpgather* insn that crashed (e.g. if the
segfault handler decides to skip the vpgather* insn and longjmps
to the next insn)?  Currently 0 would be put there, because avx2intrin.h
passes src { 0, 0 ... } and mask { -1, -1 ... } there.

#define _GNU_SOURCE
#include <stdlib.h>
#include <signal.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/ucontext.h>
#include <x86intrin.h>

__m256i a, b;
long long c[3] = { 64, 65, 66 };

void
segv (int signum, siginfo_t *info, void *ctx)
{
  ucontext_t *uc = (ucontext_t *) ctx;
  /* gregs is an array; index it directly to get the faulting insn's address.  */
  unsigned char *eip = (unsigned char *) uc->uc_mcontext.gregs[REG_RIP];
  printf ("%p\n", (void *) eip);
  exit (0);
}

int
main ()
{
  struct sigaction sa;
  sa.sa_sigaction = segv;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_SIGINFO;
  if (sigaction (SIGSEGV, &sa, NULL) != 0)
    return 1;
  b = _mm256_set_epi64x ((uintptr_t) & c[0], (uintptr_t) & c[1],
                         (uintptr_t) NULL, (uintptr_t) & c[2]);
  a = _mm256_i64gather_epi64 (NULL, b, 1);
  printf ("%llx %llx %llx %llx\n",
          ((long long *) &a)[0], ((long long *) &a)[1],
          ((long long *) &a)[2], ((long long *) &a)[3]);
  return 0;
}

BTW, sde (Intel's Software Development Emulator) doesn't seem to behave
here as documented for the insn:
TID0: Read 0x42 = *(UINT64*)0x6009f0
TID0: Read 0x42 = *(UINT64*)0
TID0: Read 0x41 = *(UINT64*)0x6009e8
TID0: Read 0x40 = *(UINT64*)0x6009e0
TID0: INS 0x0000000000400523             vpgatherqq ymm0, qword ptr [rax+ymm1*1], ymm2
TID0:   YMM0 := 00000000_00000040_00000000_00000041
               _00000000_00000042_00000000_00000042
Or did I misunderstand the documentation and the insn isn't supposed
to segfault?

And if the user can't expect anything in the register because
the intrinsic doesn't even have any src/mask arguments,
what if
  a = _mm256_i64gather_epi64 (NULL, b, 1);
in the testcase is replaced with:
  __m256i d, e;
  d = _mm256_set_epi64x (1, 2, 3, 4);
  e = _mm256_set_epi64x (-1, -1, -1, -1);
  a = _mm256_mask_i64gather_epi64 (d, NULL, b, e, 1);
Again, does the intrinsic (as opposed to the hw insn) make any guarantees
about what will be in the register after the segfault?  Does the
compiler have to load the destination register of the vpgather* insn with
the { 1LL, 2LL, 3LL, 4LL } vector before the insn, or is it free to
optimize that away since it can see that the mask loads all values?

Can you ask what ICC does here and what the intrinsics semantics
should be?

	Jakub

