This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [RFC] Slightly fix up vgather* patterns


On Sun, Oct 09, 2011 at 12:55:40PM +0200, Uros Bizjak wrote:
> About memory - can't we use (mem:BLK (match_operand:P
> "register_operand" "r")) here?

I don't think it is sufficient.
Consider e.g. _mm_i32gather_pd (NULL, index, 1); where index
is initialized by loading consecutive (32-bit) double * pointers from an
array.  Then for elt 0 through 1 it loads *(double *)(0 + index[elt]).
Describing this as mem:BLK (register initialized to 0) is wrong.
But even with a non-zero base, say if base is a pointer into
the middle of some array and some offsets are positive and some negative,
mem:BLK of the base would only cover non-negative offsets from it.
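
To make the addressing point concrete, here is a minimal scalar sketch in C
of what a 2-element gather with 32-bit indices computes.  gather2_pd_model
is a hypothetical helper for illustration only (it is not an intrinsic and
not anything in GCC), and it assumes a flat address space where converting
pointers through intptr_t behaves as it does on x86:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a 2-element gather of doubles with
   32-bit indices; it only illustrates the address computation.  */
static void
gather2_pd_model (double out[2], const double *base,
                  const int32_t index[2], int scale)
{
  /* The instruction encoding only allows these scale factors.  */
  assert (scale == 1 || scale == 2 || scale == 4 || scale == 8);

  for (int elt = 0; elt < 2; elt++)
    {
      /* Each index is sign-extended, scaled, and added to the base,
         so the accessed address can lie below the base.  */
      intptr_t addr = (intptr_t) base + (intptr_t) index[elt] * scale;
      out[elt] = *(const double *) addr;
    }
}
```

With base = &arr[4], index = { -2, 3 } and scale 8, the model reads arr[2]
and arr[7], i.e. memory on both sides of base, which a mem:BLK starting at
base would not describe.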

OT, seems avx2intrin.h is weird for many of the gather patterns:
E.g. the _mm_i32gather_pd inline uses:
  __v2df src = _mm_setzero_pd ();
  __v2df mask = _mm_cmpeq_pd (src, src);
which works and sets mask to an all-ones floating-point vector, but
e.g. _mm256_i32gather_pd uses
  __v4df src = _mm256_setzero_pd ();
  __v4df mask = _mm256_set1_pd((double)(long long int) -1);
which I believe will create a { -1.0, -1.0, -1.0, -1.0 }; vector.
Either it could be
  __v4df src = _mm256_setzero_pd ();
  __v4df mask = _mm256_cmp_pd (src, src, _CMP_EQ_OQ);
or it would need to be something like
#define __MM_ALL_ONES_DOUBLE \
  (__extension__ ((union { long long int __l; double __d; }) { __l: -1 }).__d)
  __v4df src = _mm256_setzero_pd ();
  __v4df mask = _mm256_set1_pd (__MM_ALL_ONES_DOUBLE);

Though, only the most significant bit of each mask element is used by the
instruction, so perhaps -1.0 is usable too.  It is certainly more
expensive than the _mm256_cmp_pd alternative, though (it needs to be loaded
from memory).  BTW, the expander probably needs some help to emit the code
of the second case for the third case; currently it loads the constant from
memory too.
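
The difference between the two mask constants can be checked in plain C.
This is only a sketch, assuming IEEE 754 binary64 doubles; double_bits,
all_ones_double and mask_msb_set are hypothetical helper names, not anything
from avx2intrin.h:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a double as a 64-bit integer.  */
static uint64_t
double_bits (double d)
{
  uint64_t bits;
  assert (sizeof bits == sizeof d);
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Build a double whose 64 bits are all ones (a NaN), which is what the
   union in __MM_ALL_ONES_DOUBLE is meant to produce.  */
static double
all_ones_double (void)
{
  uint64_t bits = UINT64_MAX;
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* The gather instruction only looks at the most significant bit
   of each mask element.  */
static int
mask_msb_set (double d)
{
  return (int) (double_bits (d) >> 63);
}
```

Here (double)(long long int) -1 is a value conversion, so it yields -1.0
(bits 0xBFF0000000000000) rather than an all-ones pattern; mask_msb_set
returns 1 for both -1.0 and all_ones_double (), which is why either constant
would select every element.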

> BTW: No need to use %c modifier:
> 
> /* Meaning of CODE:
>    L,W,B,Q,S,T -- print the opcode suffix for specified size of operand.
>    C -- print opcode suffix for set/cmov insn.
>    c -- like C, but print reversed condition
>    ...
> */

Ok.

	Jakub

