This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Slightly fix up vgather* patterns
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Sun, 9 Oct 2011 12:55:40 +0200
- Subject: Re: [RFC] Slightly fix up vgather* patterns
- References: <20111008154322.GQ19412@tyan-ft48-01.lab.bos.redhat.com>
On Sat, Oct 8, 2011 at 5:43 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> The AVX2 docs say that the insns will #UD if any of the mask, src and index
> registers are the same, but e.g. on
> #include <x86intrin.h>
>
> __m256 m;
> float f[1024];
>
> __m256
> foo (void)
> {
> ?__m256i mi = (__m256i) m;
> ?return _mm256_mask_i32gather_ps (m, f, mi, m, 4);
> }
>
> which is IMHO valid and should for m being zero vector just return a
> zero vector and clear mask (in this case it was already cleared) we compile
> it as
> ? ? ? ?vmovdqa m(%rip), %ymm1
> ? ? ? ?vmovaps %ymm1, %ymm0
> ? ? ? ?vgatherdps ? ? ?%ymm1, (%rax, %ymm1, 4), %ymm0
> and thus IMHO it will #UD. ?Also, the insns should make it clear that
> the mask register is modified too (the patch clobbers it, perhaps
> we could instead say that it zeros the register (which is true if
> it doesn't segfault), but then what if a segfault handler chooses to
> continue with the next insn and doesn't clear the mask register?).
> Still, the insn description is imprecise, saying that it loads from mem
> at the address register is wrong and perhaps some DCE might delete
> what shouldn't be deleted. ?So, either it should (use (mem (scratch)))
> or something similar, or in the unspec list all the memory locations
> that are being read
> (mem:<scalarssemode> (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI)
> (parallel [(const_int N)]))))
> for N 0 through something (but it is complicated by Pmode size vs.
> the need to do nothing/truncate/sign_extend the vec_select to the right
> mode).
>
> What do you think?
Regarding the clear of mask operand: I agree that this should be
modelled as a clobber. Zeroing can't be guaranteed due to the fact you
described above.
About memory - can't we use (mem:BLK (match_operand:P
"register_operand" "r")) here?
BTW: No need to use %c modifier:
/* Meaning of CODE:
L,W,B,Q,S,T -- print the opcode suffix for specified size of operand.
C -- print opcode suffix for set/cmov insn.
c -- like C, but print reversed condition
...
*/
Uros.