This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] x86: Define _mm*_undefined_*
- From: Ilya Tocar <tocarip dot intel at gmail dot com>
- To: Ulrich Drepper <drepper at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 18 Mar 2014 13:34:28 +0400
- Subject: Re: [PATCH] x86: Define _mm*_undefined_*
On 17 Mar 22:18, Ulrich Drepper wrote:
> On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
>
> > undefined is similar in behavior to setzero, but it also clobbers
> > flags. Maybe just define it to setzero for now?
> >
> >
> What do you mean by "clobbers flags"? Do you have an example?
I used the following example:

#include <x86intrin.h>

extern __inline __m512
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_undefined_ps (void)
{
  __m512 __Y;
  __asm__ ("" : "=x" (__Y));
  return __Y;
}

__m512 foo1(__m512 __A)
{
  return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
                                                  (__v16sf)
                                                  _mm512_undefined_ps (),
                                                  (__mmask16) -1);
}

__m512 foo2(__m512 __A)
{
  return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
                                                  (__v16sf)
                                                  _mm512_setzero_ps (),
                                                  (__mmask16) -1);
}

In foo1, the asm statement is expanded into the following RTL:

(insn 6 3 7 2 (parallel [
            (set (reg:V16SF 87 [ __Y ])
                (asm_operands:V16SF ("") ("=x") 0 []
                     []
                     [] foo.c:8))
            (clobber (reg:QI 18 fpsr))
            (clobber (reg:QI 17 flags))
        ]) foo.c:8 -1

As you can see, the flags are clobbered by the asm statement, while in
the setzero case (foo2) I have just:

(insn 7 6 8 2 (set (reg:V16SF 88)
        (const_vector:V16SF [
                (const_double:SF 0.0 [0x0.0p+0])
                (const_double:SF 0.0 [0x0.0p+0])
                //rest of zeroes skipped.
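
For illustration only, the "define it to setzero for now" idea quoted at the top of this message could be sketched roughly as below. This is not the patch under review. The names v4sf, my_setzero_ps, my_undefined_ps and all_lanes_zero are hypothetical, and a 128-bit GCC/Clang vector type stands in for __m512 so that the sketch builds without AVX-512 support. The point is only that a setzero-based definition needs no asm statement, so it carries no flags clobber.

```c
#include <assert.h>

/* Hypothetical 128-bit stand-in for __m512, built on the GCC/Clang
   vector extension so the sketch does not need AVX-512.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Stand-in for a setzero intrinsic: every lane is 0.0f.  */
static inline v4sf
my_setzero_ps (void)
{
  return (v4sf) { 0.0f, 0.0f, 0.0f, 0.0f };
}

/* "undefined" defined as setzero.  Callers must not rely on the
   value, so zero is a valid choice, and no asm statement is used.  */
static inline v4sf
my_undefined_ps (void)
{
  return my_setzero_ps ();
}

/* Return 1 if every lane of V is zero, 0 otherwise.  */
static int
all_lanes_zero (v4sf v)
{
  for (int i = 0; i < 4; i++)
    if (v[i] != 0.0f)
      return 0;
  return 1;
}

/* Small self-check that can be called from a test driver.  */
static void
self_check (void)
{
  assert (all_lanes_zero (my_undefined_ps ()));
}
```

The trade-off of this approach is that the compiler may emit an instruction to zero the register even when the value is never really used, which is what a true "undefined" value is meant to avoid.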