[PATCH] x86: Define _mm*_undefined_*

Richard Biener richard.guenther@gmail.com
Tue Mar 18 11:23:00 GMT 2014


On Tue, Mar 18, 2014 at 10:34 AM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
> On 17 Mar 22:18, Ulrich Drepper wrote:
>> On Mon, Mar 17, 2014 at 7:39 AM, Ilya Tocar <tocarip.intel@gmail.com> wrote:
>>
>> > undefined is similar in behavior to setzero, but it also clobbers
>> > flags. Maybe just define it to setzero for now?
>> >
>> >
>> What do you mean by "clobbers flags"?  Do you have an example?
>
> I've used the following example:
>
> #include <x86intrin.h>
>
> extern __inline __m512
> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> _mm512_undefined_ps (void)
> {
>   __m512 __Y;
>   __asm__ ("" : "=x" (__Y));
>   return __Y;
> }

Try the following instead:

extern __inline __m512
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_undefined_ps (void)
{
  __m512 __Y = __Y;
  return __Y;
}
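
To illustrate why an undefined pass-through operand is harmless when the mask is all ones, here is a minimal scalar sketch using GCC's generic vector extension rather than AVX-512. The names v4sf, undefined_v4sf, masked_rcp and full_mask_ignores_passthrough are hypothetical and exist only for this illustration; masked_rcp is a simplified model and not the actual semantics of __builtin_ia32_rcp14ps512_mask. It assumes GCC (or a compatible compiler), where self-initialization is a common idiom for requesting an uninitialized value.

```c
/* Sketch only: every name below is hypothetical, not part of GCC's API.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Self-initialization: under GCC "y = y" is an idiom for deliberately
   leaving y uninitialized.  The contents are indeterminate and must not
   influence any observable result.  */
static inline v4sf
undefined_v4sf (void)
{
  v4sf y = y;
  return y;
}

/* Simplified scalar model of a masked reciprocal: lane i receives
   1/a[i] when bit i of mask is set, otherwise it keeps src[i].  */
static v4sf
masked_rcp (v4sf a, v4sf src, unsigned mask)
{
  v4sf r;
  for (int i = 0; i < 4; i++)
    r[i] = ((mask >> i) & 1) ? 1.0f / a[i] : src[i];
  return r;
}

/* Returns 1 when a full mask yields identical lanes for an undefined
   and a zeroed pass-through operand, 0 otherwise.  */
int
full_mask_ignores_passthrough (void)
{
  v4sf a = { 1.0f, 2.0f, 4.0f, 8.0f };
  v4sf zero = { 0.0f, 0.0f, 0.0f, 0.0f };
  v4sf r1 = masked_rcp (a, undefined_v4sf (), 0xf);
  v4sf r2 = masked_rcp (a, zero, 0xf);
  for (int i = 0; i < 4; i++)
    if (r1[i] != r2[i])
      return 0;
  return 1;
}
```

With the full mask 0xf no lane of src is ever selected, so the undefined and the zeroed variants produce the same result, which mirrors the (__mmask16) -1 case in foo1 and foo2.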

>
> __m512 foo1(__m512 __A)
> {
>   return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
>                                                   (__v16sf)
>                                                   _mm512_undefined_ps (),
>                                                   (__mmask16) -1);
> }
>
> __m512 foo2(__m512 __A)
> {
>   return (__m512) __builtin_ia32_rcp14ps512_mask ((__v16sf) __A,
>                                                   (__v16sf)
>                                                   _mm512_setzero_ps (),
>                                                   (__mmask16) -1);
> }
>
>
> In foo1 the asm statement is expanded into the following RTL:
>
> (insn 6 3 7 2 (parallel [
>             (set (reg:V16SF 87 [ __Y ])
>                 (asm_operands:V16SF ("") ("=x") 0 []
>                      []
>                      [] foo.c:8))
>             (clobber (reg:QI 18 fpsr))
>             (clobber (reg:QI 17 flags))
>         ]) foo.c:8 -1
>
> As you can see, flags are clobbered by the asm statement, while in the
> setzero case (foo2) I have just:
> (insn 7 6 8 2 (set (reg:V16SF 88)
>         (const_vector:V16SF [
>                 (const_double:SF 0.0 [0x0.0p+0])
>                 (const_double:SF 0.0 [0x0.0p+0])
> //rest of zeroes skipped.


