This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] x86: Define _mm*_undefined_*
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Ulrich Drepper <drepper at gmail dot com>
- Cc: Ilya Tocar <tocarip dot intel at gmail dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 18 Mar 2014 16:09:09 +0100
- Subject: Re: [PATCH] x86: Define _mm*_undefined_*
- References: <87bnx6z9at dot fsf at x240 dot local dot i-did-not-set--mail-host-address--so-tickle-me> <20140317113921 dot GA71172 at msticlxl7 dot ims dot intel dot com> <CAOPLpQebBsPYU_eRLmkVtFJixGg8Kx33sMyZg9n_X8MKJNfV_A at mail dot gmail dot com> <20140318093428 dot GB71172 at msticlxl7 dot ims dot intel dot com> <CAFiYyc1NTh-CV8f1e=9xJBhDx3esZEBtPHWHRiJnAyUWzzhS0Q at mail dot gmail dot com> <CAOPLpQc+Zc=A483teheyXLV2y7O=OEGHwwQCUb4nwwT9Dt2f=g at mail dot gmail dot com>
On Tue, Mar 18, 2014 at 4:03 PM, Ulrich Drepper <drepper@gmail.com> wrote:
> On Tue, Mar 18, 2014 at 7:13 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> extern __inline __m512
>> __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
>> _mm512_undefined_ps (void)
>> {
>>   __m512 __Y = __Y;
>>   return __Y;
>> }
>
>
> This provokes no warnings (as you wrote) and it doesn't clobber flags,
> but it doesn't avoid loading. The code below creates a pxor for the
> parameter. That's what I think compiler support should help to get
> rid of. If the compiler has some magic to recognize -1 masks then
> this will help in some situations but it seems to be a specific
> implementation for the intrinsics while I've been looking at a generic
> solution.
>
>
> typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
>
> void g(__m128d);
>
> extern __inline __m128d
> __attribute__((__gnu_inline__, __always_inline__, __artificial__, const))
> _mm_undefined_pd(void) {
>   __m128d v = v;
>   return v;
> }
>
> void
> f()
> {
>   g(_mm_undefined_pd());
> }
The load of zero (the pxor) is caused by the init-regs pass. To quote:
/* Check all of the uses of pseudo variables. If any use that is MUST
   uninitialized, add a store of 0 immediately before it. For
   subregs, this makes combine happy. For full word regs, this makes
   other optimizations, like the register allocator and the reg-stack
   happy as well as papers over some problems on the arm and other
   processors where certain isa constraints cannot be handled by gcc.
   These are of the form where two operands to an insn my not be the
   same. The ra will only make them the same if they do not
   interfere, and this can only happen if one is not initialized.
   There is also the unfortunate consequence that this may mask some
   buggy programs where people forget to initialize stack variable.
   Any programmer with half a brain would look at the uninitialized
   variable warnings. */
You can disable it with -fdisable-rtl-init-regs. I am not sure whether the
above comment is just overly cautious today ... maybe we should have a
target hook that controls whether the pass runs.
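
For anyone who wants to try this, here is a minimal standalone sketch of the
self-initialization idiom, assuming a compiler with the GNU vector
extensions. The names v2df, undefined_pd and fill_pd are made up for
illustration and are not part of any GCC header. The comment at the end
shows one way to compare the generated assembly with and without the pass;
what you actually see will depend on the GCC version and target.

```c
/* Illustrative sketch only: v2df, undefined_pd and fill_pd are
   hypothetical names, not part of any GCC header.  Requires the GNU
   vector extensions (GCC or Clang).  */

typedef double v2df __attribute__ ((__vector_size__ (16)));

/* Self-initialization idiom from the thread.  GCC does not warn about
   it by default, but the value is indeterminate and must not be
   relied upon.  */
static inline v2df
undefined_pd (void)
{
  v2df v = v;
  return v;
}

/* Typical intended use: start from an "undefined" vector and
   overwrite every lane before anything reads it.  */
v2df
fill_pd (double lo, double hi)
{
  v2df v = undefined_pd ();
  v[0] = lo;
  v[1] = hi;
  return v;
}

/* To look for the zeroing insn, compare the assembly from
     gcc -O2 -S file.c
     gcc -O2 -S -fdisable-rtl-init-regs file.c
   for a caller that passes undefined_pd () straight to an external
   function, as f () does in the quoted test case.  */
```

In fill_pd every lane is overwritten before use, so no zeroing should be
needed there; the interesting case is the one where the undefined value
escapes unchanged, as in the quoted f ().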
Richard.