This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Add missing avx512fintrin.h intrinsics (PR target/89602)
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 8 Mar 2019 16:45:53 +0800
- Subject: Re: [PATCH] Add missing avx512fintrin.h intrinsics (PR target/89602)
- References: <20190306234925.GQ7611@tucnak> <CAFULd4a+U3Vshb9a5OsssdKpno+GpAvr0f0SpB3R4SxbMNa6fw@mail.gmail.com> <20190307080945.GT7611@tucnak>
On Thu, Mar 7, 2019 at 4:09 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Mar 07, 2019 at 08:11:53AM +0100, Uros Bizjak wrote:
> > > +(define_insn "*avx512f_load<mode>_mask"
> > > + [(set (match_operand:<ssevecmode> 0 "register_operand" "=v")
> > > + (vec_merge:<ssevecmode>
> > > + (vec_merge:<ssevecmode>
> > > + (vec_duplicate:<ssevecmode>
> > > + (match_operand:MODEF 1 "memory_operand" "m"))
> > > + (match_operand:<ssevecmode> 2 "nonimm_or_0_operand" "0C")
> > > + (match_operand:QI 3 "nonmemory_operand" "Yk"))
> >
> > Is there a reason to have nonmemory_operand predicate here instead of
> > register_operand?
>
> Thanks for catching that up, that was from my earlier attempt to have
> Yk,n constraints and deal with that during output. For store it was
> possible, for others there were some cases it couldn't handle but further
> testing revealed that the combiner already handles most of the constant
> mask cases right.
>
> Here is updated patch, I've changed this in two spots. It even improves the
> constant 1 case (the only one that is still not optimized as much as it
> should):
> f4:
> - movzbl .LC0(%rip), %eax
> + movl $1, %eax
> kmovw %eax, %k1
> vmovsd (%rsi), %xmm0{%k1}{z}
> ret
> Tested so far with make check-gcc RUNTESTFLAGS=i386.exp=avx512f-vmovs*.c
> and compiling/eyeballing differences on the short testcase I've posted
> in the description with also the u, -> 1, and u, -> 0, changes, appart
> from the above f4 no differences.
>
> Ok for trunk if it passes another full bootstrap/regtest?
>
> 2019-03-07 Jakub Jelinek <jakub@redhat.com>
>
> PR target/89602
> * config/i386/sse.md (avx512f_mov<ssescalarmodelower>_mask,
> *avx512f_load<mode>_mask, avx512f_store<mode>_mask): New define_insns.
> (avx512f_load<mode>_mask): New define_expand.
> * config/i386/i386-builtin.def (__builtin_ia32_loadsd_mask,
> __builtin_ia32_loadss_mask, __builtin_ia32_storesd_mask,
> __builtin_ia32_storess_mask, __builtin_ia32_movesd_mask,
> __builtin_ia32_movess_mask): New builtins.
> * config/i386/avx512fintrin.h (_mm_mask_load_ss, _mm_maskz_load_ss,
> _mm_mask_load_sd, _mm_maskz_load_sd, _mm_mask_move_ss,
> _mm_maskz_move_ss, _mm_mask_move_sd, _mm_maskz_move_sd,
> _mm_mask_store_ss, _mm_mask_store_sd): New intrinsics.
>
This caused:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89630
This looks very strange since this patch only touched backend.
--
H.J.