[PATCH 4/4][PR target/88808] Enable bitwise operator for AVX512 masks.

Uros Bizjak ubizjak@gmail.com
Mon Aug 17 10:08:13 GMT 2020

On Fri, Aug 14, 2020 at 10:26 AM Hongtao Liu <crazylht@gmail.com> wrote:
> Enable operator or/xor/and/andn/not for mask register, kxnor is not
> enabled since there's no corresponding instruction for general
> registers.
> gcc/
>         PR target/88808
>         * config/i386/i386.md: (*movsi_internal): Adjust constraints
>         for mask registers.
>         (*movhi_internal): Ditto.
>         (*movqi_internal): Ditto.
>         (*anddi_1): Support mask register operations
>         (*and<mode>_1): Ditto.
>         (*andqi_1): Ditto.
>         (*andn<mode>_1): Ditto.
>         (*<code><mode>_1): Ditto.
>         (*<code>qi_1): Ditto.
>         (*one_cmpl<mode>2_1): Ditto.
>         (*one_cmplsi2_1_zext): Ditto.
>         (*one_cmplqi2_1): Ditto.
> gcc/testsuite/
>         * gcc.target/i386/bitwise_mask_op-1.c: New test.
>         * gcc.target/i386/bitwise_mask_op-2.c: New test.
>         * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust testcase.
>         * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
>         * gcc.target/i386/avx512dq-kmovb-5.c: Ditto.
>         * gcc.target/i386/avx512f-kmovw-5.c: Ditto.

index 74d207c3711..e8ad79d1b0a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2294,7 +2294,7 @@

 (define_insn "*movsi_internal"
   [(set (match_operand:SI 0 "nonimmediate_operand"
-    "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k")
+    "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,k")
     (match_operand:SI 1 "general_operand"
     "g ,re,C ,*y,m  ,*y,*y,r  ,C ,*v,m ,*v,*v,r  ,*r,*km,*k ,CBC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))"

I'd rather see *k everywhere, including in the *movqi_internal and
*movhi_internal patterns. The "*" means that the allocator won't
allocate a mask register by default, but it will still be used to
optimize moves. With the above change, you risk that, under integer
register pressure, the register allocator will allocate zero to a mask
register, and later "optimize" the move with a direct maskreg-intreg
move.

The current strategy is that only general registers get allocated for
integer modes. Let's keep it this way for now.

Otherwise, the patchset LGTM, but please test the suggested changes and repost.

BTW: Do you plan to remove mask operations from sse.md? ATM, they are
used to distinguish mask operations generated from builtins from
generic operations, so I'd like to keep them for a while. The drawback
is that they are not combined with other operations, but at the end
of the day, this is what the programmer asked for by using builtins.
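
To make the builtin/generic distinction concrete, here is a minimal
sketch of what I mean by "generic operations": plain C operators on a
16-bit mask value, as opposed to the mask builtins from <immintrin.h>.
The mask16 typedef and the helper names are made up for illustration
(mask16 stands in for __mmask16, which GCC defines as unsigned short).
Whether any of these end up in a mask register depends on the options
and on register allocation; the sketch only shows the semantics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __mmask16 (an unsigned 16-bit integer). */
typedef uint16_t mask16;

/* Generic bitwise operations, written with ordinary C operators.
   The casts truncate the result of integer promotion back to 16 bits. */
mask16 mask_and (mask16 a, mask16 b)  { return (mask16) (a & b); }
mask16 mask_or (mask16 a, mask16 b)   { return (mask16) (a | b); }
mask16 mask_xor (mask16 a, mask16 b)  { return (mask16) (a ^ b); }

/* andn follows the (and (not a) b) form: the FIRST operand is negated. */
mask16 mask_andn (mask16 a, mask16 b) { return (mask16) (~a & b); }
mask16 mask_not (mask16 a)            { return (mask16) ~a; }

/* xnor has no single general-register instruction, so in generic code
   it is just a composition of xor and not. */
mask16 mask_xnor (mask16 a, mask16 b) { return (mask16) ~(a ^ b); }

/* Small self-check of the semantics above. */
void mask_self_check (void)
{
  mask16 a = 0xF0F0, b = 0xFF00;

  assert (mask_and (a, b) == 0xF000);
  assert (mask_andn (a, b) == 0x0F00);
  assert (mask_xnor (a, b) == 0xF00F);
}
```

Code written with the mask builtins instead goes through the dedicated
patterns in sse.md, which is why it is not combined with surrounding
operations.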


