This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH, i386]: Separate operations with mask registers from operations with general registers


On Fri, Nov 25, 2016 at 10:51 AM, Uros Bizjak <ubizjak@gmail.com> wrote:

> Attached patch tries to fix the mess with operations where register
> allocator is free to use either mask or general registers. There are
> plenty of problems with this approach:
> a) missing operations wth general registers
> - kxnor operation with general register does not exist
> - kandn operation with general registers exists for TARGET_BMI only
>
> b) register allocation problems
> - DImode operations with 64bit general registers do not exist on 32bit target
> - register allocator is free to allocate operation with either
> register set, resulting in costly moves between general and mask
> register sets
>
> Mask operations can be generated in a consistent way using
> corresponding built-ins. To prevent RA problems, we have to separate
> mask ops from general ops - this way we always select operation with
> well defined register set.
>
> There is special problem with 64bit mask ops on 32bit targets:
> defining these operations as a named pattern, where only mask
> registers are available, will result in the situation where mask
> operation will be used for 64bit values living in a pair of general
> registers. RA will obviously generate many inter-regset moves in this
> situation.
>
> Another problem in point a) above is dependence of general-reg
> operations on various TARGET_AVX* flags. This can be seen with kxnor
> pattern, which has to be artificially enabled during register
> allocation just to split non-masked operation back to sequence of
> general operations. Similar situation arises with kandn on non-BMI
> targets.
>
> IMO the mentioned interferences do not warrant mixing of mask and
> general operations. I suspect here will be no end of "the value is
> moved between integer and mask registers unnecessary"  type of bugs
> (as was the situation on 32bit targets in the past, where DImode
> values were moved through MMX registers). Users can and should use
> relevant builtins when operating with mask registers (n.b.: generic
> logic operations can still be used; RA will move values between
> regsets -  but in a *consistent way*). FWIW, the few folding
> opportunities with mask builtins can be implemented in a
> target-folding function.
>
> The attached patch also paves the way for implementation of new
> builtins in patch [1] that are otherwise nearly to impossible to
> implement.
>
> 2016-11-25  Uros Bizjak  <ubizjak@gmail.com>
>
>     * config/i386/i386.md (UNSPEC_KMASKOP): New.
>     (UNSPEC_KMOV): Remove.
>     (kmovw): Expand to plain HImode move.
>     (k<any_logic:code><mode>): Rename from *k<logic><mode>. Use
>     register_operand predicates.  Tag pattern with UNSPEC_KMASKOP.
>     Remove corresponding clobber-removing splitter.
>     (*anddi_1): Remove mask register alternatives.
>     (*andsi_1): Ditto.
>     (*andhi_1): Ditto.
>     (*andqi_1): Ditto.
>     (*<any_or:code><mode>_1): Ditto.
>     (*<any_or:code>qi_1): Ditto.
>     (kandn<mode>): Use SWI1248_AVX512BW mode iterator.  Remove
>     general register alternatives.  Tag pattern with UNSPEC_KMASKOP.
>     Remove corresponding splitter to operation with general registers.
>     (*andn<SWI38:mode>): Rename from *bmi_andn_<mode>.
>     (*andn<SWI12:mode>): New pattern.
>     (*kxnor<mode>): Remove general register alternatives.  Tag pattern
>     with UNSPEC_KMASKOP.  Remove corresponding splitter to operation
>     with general registers.
>     (knot<mode>): New insn pattern.
>     (*one_cmpl<mode>2_1): Remove mask register alternatives.
>     (one_cmplqi2_1): Ditto.
>     (*k<any_lshift:code><mode>): Rename from *k<mshift><mode>3.
>     Tag pattern with UNSPEC_KMASKOP. Add mode attribute.
>     * config/i386/predicates.md (mask_reg_operand): Remove predicate.
>     * config/i386/sse.md (vec_unpacks_hi_hi): Update pattern
>     to generate kmaskop shift.
>     (vec_unpacks_hi_<mode>): Ditto.
>     * config/i386/i386-builtin.def (__builtin_ia32_kandhi):
>     Use CODE_FOR_kandhi.
>     (__builtin_ia32_knothi): Use CODE_FOR_knothi.
>     (__builtin_ia32_korhi): Use CODE_FOR_kiorhi.
>     (__builtin_ia32_kxorhi): Use CODE_FOR_kxorhi.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu
> {,-m32}. There are a couple fo trivial scan-asm errors due to a rename
> and a missing kmov insn, where RA choose much more optimal movw
> <imm>,<mem> insn instead.

Now committed to mainline with additional testsuite patch:

2016-11-28  Uros Bizjak  <ubizjak@gmail.com>

    * gcc.target/i386/bmi-andn-1a.c (dg-final): Update scan string.
    * gcc.target/i386/bmi-andn-2a.c (dg-final): Ditto.

I'll fill gcc.target/i386/avx512f-kmovw-1.c failure in a follow-up patch.

Uros.

Index: gcc.target/i386/bmi-andn-1a.c
===================================================================
--- gcc.target/i386/bmi-andn-1a.c       (revision 242892)
+++ gcc.target/i386/bmi-andn-1a.c       (working copy)
@@ -3,4 +3,4 @@

 #include "bmi-andn-1.c"

-/* { dg-final { scan-assembler-times "bmi_andn_di" 1 } } */
+/* { dg-final { scan-assembler-times "andndi" 1 } } */
Index: gcc.target/i386/bmi-andn-2a.c
===================================================================
--- gcc.target/i386/bmi-andn-2a.c       (revision 242892)
+++ gcc.target/i386/bmi-andn-2a.c       (working copy)
@@ -3,4 +3,4 @@

 #include "bmi-andn-2.c"

-/* { dg-final { scan-assembler-times "bmi_andn_si" 1 } } */
+/* { dg-final { scan-assembler-times "andnsi" 1 } } */


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]