This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH, i386]: Separate operations with mask registers from operations with general registers
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Cc: Andrew Senkevich <andrew dot n dot senkevich at gmail dot com>, Kirill Yukhin <kirill dot yukhin at gmail dot com>
- Date: Mon, 28 Nov 2016 17:56:38 +0100
- Subject: [PATCH, i386]: Separate operations with mask registers from operations with general registers
- Authentication-results: sourceware.org; auth=none
On Fri, Nov 25, 2016 at 10:51 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Attached patch tries to fix the mess with operations where register
> allocator is free to use either mask or general registers. There are
> plenty of problems with this approach:
> a) missing operations wth general registers
> - kxnor operation with general register does not exist
> - kandn operation with general registers exists for TARGET_BMI only
>
> b) register allocation problems
> - DImode operations with 64bit general registers do not exist on 32bit target
> - register allocator is free to allocate operation with either
> register set, resulting in costly moves between general and mask
> register sets
>
> Mask operations can be generated in a consistent way using
> corresponding built-ins. To prevent RA problems, we have to separate
> mask ops from general ops - this way we always select operation with
> well defined register set.
>
> There is special problem with 64bit mask ops on 32bit targets:
> defining these operations as a named pattern, where only mask
> registers are available, will result in the situation where mask
> operation will be used for 64bit values living in a pair of general
> registers. RA will obviously generate many inter-regset moves in this
> situation.
>
> Another problem in point a) above is dependence of general-reg
> operations on various TARGET_AVX* flags. This can be seen with kxnor
> pattern, which has to be artificially enabled during register
> allocation just to split non-masked operation back to sequence of
> general operations. Similar situation arises with kandn on non-BMI
> targets.
>
> IMO the mentioned interferences do not warrant mixing of mask and
> general operations. I suspect here will be no end of "the value is
> moved between integer and mask registers unnecessary" type of bugs
> (as was the situation on 32bit targets in the past, where DImode
> values were moved through MMX registers). Users can and should use
> relevant builtins when operating with mask registers (n.b.: generic
> logic operations can still be used; RA will move values between
> regsets - but in a *consistent way*). FWIW, the few folding
> opportunities with mask builtins can be implemented in a
> target-folding function.
>
> The attached patch also paves the way for implementation of new
> builtins in patch [1] that are otherwise nearly to impossible to
> implement.
>
> 2016-11-25 Uros Bizjak <ubizjak@gmail.com>
>
> * config/i386/i386.md (UNSPEC_KMASKOP): New.
> (UNSPEC_KMOV): Remove.
> (kmovw): Expand to plain HImode move.
> (k<any_logic:code><mode>): Rename from *k<logic><mode>. Use
> register_operand predicates. Tag pattern with UNSPEC_KMASKOP.
> Remove corresponding clobber-removing splitter.
> (*anddi_1): Remove mask register alternatives.
> (*andsi_1): Ditto.
> (*andhi_1): Ditto.
> (*andqi_1): Ditto.
> (*<any_or:code><mode>_1): Ditto.
> (*<any_or:code>qi_1): Ditto.
> (kandn<mode>): Use SWI1248_AVX512BW mode iterator. Remove
> general register alternatives. Tag pattern with UNSPEC_KMASKOP.
> Remove corresponding splitter to operation with general registers.
> (*andn<SWI38:mode>): Rename from *bmi_andn_<mode>.
> (*andn<SWI12:mode>): New pattern.
> (*kxnor<mode>): Remove general register alternatives. Tag pattern
> with UNSPEC_KMASKOP. Remove corresponding splitter to operation
> with general registers.
> (knot<mode>): New insn pattern.
> (*one_cmpl<mode>2_1): Remove mask register alternatives.
> (one_cmplqi2_1): Ditto.
> (*k<any_lshift:code><mode>): Rename from *k<mshift><mode>3.
> Tag pattern with UNSPEC_KMASKOP. Add mode attribute.
> * config/i386/predicates.md (mask_reg_operand): Remove predicate.
> * config/i386/sse.md (vec_unpacks_hi_hi): Update pattern
> to generate kmaskop shift.
> (vec_unpacks_hi_<mode>): Ditto.
> * config/i386/i386-builtin.def (__builtin_ia32_kandhi):
> Use CODE_FOR_kandhi.
> (__builtin_ia32_knothi): Use CODE_FOR_knothi.
> (__builtin_ia32_korhi): Use CODE_FOR_kiorhi.
> (__builtin_ia32_kxorhi): Use CODE_FOR_kxorhi.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu
> {,-m32}. There are a couple fo trivial scan-asm errors due to a rename
> and a missing kmov insn, where RA choose much more optimal movw
> <imm>,<mem> insn instead.
Now committed to mainline with additional testsuite patch:
2016-11-28 Uros Bizjak <ubizjak@gmail.com>
* gcc.target/i386/bmi-andn-1a.c (dg-final): Update scan string.
* gcc.target/i386/bmi-andn-2a.c (dg-final): Ditto.
I'll fill gcc.target/i386/avx512f-kmovw-1.c failure in a follow-up patch.
Uros.
Index: gcc.target/i386/bmi-andn-1a.c
===================================================================
--- gcc.target/i386/bmi-andn-1a.c (revision 242892)
+++ gcc.target/i386/bmi-andn-1a.c (working copy)
@@ -3,4 +3,4 @@
#include "bmi-andn-1.c"
-/* { dg-final { scan-assembler-times "bmi_andn_di" 1 } } */
+/* { dg-final { scan-assembler-times "andndi" 1 } } */
Index: gcc.target/i386/bmi-andn-2a.c
===================================================================
--- gcc.target/i386/bmi-andn-2a.c (revision 242892)
+++ gcc.target/i386/bmi-andn-2a.c (working copy)
@@ -3,4 +3,4 @@
#include "bmi-andn-2.c"
-/* { dg-final { scan-assembler-times "bmi_andn_si" 1 } } */
+/* { dg-final { scan-assembler-times "andnsi" 1 } } */