[PATCH] Disparage slightly the mask register alternative for bitwise operations. [PR target/101142]

Uros Bizjak ubizjak@gmail.com
Wed Jun 23 07:58:59 GMT 2021


On Mon, Jun 21, 2021 at 10:08 AM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Mon, Jun 21, 2021 at 3:28 PM Uros Bizjak via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, Jun 21, 2021 at 6:56 AM liuhongt <hongtao.liu@intel.com> wrote:
> > >
> > > The avx512 supports bitwise operations with mask registers, but the
> > > throughput of those instructions is much lower than that of the
> > > corresponding gpr version, so we would additionally disparages
> > > slightly the mask register alternative for bitwise operations in the
> > > LRA.
> >
> > This was the reason for UNSPEC tagged instructions with mask
> > registers, used mainly for builtins.
> >
> > Also, care should be taken if we want mask registers to be used under
> > GPR pressure, or it is better to spill GPR registers. In the past,
> > DImode values caused a lot of problems with MMX registers on x86-64,
> > but we were able to hand-tune the allocator in the way you propose.
> >
> > Let's try the proposed approach to see what happens.
> >
> > > Also when allocano cost of GENERAL_REGS is same as MASK_REGS, allocate
> > > MASK_REGS first since it has already been disparaged.
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/101142
> > >         * config/i386/i386.md: (*anddi_1): Disparage slightly the mask
> > >         register alternative.
> > >         (*and<mode>_1): Ditto.
> > >         (*andqi_1): Ditto.
> > >         (*andn<mode>_1): Ditto.
> > >         (*<code><mode>_1): Ditto.
> > >         (*<code>qi_1): Ditto.
> > >         (*one_cmpl<mode>2_1): Ditto.
> > >         (*one_cmplsi2_1_zext): Ditto.
> > >         (*one_cmplqi2_1): Ditto.
> > >         * config/i386/i386.c (x86_order_regs_for_local_alloc): Change
> > >         the order of mask registers to be before general registers.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/101142
> > >         * gcc.target/i386/spill_to_mask-1.c: Adjust testcase.
> > >         * gcc.target/i386/spill_to_mask-2.c: Adjust testcase.
> > >         * gcc.target/i386/spill_to_mask-3.c: Adjust testcase.
> > >         * gcc.target/i386/spill_to_mask-4.c: Adjust testcase.
> >
> > OK with a comment addition, see inline.
> >
> > Thanks,
> > Uros.
> >
> > > ---
> > >  gcc/config/i386/i386.c                        |  8 +-
> > >  gcc/config/i386/i386.md                       | 20 ++---
> > >  .../gcc.target/i386/spill_to_mask-1.c         | 89 +++++++++++++------
> > >  .../gcc.target/i386/spill_to_mask-2.c         | 11 ++-
> > >  .../gcc.target/i386/spill_to_mask-3.c         | 11 ++-
> > >  .../gcc.target/i386/spill_to_mask-4.c         | 11 ++-
> > >  6 files changed, 91 insertions(+), 59 deletions(-)
> > >
> > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > > index a61255857ff..a651853ca3b 100644
> > > --- a/gcc/config/i386/i386.c
> > > +++ b/gcc/config/i386/i386.c
> > > @@ -20463,6 +20463,10 @@ x86_order_regs_for_local_alloc (void)
> > >     int pos = 0;
> > >     int i;
> > >
> > > +   /* Mask register.  */
> > > +   for (i = FIRST_MASK_REG; i <= LAST_MASK_REG; i++)
> > > +     reg_alloc_order [pos++] = i;
> >
> > Please add a comment why mask registers should come first.
> Thanks for the review, this is the patch i'm check in.

This patch again caused unwanted mask instructions with -m32 in cpuid
check code, e.g.:

Running target unix/-m32
FAIL: gcc.target/i386/avx512bw-pr70329-1.c execution test
FAIL: gcc.target/i386/pr96814.c execution test

Debugging pr96814 failure:

  0x0804921d <+66>:    mov    %edx,%ecx
  0x0804921f <+68>:    cpuid
=> 0x08049221 <+70>:    kmovd  %edx,%k0
  0x08049225 <+74>:    mov    %eax,-0x8(%ebp)
  0x08049228 <+77>:    mov    %ebx,-0xc(%ebp)

It looks to me that putting mask registers in front of GPR is looking
for trouble, since apparently disparaging mask registers does not
offset register preference of the new order.

Uros.


More information about the Gcc-patches mailing list