[PATCH] i386: Hopefully last set of -mavx512vl -mno-avx512bw fixes [PR99321]

Fri Mar 12 16:11:25 GMT 2021

On Fri, Mar 12, 2021 at 4:28 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Mar 12, 2021 at 03:34:09PM +0100, Uros Bizjak wrote:
> > >  (define_insn "*avx2_pmaddwd"
> > > -  [(set (match_operand:V8SI 0 "register_operand" "=x,v")
> > > +  [(set (match_operand:V8SI 0 "register_operand" "=Yw")
> >
> > I'm not sure contraction like this is correct. The prolbem is with vex
> > vs. evex prefix, used in length calculation. When XMM16+ is used in
> > the insn, and evex prefix is used, the unpatched version returns vex
> > for alternative 0 due to (x,x,x) and evex for alternative 1, since one
> > of registers satisfies only "v".
> >
> > Patched version now always emits vex, which is wrong for XMM16+ registers.
>
> That is true, but we have many thousands of those cases, where we just
> use vex or maybe_vex or maybe_evex prefix with v or Yv or Yw or YW etc.
> Adding extra alternatives for that would mean changing pretty much all of
> sse.md.
> I think it is much easier to imply length_evex when prefix is vex/maybe_vex when
> any of the operands is EXT_REX_SSE_REG_P, subreg thereof,
> MASK_REG_P, or subreg thereof.
>
> The default definition of the "length" attribute has:
>          (ior (eq_attr "prefix" "evex")
>               (and (ior (eq_attr "prefix" "maybe_evex")
>                         (eq_attr "prefix" "maybe_vex"))
>                    (match_test "TARGET_AVX512F")))
>            (plus (attr "length_evex")
>                  (plus (attr "length_immediate")
>                        (plus (attr "modrm")
>                              (attr "length_address"))))
>          (ior (eq_attr "prefix" "vex")
>               (and (ior (eq_attr "prefix" "maybe_vex")
>                         (eq_attr "prefix" "maybe_evex"))
>                    (match_test "TARGET_AVX")))
>            (plus (attr "length_vex")
>                  (plus (attr "length_immediate")
>                        (plus (attr "modrm")
>                              (attr "length_address"))))]
> That is just extremely rough guess, assuming all insns with
> evex/maybe_evex/maybe_vex prefix will be EVEX encoded when TARGET_AVX512F
> is clearly wrong, that is only true for instructions that don't have
> a VEX counterpart (e.g. if they have mnemonics that is EVEX only), otherwise
> it depends on whether either the operands will be 64-byte (we can perhaps
> use for that the mode attribute at least by default) or whether any of the
> operands is %[xy]mm16+ or %k*.
> So (but I think this must be GCC 12 material) I'd say we should throw away
> maybe_evex prefix altogether (replace with maybe_vex or vex),
> use evex for the cases where it has EVEX only mnemonics and otherwise
> call some function to look at the operands and mode attribute.

Yes, I'm aware that while great care was taken to handle vex
attribute, evex handling is quite sloppy, and I fully agree with your
findings. I have noticed the issue when I tried to utilize newly
introduced YW constraint some more, e.g.:

(define_insn "*vec_extract<mode>"
  [(set (match_operand:<ssescalarmode> 0 "register_sse4nonimm_operand"
"=r,m,r,m")
    (vec_select:<ssescalarmode>
      (match_operand:PEXTR_MODE12 1 "register_operand" "x,x,v,v")
      (parallel
        [(match_operand:SI 2 "const_0_to_<ssescalarnummask>_operand")])))]
  "TARGET_SSE2"
  "@
   %vpextr<ssemodesuffix>\t{%2, %1, %k0|%k0, %1, %2}
   %vpextr<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}
   vpextr<ssemodesuffix>\t{%2, %1, %k0|%k0, %1, %2}
   vpextr<ssemodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "*,sse4,avx512bw,avx512bw")

where alternatives 0/2 and 1/3 can now be merged together using YW
register constraint (plus a couple of other places, just grep for
avx512bw isa attribute). I was not sure if using maybe_vex is the
correct selection for the prefix attribute in this case.

Uros.

> The maybe_vex decision on TARGET_AVX vs. !TARGET_AVX is on the other side
> correct, we have all those %v etc. to make sure we emit VEX/EVEX encoded
> instructions with -mavx rather than the old SSE* encodings.
>
> I think some years ago I've used scripts and -dP etc. to verify the length
> computations for vex, for evex it is known for years that it is inaccurate
> unfortunately.
>
> Maybe it is conservatively correct?  length_evex is hardcoded to 5,
> while length_vex is 3 + 1 or 2 + 1.
>
> In that case, perhaps I should for now change the vex on that particular
> instruction to maybe_vex.  But I see still several hundreds of instructions
> with vex prefix attribute and v etc. in constraints.
> So, if we want a conservatively correct fix now, I'd say we should treat
> even "vex" in the "length_evex" condition, so (untested):
>
> 2021-03-12  Jakub Jelinek  <jakub@redhat.com>
>
>         * config/i386/i386.md (length): For TARGET_AVX512F treat
>         also vex prefix conservatively as length_evex.
>
> --- gcc/config/i386/i386.md.jj  2021-03-05 21:51:33.675350463 +0100
> +++ gcc/config/i386/i386.md     2021-03-12 16:24:49.302919436 +0100
> @@ -686,6 +686,7 @@
>                        (attr "length_address")))
>          (ior (eq_attr "prefix" "evex")
>               (and (ior (eq_attr "prefix" "maybe_evex")
> +                       (eq_attr "prefix" "vex")
>                         (eq_attr "prefix" "maybe_vex"))
>                    (match_test "TARGET_AVX512F")))
>            (plus (attr "length_evex")
>
> and if we eventually do something more accurate, perhaps vex could
> stand for do the operand >= %xmm16 or %k? check (VEX can't ever encode
> those), but perhaps don't do the mode attribute check?
>
>         Jakub
>