[PATCH] x86: Disable SSE, AVX and AVX512 during CPUID check
Uros Bizjak
ubizjak@gmail.com
Sat Aug 22 17:11:04 GMT 2020
On Sat, Aug 22, 2020 at 6:27 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Fri, Aug 21, 2020 at 9:45 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Fri, Aug 21, 2020 at 9:35 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Fri, Aug 21, 2020 at 9:29 AM Hongtao Liu <crazylht@gmail.com> wrote:
> > > >
> > > > On Fri, Aug 21, 2020 at 11:50 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > > >
> > > > > On Fri, Aug 21, 2020 at 5:41 PM Hongtao Liu <crazylht@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Aug 21, 2020 at 9:15 PM Uros Bizjak <ubizjak@gmail.com> wrote:
> > > > > > >
> > > > > > > > > > gcc/
> > > > > > > > > > PR target/88808
> > > > > > > > > > * config/i386/i386.c (ix86_preferred_reload_class): Allow
> > > > > > > > > > QImode data go into mask registers.
> > > > > > > > > > * config/i386/i386.md: (*movhi_internal): Adjust constraints
> > > > > > > > > > for mask registers.
> > > > > > > > > > (*movqi_internal): Ditto.
> > > > > > > > > > (*anddi_1): Support mask register operations
> > > > > > > > > > (*and<mode>_1): Ditto.
> > > > > > > > > > (*andqi_1): Ditto.
> > > > > > > > > > (*andn<mode>_1): Ditto.
> > > > > > > > > > (*<code><mode>_1): Ditto.
> > > > > > > > > > (*<code>qi_1): Ditto.
> > > > > > > > > > (*one_cmpl<mode>2_1): Ditto.
> > > > > > > > > > (*one_cmplsi2_1_zext): Ditto.
> > > > > > > > > > (*one_cmplqi2_1): Ditto.
> > > > > > > > > > (define_peephole2): Move constant 0/-1 directly into mask
> > > > > > > > > > registers.
> > > > > > > > > > * config/i386/predicates.md (mask_reg_operand): New predicate.
> > > > > > > > > > * config/i386/sse.md (define_split): Add post-reload splitters
> > > > > > > > > > that would convert "generic" patterns to mask patterns.
> > > > > > > > > > (*knotsi_1_zext): New define_insn.
> > > > > > > > > >
> > > > > > > > > > gcc/testsuite/
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-1.c: New test.
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-2.c: New test.
> > > > > > > > > > * gcc.target/i386/bitwise_mask_op-3.c: New test.
> > > > > > > > > > * gcc.target/i386/avx512bw-pr88465.c: New testcase.
> > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-1.c: Adjust testcase.
> > > > > > > > > > * gcc.target/i386/avx512bw-kunpckwd-3.c: Ditto.
> > > > > > > > > > * gcc.target/i386/avx512dq-kmovb-5.c: Ditto.
> > > > > > > > > > * gcc.target/i386/avx512f-kmovw-5.c: Ditto.
> > > > > > > > >
> > > > > > > > > A little nit, please put new splitters after the instruction pattern.
> > > > > > > > >
> > > > > > > > > OK for the whole patch set with the above change,
> > > > > > > > >
> > > > > > > >
> > > > > > > > Yes, thanks for the review.
> > > > > > >
> > > > > > > Please note that your patch introduces several testsuite fails with -m32:
> > > > > > >
> > > > > > > gcc -O2 -mavx512bitalg -mavx512bw -m32 -g avx512bitalg-vpopcntb-1.c
> > > > > > >
> > > > > >
> > > > > > I can't reproduce this failure.
> > > > >
> > > > > Because you are running it on AVX512 enabled target.
> > > > >
> > > > > > > Program received signal SIGILL, Illegal instruction.
> > > > > > > 0x080490ac in __get_cpuid_count (__edx=<synthetic pointer>,
> > > > > > > __ecx=<synthetic pointer>, __ebx=<synthetic pointer>, __eax=<synthetic
> > > > > > > pointer>,
> > > > > > > __subleaf=0, __leaf=7) at /hdd/uros/gcc-build-fast/gcc/include/cpuid.h:316
> > > > > > > 316 __cpuid_count (__leaf, __subleaf, *__eax, *__ebx, *__ecx, *__edx);
> > > > > > >
> > > > > > > 0x080490a3 <+51>: cpuid
> > > > > > > 0x080490a5 <+53>: mov $0x1,%eax
> > > > > > > 0x080490aa <+58>: mov %ecx,%esi
> > > > > > > => 0x080490ac <+60>: kmovd %ebx,%k0
> > > > > > > 0x080490b0 <+64>: mov %edi,%ecx
> > > > > > > 0x080490b2 <+66>: mov %edi,%ebx
> > > > > > >
> > > > > > > kmov insn is generated for __cpuid_count function, where the binary
> > > > > > > determines, if the new instructions are supported. The binary will
> > > > > > > crash in the detection code if the processor lacks AVX512
> > > > > > > instructions.
> > > > > > >
> > > > > >
> > > > > > IMHO, the testcase shouldn't be run on processors without AVX512BW.
> > > > >
> > > > > No, it could run, because it checks for AVX512BW at runtime.
> > > > >
> > > >
> > > > Got it.
> > > >
> > > > > > Because in avx512bitalg-vpopcntb-1.c, there's /*
> > > > > > dg-require-effective-target avx512bw } */.
> > > > >
> > > > > This is to check the toolchain for support.
> > > > >
> > > > > > what's the version of your assembler?
> > > > >
> > > > > GNU assembler version 2.34-4.fc32
> > > > >
> > > >
> > > > If assembler supports avx512bw, but processor not, the test would pass
> > > > condition `dg-require-effective-target avx512bw` and be runned.
> > > > then crashed for illegal instruction.
> > > >
> > > > > Please add something like
> > > > > X86_TUNE_INTER_UNIT_MOVES_FROM_MASK/X86_TUNE_INTER_UNIT_MOVES_TO_MASK
> > > > > and enable them only for m_CORE_AVX512 (or perhaps m_INTEL).
> > > > >
> > > > > Handle this in inline_secondary_memory_needed to reject direct moves
> > > > > for all other targets. This should disable direct moves for generic
> > > > > targets.
> > > > >
> > > >
> > > > Yes, I'll add it.
> > > >
> > >
> > >
> > > (define_insn "*movsi_internal"
> > > [(set (match_operand:SI 0 "nonimmediate_operand"
> > > "=r,m ,*y,*y,?*y,?m,?r,?*y,*v,*v,*v,m ,?r,?*v,*k,*k ,*rm,*k")
> > > (match_operand:SI 1 "general_operand"
> > > "g ,re,C ,*y,m ,*y,*y,r ,C ,*v,m ,*v,*v,r ,*r,*km,*k ,CBC"))]
> > > "!(MEM_P (operands[0]) && MEM_P (operands[1]))"
> > > ...
> > > [(set (attr "isa")
> > > (cond [(eq_attr "alternative" "12,13")
> > > (const_string "sse2")
> > > ]
> > > (const_string "*")))
> > >
> > > is wrong. mask register alternatives should be marked with avx512f.
> > > Please fix it. Other integer move patterns may have the same issue.
> > > Once these are fixed,
> > >
> > > diff --git a/gcc/testsuite/gcc.target/i386/avx512-check.h
> > > b/gcc/testsuite/gcc.target/i386/avx512-check.h
> > > index 0a377dba1d5..576e9b390c6 100644
> > > --- a/gcc/testsuite/gcc.target/i386/avx512-check.h
> > > +++ b/gcc/testsuite/gcc.target/i386/avx512-check.h
> > > @@ -25,6 +25,7 @@ do_test (void)
> > > }
> > > #endif
> > >
> > > +__attribute__((target ("no-avx512f")))
> > > static int
> > > check_osxsave (void)
> > > {
> > > @@ -34,6 +35,7 @@ check_osxsave (void)
> > > return (ecx & bit_OSXSAVE) != 0;
> > > }
> > >
> > > +__attribute__((target ("no-avx512f")))
> > > int
> > > main ()
> > > {
> > >
> > > should work.
> > >
> >
> > Like this. You need to check all integer patterns with mskmov and msklog.
>
> Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> AVX512 during CPUID check to avoid vector and mask register operations.
-mgeneral-regs-only ?
Uros.
More information about the Gcc-patches
mailing list