[PATCH] x86: Only use general-purpose registers during CPUID check
Sun Aug 23 16:02:57 GMT 2020
On Sun, Aug 23, 2020 at 5:23 PM H.J. Lu <email@example.com> wrote:
> On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote:
> > On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu <firstname.lastname@example.org> wrote:
> > > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, AVX and
> > > > > AVX512 during CPUID check to avoid vector and mask register operations.
> > > >
> > > > -mgeneral-regs-only ?
> > > >
> > >
> > > Here is a patch to add target("general-regs-only") function
> > > attribute and use it for CPUID check. OK for master if there
> > > are no regressions?
> > Please test it first, then ask for an approval.
> > Please submit the general-regs-only part as an independent patch. (I
> > think this is the option linux should use for compilation).
> > OTOH, wrapping CPUID check in a target attribute is a bad idea. We
> > should disable spills to mask registers for generic targets by either
> > raising costs of moves between general and mask registers and/or (as
> > suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and
> > use it in secondary_memory_needed to prevent inter register unit
> > spills.
> > So, compiling with -mavx512bw would NOT enable spills by default,
> > where compiling with -march=skylake-avx512 (or using equivalent
> > -mtune) would. This is IMO the least surprising approach, and would
> > avoid changing sources (as you now have to do for several testcases).
> We have 2 orthogonal issues here:
> 1. When mask register spill should be enabled.
> 2. CPUID check should be done with general registers only.
> As shown in GCC testcases, CPUID check may be done with arbitrary ISAs
> or -march/-mtune options enabled. We should either
> 1. Enable only general registers for CPUID check. Or
> 2. Issue an error for CPUID check if non-general registers are used.
We should follow the same approach as with SSE2, where DI/SImode
spills to XMM registers were effectively disabled for a generic
target. So, unless the tuning target is also specified, spills to mask
registers should not be generated. It was my oversight to approve the
patch that enables spills for a generic target, and without the tuning
flag, the patch will be reverted.
Now, we have -mgeneral-regs-only functionality in place, so if a
package wants to enable spills, the correct -mtune (ro -march that
implies -mtune) should be used, and it is expected that the detection
code is amended with general-regs-only pragmas.
Speaking of pragmas, these should be added outside cpuid.h, like:
#pragma GCC push_options
#pragma GCC target("general-regs-only")
void cpuid_check ()
#pragma GCC pop_options
Nowadays, -march=native is mostly used outside generic target
compilations, so for relevant avx512 targets, we still generate spills
to mask regs. In future, we can review the setting of the tuning flag
for a generic target in the same way as with SSE2 inter-reg moves.
More information about the Gcc-patches