CPUID Patch for IDT Winchip

Mon May 20 15:51:00 GMT 2019

On Mon, May 20, 2019 at 4:24 PM tedheadster <tedheadster@gmail.com> wrote:
>
> Uros,
>
> On Mon, May 20, 2019 at 3:29 AM Uros Bizjak <ubizjak@gmail.com> wrote:
> > The CPUID documentation from Intel Architectures Developer's manual is
> > crystal clear on the implementation:
> >
> > --quote--
> > CPUID—CPU Identification
> >
> > Returns processor identification and feature
> > information to the EAX, EBX, ECX, and EDX
> > registers, as determined by input entered in
> > EAX (in some cases, ECX as well).
> > --quote--
>
> That is for the current versions of the documentation. Earlier versions are different (see below).
>
> > So, I dont see why we should implement the workaround for an
> > implementation bug in a widely used header for a relatively obscure
> > target. Please also note that there are other places in the compiler
> > that assume that the instruction returns values as specified above,
> > not to mention other applications that assumes the same.
> >
> > I think proposed target specific bugfix should be implemented
> > elsewhere, not in the cpuid.h header that assumes the specified
> > functionality.
>
> I did some research and found the Intel CPUID document that existed when the WinChip was manufactured (dated December 1996):
>
> http://bitsavers.trailing-edge.com/components/intel/appNotes/AP-485_Intel_Processor_Identification_and_the_CPUID_Instruction_Dec96.pdf
>
> It is ALSO silent about the contents of %ebx and %ecx after a cpuid() with %eax = 1. The Winchip met the Intel _published_ standard that existed at the time. Intel changed the standard a long time afterwards.

Hm, I see.

> This means that all the following processor manufacturers (and example cpus) of that era might demonstrate this behavior:
>
> AMD (Am486 DX2/DX4, Am5x86, K5)
> Centaur (WinChip C6, Winchip 2)
> NexGen (Nx586)
> Cyrix (Cx486DX2/DX4, 5x86, 6x86, 6x86MX)
>
> That is a lot of processors. Admittedly they are old, but they are still officially supported by GNU/Linux.
>
> Since the patch only introduces four (4) more opcodes, it is not computationally expensive. Here is the diff:
>
>   test   $0x200000,%eax
>   je     0xb7e214c0 <rdrand+128>
> - xor    %eax,%eax
> + xor    %edi,%edi
> + mov    %edi,%eax
>   cpuid
>   test   %eax,%eax
>   je     0xb7e214c0 <rdrand+128>
> + mov    %edi,%ebx
> + mov    %edi,%ecx
> + mov    %edi,%edx
>   mov    $0x1,%eax
>   cpuid

How about the attached patch that enables zeroing only for 32bit
processors, and only for cpuid leaf 1? Usually, __get_cpuid is used
with a constant leaf argument, so we benefit from constant propagation
into inline function. Also, according to the confusing documentation,
it looks to me that zeroing %ecx and %edx regs should be enough.

Uros.
-------------- next part --------------
Index: config/i386/cpuid.h
===================================================================

--- config/i386/cpuid.h	(revision 271384)
+++ config/i386/cpuid.h	(working copy)
@@ -187,6 +187,14 @@
 #define signature_VORTEX_ecx	0x436f5320
 #define signature_VORTEX_edx	0x36387865
 
+/* At least one cpu (Winchip 2) does not set %ebx and %ecx
+   for cpuid leaf 1. Forcibly zero the two registers before
+   calling cpuid as a precaution.  */
+#define __cpuid_compat(level, a, b, c, d)		\
+  __asm__ ("cpuid\n\t"					\
+	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
+	   : "0" (level), "1" (0), "2" (0))
+
 #define __cpuid(level, a, b, c, d)			\
   __asm__ ("cpuid\n\t"					\
 	   : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
@@ -271,7 +279,13 @@
   if (__maxlevel == 0 || __maxlevel < __leaf)
     return 0;
 
-  __cpuid (__leaf, *__eax, *__ebx, *__ecx, *__edx);
+#ifndef __x86_64__
+  if (__leaf == 1)
+    __cpuid_compat (__leaf, *__eax, *__ebx, *__ecx, *__edx);
+  else
+#endif
+    __cpuid (__leaf, *__eax, *__ebx, *__ecx, *__edx);
+
   return 1;
 }