Summary: | __builtin_cpu_supports does not (always) detect "sse2" | ||
---|---|---|---|
Product: | gcc | Reporter: | Erich Eckner <gcc> |
Component: | target | Assignee: | Martin Liška <marxin> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | gcc, iam, jakub, marxin, Mayshao-oc, sjames |
Priority: | P3 | ||
Version: | 11.1.0 | ||
Target Milestone: | --- | ||
Host: | i686 | Target: | i686 |
Build: | Known to work: | ||
Known to fail: | Last reconfirmed: | 2021-05-26 00:00:00 | |
Attachments: |
test.c: probe for sse2
cpuid probing |
Thank you for the report. You are right, GCC does not detect features for VIA (Centaur) CPUs: https://github.com/gcc-mirror/gcc/blob/403bb89bd7f4ec03d4dcbdf8668d0187358631a0/gcc/common/config/i386/cpuinfo.h#L925-L926 Well, I can theoretically add it, but we speak about pretty legacy hardware. I don't know if it worth doing? What's your use case Erich? We use this in archlinux32 to detect, if we can install packages, that have sse2 opcodes: If one sets "Architecture = auto" in /etc/pacman.conf, uname only gives "i686" in both cases (this is how archlinux does/did the probing), and then, we probe for sse2 to check, if we really are "i686" or "pentium4" (our nomenclature for "i686" + sse2). This works well on amd and intel cpus, but fails on via cpus. Details about expected features: https://archlinux32.org/architecture/ We can add some cumbersome code which probes for sse2, but I'd really prefer some compiler builtin instead :-) (In reply to Erich Eckner from comment #2) > We use this in archlinux32 to detect, if we can install packages, that have > sse2 opcodes: > > If one sets "Architecture = auto" in /etc/pacman.conf, uname only gives > "i686" in both cases (this is how archlinux does/did the probing), and then, > we probe for sse2 to check, if we really are "i686" or "pentium4" (our > nomenclature for "i686" + sse2). This works well on amd and intel cpus, but > fails on via cpus. > > Details about expected features: https://archlinux32.org/architecture/ I see. > > We can add some cumbersome code which probes for sse2, but I'd really prefer > some compiler builtin instead :-) I can implement it for GCC 12.1 if you want (May 2022 release date). As I don't have access to the VIA machines, can you please paste the content of %eax, %ebx, %ecx and %edx for 'cpuid' instruction? Created attachment 50871 [details]
cpuid probing
Does the attached program yield, what you need? (Sry, I'm quite unfamiliar with asm in gcc)
It gives:
00000001 746e6543 736c7561 48727561
000006d0 00000800 00004181 a7c9bbff
and
0000000a 746e6543 736c7561 48727561
000006fa 00010800 008863a9 afc9fbff
on the two machines, respectively.
(In reply to Erich Eckner from comment #2) > We use this in archlinux32 to detect, if we can install packages, that have > sse2 opcodes: > > If one sets "Architecture = auto" in /etc/pacman.conf, uname only gives > "i686" in both cases (this is how archlinux does/did the probing), and then, > we probe for sse2 to check, if we really are "i686" or "pentium4" (our > nomenclature for "i686" + sse2). This works well on amd and intel cpus, but > fails on via cpus. > > Details about expected features: https://archlinux32.org/architecture/ > > We can add some cumbersome code which probes for sse2, but I'd really prefer > some compiler builtin instead :-) Eh, including cpuid.h and using that to check for SSE2 shouldn't be too hard (or parsing /proc/cpuinfo). (In reply to Richard Biener from comment #5) > (In reply to Erich Eckner from comment #2) > > We use this in archlinux32 to detect, if we can install packages, that have > > sse2 opcodes: > > > > If one sets "Architecture = auto" in /etc/pacman.conf, uname only gives > > "i686" in both cases (this is how archlinux does/did the probing), and then, > > we probe for sse2 to check, if we really are "i686" or "pentium4" (our > > nomenclature for "i686" + sse2). This works well on amd and intel cpus, but > > fails on via cpus. > > > > Details about expected features: https://archlinux32.org/architecture/ > > > > We can add some cumbersome code which probes for sse2, but I'd really prefer > > some compiler builtin instead :-) > > Eh, including cpuid.h and using that to check for SSE2 shouldn't be too hard > (or parsing /proc/cpuinfo). Yeah, the following should work for you: #include <cpuid.h> int main() { unsigned int eax, ebx, ecx, edx; __get_cpuid(0, &eax, &ebx, &ecx, &edx); if (edx & bit_SSE2) __builtin_printf ("has SSE2\n"); } Yes, we could do #include <cpuinfo.h> unsigned int regs[4]; __get_cpuid(0, regs, regs+1, regs+2, regs+3); if (regs[3] & bit_SSE2) { ... } The "cumbersome" in my first comment referred to doing all that by hand, which was the alternative, we saw so far. Ok, I'll switch to that code and stop bothering you :-) Thanks for looking into this! Closing as wontfix. May I ask why was is closed as WONTFIX? Just got hit by this bug in Debian 12, which uses special "sse2-support" package to detect SSE2 support by using __builtin_cpu_supports. It fails on VIA Eden Eshter. /proc/cpuinfo returns SSE2 support. (In reply to ValdikSS from comment #9) > May I ask why was is closed as WONTFIX? Because we're not planning to support such legacy hardware. > It fails on VIA Eden Eshter. Is it critical that the feature is not detected? We speak here about 15 years old hardware if I see correctly. Well, the function is called __builtin_cpu_supports, for which one might expect that it just checks CPUID and returns reliable results, while in reality it operates using the build-in CPU support list. The function does not return an error if it's unable to detect the feature, resulting in incorrect results, which makes the function unreliable. The fact that it may not always detect features which are present in the CPU is not documented, which brings the confusion among developers such as this bug. VIA processors have progressed into Zhaoxin CPU family which is fairly recent (2014-2019, with plans to release new processors soon). They both share almost the same CPUID data, that's why adding support for either of these CPU automatically implements the other one. (In reply to ValdikSS from comment #11) > Well, the function is called __builtin_cpu_supports, for which one might > expect that it just checks CPUID and returns reliable results, while in > reality it operates using the build-in CPU support list. The function does > not return an error if it's unable to detect the feature, resulting in > incorrect results, which makes the function unreliable. The fact that it may > not always detect features which are present in the CPU is not documented, > which brings the confusion among developers such as this bug. Sure, it's not perfect, I can confirm that. I'm going to introduce a documentation entry for this limitation. > > VIA processors have progressed into Zhaoxin CPU family which is fairly > recent (2014-2019, with plans to release new processors soon). Yep, and we support these (vendor == signature_SHANGHAI_ebx) correctly since r13-713-ga239aff82c3771. Maybe the author of the commit will be interested in old VIA CPUs? The master branch has been updated by Martin Liska <marxin@gcc.gnu.org>: https://gcc.gnu.org/g:1189d1b38e2b9507488ea294cda771c79e972c1d commit r13-5755-g1189d1b38e2b9507488ea294cda771c79e972c1d Author: Martin Liska <mliska@suse.cz> Date: Thu Feb 9 11:33:31 2023 +0100 docs: add caveat for __builtin_cpu_supports Document that the function does not work correctly for old VIA processors. PR target/100758 gcc/ChangeLog: * doc/extend.texi: Document that the function does not work correctly for old VIA processors. Given the above dumps, wouldn't it be just a matter of: --- gcc/common/config/i386/cpuinfo.h.jj 2023-01-16 11:52:15.910736614 +0100 +++ gcc/common/config/i386/cpuinfo.h 2023-02-09 12:30:31.430185107 +0100 @@ -601,8 +601,8 @@ get_intel_cpu (struct __processor_model static inline const char * get_zhaoxin_cpu (struct __processor_model *cpu_model, - struct __processor_model2 *cpu_model2, - unsigned int *cpu_features2) + struct __processor_model2 *cpu_model2, + unsigned int *cpu_features2) { const char *cpu = NULL; unsigned int family = cpu_model2->__cpu_family; @@ -1057,24 +1057,27 @@ cpu_indicator_init (struct __processor_m cpu_model->__cpu_vendor = VENDOR_AMD; } else if (vendor == signature_CENTAUR_ebx && family < 0x07) - cpu_model->__cpu_vendor = VENDOR_CENTAUR; + { + /* Find available features. */ + get_available_features (cpu_model, cpu_model2, cpu_features2, + ecx, edx); + cpu_model->__cpu_vendor = VENDOR_CENTAUR; + } else if (vendor == signature_SHANGHAI_ebx - || vendor == signature_CENTAUR_ebx) + || vendor == signature_CENTAUR_ebx) { /* Adjust model and family for ZHAOXIN CPUS. */ if (family == 0x07) - { - model += extended_model; - } + model += extended_model; cpu_model2->__cpu_family = family; cpu_model2->__cpu_model = model; /* Find available features. */ get_available_features (cpu_model, cpu_model2, cpu_features2, - ecx, edx); + ecx, edx); /* Get CPU type. */ - get_zhaoxin_cpu (cpu_model, cpu_model2,cpu_features2); + get_zhaoxin_cpu (cpu_model, cpu_model2, cpu_features2); cpu_model->__cpu_vendor = VENDOR_ZHAOXIN; } else if (vendor == signature_CYRIX_ebx) The important part just adding get_available_features for the VENDOR_CENTAUR case (otherwise I've just fixed up the messed up formatting of the Zhaoxin stuff). In fact, I wonder why get_available_features isn't called unconditionally for all CPUs that support at least max level >= 1. Or is the worry that some CPUs might misbehave on CPUID 0x80000000 leaf? Your patch might work. > In fact, I wonder why get_available_features isn't called unconditionally Yes, I would also expect that, but it was not the case even before the big refactoring in g:1890f2f0e210ef515c39728c54151372d36dd187. > for all CPUs that support at least max level >= 1. Or is the worry that > some CPUs might misbehave on CPUID 0x80000000 leaf? Maybe H.J. can answer the question? (In reply to Erich Eckner from comment #4) > Created attachment 50871 [details] > cpuid probing > > Does the attached program yield, what you need? (Sry, I'm quite unfamiliar > with asm in gcc) > > It gives: > > 00000001 746e6543 736c7561 48727561 > 000006d0 00000800 00004181 a7c9bbff > > and > > 0000000a 746e6543 736c7561 48727561 > 000006fa 00010800 008863a9 afc9fbff > > on the two machines, respectively. Erich, if you still have access to those CPUs, could you retry with: #include <stdio.h> static void cpuid( unsigned int ax, unsigned int cx, unsigned int *p ) { unsigned int flags; __asm __volatile ("movl %%ebx, %%esi\n\t" "cpuid\n\t" "xchgl %%ebx, %%esi" : "=a" (p[0]), "=S" (p[1]), "=c" (p[2]), "=d" (p[3]) : "0" (ax), "2" (cx)); } int main() { unsigned int regs[4]; for (int j=0; j<=18; j++) { int k = j > 10 ? -__INT_MAX__ - 1 + (j - 11) : j; cpuid( k, regs, 0 ); printf("%d %d ", k, 0); for (int i=0; i<4; i++) { printf("%08x ", regs[i]); } printf("\n"); cpuid( k, regs, 1 ); printf("%d %d ", k, 1); for (int i=0; i<4; i++) { printf("%08x ", regs[i]); } printf("\n"); } return 0; } Thanks. I've posted a different patch to gcc-patches: https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611632.html With that, I get a segfault in cpuid(): (gdb) run Starting program: /tmp/a.out [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x0000555555555191 in cpuid () (gdb) bt #0 0x0000555555555191 in cpuid () #1 0x00005555555551ef in main () (gdb) ... same on x86_64 - maybe, there's a typo, somewhere? (In reply to Erich Eckner from comment #17) > With that, I get a segfault in cpuid(): > > (gdb) run > Starting program: /tmp/a.out > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib/libthread_db.so.1". > > Program received signal SIGSEGV, Segmentation fault. > 0x0000555555555191 in cpuid () > (gdb) bt > #0 0x0000555555555191 in cpuid () > #1 0x00005555555551ef in main () > (gdb) > > ... same on x86_64 - maybe, there's a typo, somewhere? Sorry, wrote it without actually trying to compile it, I've swapped the last 2 arguments of cpuid. Use static void cpuid( unsigned int ax, unsigned int cx, unsigned int *p ) { unsigned int flags; __asm __volatile ("movl %%ebx, %%esi\n\t" "cpuid\n\t" "xchgl %%ebx, %%esi" : "=a" (p[0]), "=S" (p[1]), "=c" (p[2]), "=d" (p[3]) : "0" (ax), "2" (cx)); } int main() { unsigned int regs[4]; for (int j=0; j<=18; j++) { int k = j > 10 ? -__INT_MAX__ - 1 + (j - 11) : j; cpuid( k, 0, regs ); __builtin_printf("%d %d ", k, 0); for (int i=0; i<4; i++) { __builtin_printf("%08x ", regs[i]); } __builtin_printf("\n"); cpuid( k, 1, regs ); __builtin_printf("%d %d ", k, 1); for (int i=0; i<4; i++) { __builtin_printf("%08x ", regs[i]); } __builtin_printf("\n"); } return 0; } The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:b24e9c083093a9e1b1007933a184c02f7ff058db commit r13-5759-gb24e9c083093a9e1b1007933a184c02f7ff058db Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Feb 9 17:43:19 2023 +0100 i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] get_available_features doesn't depend on cpu_model2->__cpu_{family,model} and just sets stuff up based on CPUID leaf 1, or some extended ones, so I wonder why are we calling it separately for Intel, AMD and Zhaoxin and not for all other CPUs too? I think various programs in the wild which aren't using __builtin_cpu_{is,supports} just check the various CPUID leafs and query bits in there, without blacklisting unknown CPU vendors, so I think even __builtin_cpu_supports ("sse2") etc. should be reliable if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf 1 or some extended ones. Calling it for all CPUs also means it can be inlined because there will be just a single caller. I have tested it on Intel and Martin tested it on AMD, but can't test it on non-Intel/AMD; for Intel/AMD/Zhaoxin it should be really no change in behavior. 2023-02-09 Jakub Jelinek <jakub@redhat.com> PR target/100758 * common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Formatting fixes. (cpu_indicator_init): Call get_available_features for all CPUs with max_level >= 1, rather than just Intel, AMD or Zhaoxin. Formatting fixes. Yeah, now it pulled some stuff :-) 0 0 0000000a 746e6543 736c7561 48727561 0 1 0000000a 746e6543 736c7561 48727561 1 0 000006fa 00010800 008863a9 afc9fbff 1 1 000006fa 00010800 008863a9 afc9fbff 2 0 02b3b001 00000000 00000000 2c04307d 2 1 02b3b001 00000000 00000000 2c04307d 3 0 00000000 00000000 00000000 00000000 3 1 00000000 00000000 00000000 00000000 4 0 00000000 00000000 00000000 00000000 4 1 00000000 00000000 00000000 00000000 5 0 00000040 00000040 00000003 00022220 5 1 00000040 00000040 00000003 00022220 6 0 00000002 00000000 00000000 00000000 6 1 00000002 00000000 00000000 00000000 7 0 00000000 00000000 00000000 00000000 7 1 00000000 00000000 00000000 00000000 8 0 00000000 00000000 00000000 00000000 8 1 00000000 00000000 00000000 00000000 9 0 00000000 00000000 00000000 00000000 9 1 00000000 00000000 00000000 00000000 10 0 06280202 00000000 00000000 00000503 10 1 06280202 00000000 00000000 00000503 -2147483648 0 80000008 00000000 00000000 00000000 -2147483648 1 80000008 00000000 00000000 00000000 -2147483647 0 00000000 00000000 00000001 20100800 -2147483647 1 00000000 00000000 00000001 20100800 -2147483646 0 20202020 20202020 20202020 20202020 -2147483646 1 20202020 20202020 20202020 20202020 -2147483645 0 20202020 20202020 49562020 614e2041 -2147483645 1 20202020 20202020 49562020 614e2041 -2147483644 0 55206f6e 30303433 30303840 007a484d -2147483644 1 55206f6e 30303433 30303840 007a484d -2147483643 0 00000000 08800880 40100140 40100140 -2147483643 1 00000000 08800880 40100140 40100140 -2147483642 0 00000000 00000000 04008140 00000000 -2147483642 1 00000000 00000000 04008140 00000000 -2147483641 0 00000000 00000000 00000000 00000000 -2147483641 1 00000000 00000000 00000000 00000000 Thank you for your time! VIA Eden Esther (32-bit core) 0 0 00000001 746e6543 736c7561 48727561 0 1 00000001 746e6543 736c7561 48727561 1 0 000006d0 00000800 00004181 a7c9bbff 1 1 000006d0 00000800 00004181 a7c9bbff 2 0 00000000 00000000 00000000 00000000 2 1 00000000 00000000 00000000 00000000 3 0 00000000 00000000 00000000 00000000 3 1 00000000 00000000 00000000 00000000 4 0 00000000 00000000 00000000 00000000 4 1 00000000 00000000 00000000 00000000 5 0 00000000 00000000 00000000 00000000 5 1 00000000 00000000 00000000 00000000 6 0 00000000 00000000 00000000 00000000 6 1 00000000 00000000 00000000 00000000 7 0 00000000 00000000 00000000 00000000 7 1 00000000 00000000 00000000 00000000 8 0 00000000 00000000 00000000 00000000 8 1 00000000 00000000 00000000 00000000 9 0 00000000 00000000 00000000 00000000 9 1 00000000 00000000 00000000 00000000 10 0 00000000 00000000 00000000 00000000 10 1 00000000 00000000 00000000 00000000 -2147483648 0 80000006 00000000 00000000 00000000 -2147483648 1 80000006 00000000 00000000 00000000 -2147483647 0 00000000 00000000 00000000 00100000 -2147483647 1 00000000 00000000 00000000 00100000 -2147483646 0 20202020 20202020 20202020 20202020 -2147483646 1 20202020 20202020 20202020 20202020 -2147483645 0 20202020 41495620 65644520 7250206e -2147483645 1 20202020 41495620 65644520 7250206e -2147483644 0 7365636f 20726f73 30303031 007a484d -2147483644 1 7365636f 20726f73 30303031 007a484d -2147483643 0 00000000 08800880 40040140 40040140 -2147483643 1 00000000 08800880 40040140 40040140 -2147483642 0 00000000 00000000 0080a140 00000000 -2147483642 1 00000000 00000000 0080a140 00000000 -2147483641 0 00000000 00000000 00000000 00000000 -2147483641 1 00000000 00000000 00000000 00000000 Thank you Jakub, please revert my documentation patch if you are convinced enough the change works only on old VIA CPUs. I believe in the #c20 case you'd get FEATURE_CMOV FEATURE_MMX FEATURE_SSE FEATURE_SSE2 FEATURE_CMPXCHG8B FEATURE_FXSAVE FEATURE_POPCNT FEATURE_SSE3 FEATURE_SSSE3 FEATURE_SSE4_1 FEATURE_CMPXCHG16B FEATURE_LAHF_LM FEATURE_LM FEATURE_X86_64_BASELINE set and in the #c21 case FEATURE_CMOV FEATURE_MMX FEATURE_SSE FEATURE_SSE2 FEATURE_CMPXCHG8B FEATURE_FXSAVE FEATURE_SSE3 If that matches what those CPUs provide (say compared to /proc/cpuinfo), then I think we are good. The change has been committed to trunk already, so you can try it yourself (or apply the commit patch to say gcc 12). Hi Jakub: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100758 Thanks for your patch. We test it works on all zhaoxin platforms. We find the same bug still exist in previous gcc releases , could you backport this patch to previous release? BR Mayshao The releases/gcc-12 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:454bf9f4d55058589ac6a76261356cbda599e831 commit r12-9273-g454bf9f4d55058589ac6a76261356cbda599e831 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Feb 9 17:43:19 2023 +0100 i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] get_available_features doesn't depend on cpu_model2->__cpu_{family,model} and just sets stuff up based on CPUID leaf 1, or some extended ones, so I wonder why are we calling it separately for Intel, AMD and Zhaoxin and not for all other CPUs too? I think various programs in the wild which aren't using __builtin_cpu_{is,supports} just check the various CPUID leafs and query bits in there, without blacklisting unknown CPU vendors, so I think even __builtin_cpu_supports ("sse2") etc. should be reliable if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf 1 or some extended ones. Calling it for all CPUs also means it can be inlined because there will be just a single caller. I have tested it on Intel and Martin tested it on AMD, but can't test it on non-Intel/AMD; for Intel/AMD/Zhaoxin it should be really no change in behavior. 2023-02-09 Jakub Jelinek <jakub@redhat.com> PR target/100758 * common/config/i386/cpuinfo.h (cpu_indicator_init): Call get_available_features for all CPUs with max_level >= 1, rather than just Intel or AMD. (cherry picked from commit b24e9c083093a9e1b1007933a184c02f7ff058db) The releases/gcc-11 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:7fce881241b4f6492d8d321322fcb3412dea2137 commit r11-10713-g7fce881241b4f6492d8d321322fcb3412dea2137 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Feb 9 17:43:19 2023 +0100 i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] get_available_features doesn't depend on cpu_model2->__cpu_{family,model} and just sets stuff up based on CPUID leaf 1, or some extended ones, so I wonder why are we calling it separately for Intel, AMD and Zhaoxin and not for all other CPUs too? I think various programs in the wild which aren't using __builtin_cpu_{is,supports} just check the various CPUID leafs and query bits in there, without blacklisting unknown CPU vendors, so I think even __builtin_cpu_supports ("sse2") etc. should be reliable if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf 1 or some extended ones. Calling it for all CPUs also means it can be inlined because there will be just a single caller. I have tested it on Intel and Martin tested it on AMD, but can't test it on non-Intel/AMD; for Intel/AMD/Zhaoxin it should be really no change in behavior. 2023-02-09 Jakub Jelinek <jakub@redhat.com> PR target/100758 * common/config/i386/cpuinfo.h (cpu_indicator_init): Call get_available_features for all CPUs with max_level >= 1, rather than just Intel or AMD. (cherry picked from commit b24e9c083093a9e1b1007933a184c02f7ff058db) The releases/gcc-10 branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>: https://gcc.gnu.org/g:be80a2e64a38540a371fa4f03513653d2741bc89 commit r10-11367-gbe80a2e64a38540a371fa4f03513653d2741bc89 Author: Jakub Jelinek <jakub@redhat.com> Date: Thu Feb 9 17:43:19 2023 +0100 i386: Call get_available_features for all CPUs with max_level >= 1 [PR100758] get_available_features doesn't depend on cpu_model2->__cpu_{family,model} and just sets stuff up based on CPUID leaf 1, or some extended ones, so I wonder why are we calling it separately for Intel, AMD and Zhaoxin and not for all other CPUs too? I think various programs in the wild which aren't using __builtin_cpu_{is,supports} just check the various CPUID leafs and query bits in there, without blacklisting unknown CPU vendors, so I think even __builtin_cpu_supports ("sse2") etc. should be reliable if those VENDOR_{CENTAUR,CYRIX,NSC,OTHER} CPUs set those bits in CPUID leaf 1 or some extended ones. Calling it for all CPUs also means it can be inlined because there will be just a single caller. I have tested it on Intel and Martin tested it on AMD, but can't test it on non-Intel/AMD; for Intel/AMD/Zhaoxin it should be really no change in behavior. 2023-02-09 Jakub Jelinek <jakub@redhat.com> PR target/100758 * config/i386/cpuinfo.c (cpu_indicator_init): Call get_available_features for all CPUs with max_level >= 1, rather than just Intel or AMD. (cherry picked from commit b24e9c083093a9e1b1007933a184c02f7ff058db) |
Created attachment 50868 [details] test.c: probe for sse2 I use the attached snippet to detect, whether the cpu supports sse2. This works most of the time, but fails to detect sse2 on two machines, which actually support sse2 (according to /proc/cpuinfo). The affected machines run archlinux32. /proc/cpuinfo shows: processor : 0 vendor_id : CentaurHauls cpu family : 6 model : 15 model name : VIA Nano U3400@800MHz stepping : 10 cpu MHz : 798.016 cache size : 2048 KB physical id : 0 siblings : 1 core id : 0 cpu cores : 1 apicid : 0 initial apicid : 0 fdiv_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall nx lm constant_tsc arch_perfmon rep_good cpuid pni monitor vmx est tm2 ssse3 cx16 xtpr sse4_1 popcnt rng rng_en ace ace_en ace2 phe phe_en pmm pmm_en lahf_lm tpr_shadow vnmi vpid ida vmx flags : vnmi tsc_offset vtpr bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit bogomips : 1596.53 clflush size : 64 cache_alignment : 128 address sizes : 36 bits physical, 48 bits virtual power management: and processor : 0 vendor_id : CentaurHauls cpu family : 6 model : 13 model name : VIA C7-D Processor 1800MHz stepping : 0 cpu MHz : 1596.326 cache size : 128 KB physical id : 0 siblings : 1 core id : 0 cpu cores : 1 apicid : 0 initial apicid : 0 fdiv_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge cmov pat clflush acpi mmx fxsr sse sse2 tm nx cpuid pni est tm2 xtpr rng rng_en ace ace_en ace2 ace2_en phe phe_en pmm pmm_en bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit bogomips : 3193.67 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 32 bits virtual power management: respectively. On these machines, the output is "sse2: 0" instead of some value unequal 0. Regards, Erich P.S.: I hope, I reported this in the correct category. Please let me know, if I did not.