Bug 85100 - __builtin_cpu_supports avx does not verify OS supports it
Summary: __builtin_cpu_supports avx does not verify OS supports it
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 7.3.1
: P3 normal
Target Milestone: 6.5
Assignee: H.J. Lu
URL:
Keywords:
: 55307 (view as bug list)
Depends on:
Blocks:
 
Reported: 2018-03-27 18:54 UTC by Julian Taylor
Modified: 2018-05-03 21:44 UTC (History)
2 users (show)

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-03-28 00:00:00


Attachments
A patch (1.26 KB, patch)
2018-03-28 18:18 UTC, H.J. Lu
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Julian Taylor 2018-03-27 18:54:37 UTC
__builtin_cpu_supports("avx") checks that the cpu supports avx instructions, but without OS support the instructions cannot be used.

To reproduce launch linux with the noxsave boot option. The function will return nonzero as the feature is there but programs using avx will crash with SIGILL.

This makes this function significantly less useful as you need to add additional checks (using xgetbv) to verify operating system support.

At least this behavior it should be more clearly documented. Not everybody is aware the operating system may not support or that you can disable it in linux with a boot option.
Comment 1 Nathaniel J. Smith 2018-03-28 01:53:13 UTC
For context here: NumPy currently uses __builtin_cpu_supports("avx") to decide whether it can use AVX-accelerated numerical kernels. We've been getting regular bug reports from users where this __builtin_cpu_supports("avx") returned true, but then NumPy crashes with a SIGILL when it tries to use AVX. (It seems to be related to some kind of relatively common virtualization configurations.)

Examples:

https://github.com/numpy/numpy/issues/10787
https://github.com/numpy/numpy/issues/9532
https://github.com/numpy/numpy/issues/10330
https://github.com/numpy/numpy/issues/9534

Now that Julian finally figured it out, I guess we'll work around it by calling xgetbv ourselves:

https://github.com/numpy/numpy/pull/10814

but it really seems like it would be better if __builtin_cpu_supports would check this itself.
Comment 2 Richard Biener 2018-03-28 08:07:50 UTC
I think this is on purpose and unlikely to change.
Comment 3 Nathaniel J. Smith 2018-03-28 10:06:46 UTC
We're using it exactly like the docs recommend. What on earth is __builtin_cpu_supports for, if not to tell you whether you can use a given feature?

If this is by design, than at the very least the docs need to make clear that a return value of 1 does not mean you can actually use the feature. Also, it would be nice if gcc provided some function that *did* answer the question of what CPU features you can use, since AFAICT this is what every current user of __builtin_cpu_supports actually wants.
Comment 4 H.J. Lu 2018-03-28 18:18:58 UTC
Created attachment 43793 [details]
A patch

Please try this.
Comment 5 Nathaniel J. Smith 2018-03-29 01:19:41 UTC
Julian, are you able to test the patch? I don't have a reproduction setup currently...
Comment 6 Uroš Bizjak 2018-03-29 06:21:51 UTC
(In reply to Richard Biener from comment #2)
> I think this is on purpose and unlikely to change.

No, __builtin_cpu_supports is used to enable parts that are able to execute relevant instructions. So, HJ's patch is the way to go.
Comment 7 Uroš Bizjak 2018-03-29 06:23:25 UTC
We need this in all release branches.
Comment 8 Uroš Bizjak 2018-03-29 06:32:15 UTC
Comment on attachment 43793 [details]
A patch

>+  if ((ecx & bit_OSXSAVE))
>+    {
>+      /* Check if XMM state and YMM state are saved.  */
>+      unsigned int xcrlow;
>+      unsigned int xcrhigh;
>+      asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
>+      if ((xcrlow & 0x6) == 0x6)

Please use .byte stream instead of xgetbv mnemonic (libgcc can be compiled with an assembler that doesn't support xgetbv. Actually, you can just copy the part from driver-i386.c, preferrably also with #defines instead of magic numbers.
Comment 9 hjl@gcc.gnu.org 2018-03-29 13:14:45 UTC
Author: hjl
Date: Thu Mar 29 13:14:06 2018
New Revision: 258954

URL: https://gcc.gnu.org/viewcvs?rev=258954&root=gcc&view=rev
Log:
i386: Enable AVX/AVX512 features only if supported by OSXSAVE

Enable AVX and AVX512 features only if their states are supported by
OSXSAVE.

	PR target/85100
	* config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(XSTATE_OPMASK): Likewise.
	(XSTATE_ZMM): Likewise.
	(XSTATE_HI_ZMM): Likewise.
	(XCR_AVX_ENABLED_MASK): Likewise.
	(XCR_AVX512F_ENABLED_MASK): Likewise.
	(get_available_features): Enable AVX and AVX512 features only
	if their states are supported by OSXSAVE.

Modified:
    trunk/libgcc/ChangeLog
    trunk/libgcc/config/i386/cpuinfo.c
Comment 10 hjl@gcc.gnu.org 2018-04-02 12:03:48 UTC
Author: hjl
Date: Mon Apr  2 12:03:16 2018
New Revision: 259006

URL: https://gcc.gnu.org/viewcvs?rev=259006&root=gcc&view=rev
Log:
i386: Enable AVX/AVX512 features only if supported by OSXSAVE

Enable AVX and AVX512 features only if their states are supported by
OSXSAVE.

	Backport from mainline
	PR target/85100
	* config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(XSTATE_OPMASK): Likewise.
	(XSTATE_ZMM): Likewise.
	(XSTATE_HI_ZMM): Likewise.
	(XCR_AVX_ENABLED_MASK): Likewise.
	(XCR_AVX512F_ENABLED_MASK): Likewise.
	(get_available_features): Enable AVX and AVX512 features only
	if their states are supported by OSXSAVE.

Modified:
    branches/gcc-7-branch/libgcc/ChangeLog
    branches/gcc-7-branch/libgcc/config/i386/cpuinfo.c
Comment 11 hjl@gcc.gnu.org 2018-04-02 12:10:20 UTC
Author: hjl
Date: Mon Apr  2 12:09:48 2018
New Revision: 259007

URL: https://gcc.gnu.org/viewcvs?rev=259007&root=gcc&view=rev
Log:
i386: Enable AVX/AVX512 features only if supported by OSXSAVE

Enable AVX and AVX512 features only if their states are supported by
OSXSAVE.

	Backport from mainline
	PR target/85100
	* config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(XSTATE_OPMASK): Likewise.
	(XSTATE_ZMM): Likewise.
	(XSTATE_HI_ZMM): Likewise.
	(XCR_AVX_ENABLED_MASK): Likewise.
	(XCR_AVX512F_ENABLED_MASK): Likewise.
	(get_available_features): Enable AVX and AVX512 features only
	if their states are supported by OSXSAVE.

Modified:
    branches/gcc-6-branch/libgcc/ChangeLog
    branches/gcc-6-branch/libgcc/config/i386/cpuinfo.c
Comment 12 H.J. Lu 2018-04-02 12:10:40 UTC
Fixed for GCC 8 and GCC 6/7 branches.
Comment 13 H.J. Lu 2018-05-03 21:44:04 UTC
*** Bug 55307 has been marked as a duplicate of this bug. ***