Bug 56068 - -march=native creates Illegal instruction on KVM guests
Summary: -march=native creates Illegal instruction on KVM guests
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.4.6
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-21 17:08 UTC by Jason Pyeron
Modified: 2013-01-23 16:42 UTC
CC List: 0 users

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-01-21 00:00:00


Attachments
gcc -dM -E - (1.24 KB, text/plain)
2013-01-21 17:08 UTC, Jason Pyeron
gcc -dM -E - -march=native (1.33 KB, text/plain)
2013-01-21 17:08 UTC, Jason Pyeron
diff of defines (1.19 KB, patch)
2013-01-21 17:10 UTC, Jason Pyeron
gcc -v test.c output (1.05 KB, text/plain)
2013-01-21 17:13 UTC, Jason Pyeron
gcc test.c -march=native -v (1.10 KB, text/plain)
2013-01-21 17:14 UTC, Jason Pyeron

Description Jason Pyeron 2013-01-21 17:08:10 UTC
Created attachment 29238
gcc -dM -E -

gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4)

My guess is that it is creating SSE3 instructions.

discovered while working https://issues.asterisk.org/jira/browse/ASTERISK-20128
Comment 1 Jason Pyeron 2013-01-21 17:08:47 UTC
Created attachment 29239
gcc -dM -E - -march=native
Comment 2 Jason Pyeron 2013-01-21 17:10:34 UTC
Created attachment 29240
diff of defines
Comment 3 Jason Pyeron 2013-01-21 17:12:15 UTC
mockbuild@centos6-64bit-builder ~/build/BUILD/tmp (mock-chroot)
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 13
model name      : QEMU Virtual CPU version (cpu64-rhel6)
stepping        : 3
cpu MHz         : 2194.498
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 4
wp              : yes
flags           : fpu de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx lm up unfair_spinlock pni cx16 hypervisor lahf_lm abm sse4a
bogomips        : 4388.99
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management:
Comment 4 Jason Pyeron 2013-01-21 17:13:25 UTC
Created attachment 29241
gcc -v test.c output
Comment 5 Jason Pyeron 2013-01-21 17:14:05 UTC
Created attachment 29242
gcc test.c  -march=native -v
Comment 6 Andrew Pinski 2013-01-21 17:14:54 UTC
> sse4a

There is the issue I think.
Comment 7 Andrew Pinski 2013-01-21 17:19:29 UTC
Please try a newer version of GCC, 4.4 is no longer supported.  Also, since this 4.4 has been modified by Red Hat, please report it to them instead.  I don't think bdver2 support was in the official 4.4 release.
Comment 8 Jason Pyeron 2013-01-21 17:26:06 UTC
(In reply to comment #7)
> Please try a newer version of GCC, 4.4 is no longer supported.  Also since

Which version is the oldest supported version? I will use that version for testing.

I will also open a vendor issue to support the RH GCC 4.4.
Comment 9 Jonathan Wakely 2013-01-21 17:30:38 UTC
What is the illegal instruction? What version of glibc are you using?

I debugged a very similar problem last week where GCC was generating vmovsd on a host without AVX support. /proc/cpuinfo showed no avx flag, but glibc's __cpuid reported the AVX bit set, which makes GCC use -mavx.  This is a glibc bug: it should not set the AVX bit if the OSXSAVE bit is not also set. This seems to predominantly affect VMs, but my case was not a VM. The glibc bug is fixed in 2.15, see http://sourceware.org/bugzilla/show_bug.cgi?id=14059 and http://sourceware.org/bugzilla/show_bug.cgi?id=13753

Using -mno-avx (or replacing -march=cirei7) worked for me.
Comment 10 Jonathan Wakely 2013-01-21 17:32:52 UTC
(In reply to comment #9)
> Using -mno-avx (or replacing -march=cirei7) worked for me.

Bah, that should be "replacing -march=native with -march=corei7"

My problem was with GCC 4.6, and the avx-detection code is the same in current releases, so if your illegal instruction is an AVX one then I doubt trying a newer GCC will help.  The problem is in glibc's __cpuid() function not GCC.
Comment 11 Jason Pyeron 2013-01-21 17:42:06 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Using -mno-avx (or replacing -march=cirei7) worked for me.
> Bah, that should be "replacing -march=native with -march=corei7"
> My problem was with GCC 4.6, and the avx-detection code is the same in current
> releases, so if your illegal instruction is an AVX one then I doubt trying a
> newer GCC will help.  The problem is in glibc's __cpuid() function not GCC.

I will try 4.6 to prove that. Off to build 4.6...
Comment 12 Jonathan Wakely 2013-01-21 19:14:14 UTC
(In reply to comment #11)
> I will try 4.6 to prove that. Off to build 4.6...

Thanks for checking.  See http://gcc.gnu.org/wiki/InstallingGCC for the foolproof way to build it.

You could also try this with any GCC version on the KVM guest:

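#include <stdio.h>
#include <cpuid.h>   /* GCC's CPUID helpers: __cpuid, bit_AVX */

int main(void)
{
      /* OSXSAVE is CPUID.1:ECX bit 27, one below AVX (bit 28). */
      unsigned bit_osxsave = bit_AVX >> 1;

      unsigned int eax, ebx, ecx, edx;

      /* Query CPUID leaf 1 (feature flags). */
      __cpuid (1, eax, ebx, ecx, edx);

      printf("bit_OSXSAVE (%u) = %u\n", bit_osxsave, ecx & bit_osxsave);
      printf("bit_AVX (%u) = %u\n", bit_AVX, ecx & bit_AVX);
      return 0;
}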

If that prints 0 for OSXSAVE and non-zero for AVX then it's the same problem I had.

Maybe GCC could work around it by checking both flags in the AVX detection logic.
Comment 13 Uroš Bizjak 2013-01-21 19:31:47 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > I will try 4.6 to prove that. Off to build 4.6...
> 
> Thanks for checking.  See http://gcc.gnu.org/wiki/InstallingGCC for the
> foolproof way to do build it.
> 
> You could also try this with any GCC version on the KVM guest:
> 
> #include <stdio.h>
> #include "cpuid.h"
> 
> int main()
> {
>       unsigned bit_osxsave = bit_AVX >> 1;
> 
>       unsigned int eax, ebx, ecx, edx;
> 
>       __cpuid (1, eax, ebx, ecx, edx);
> 
>       printf("bit_OSXSAVE (%u) = %u\n", bit_osxsave, ecx & bit_osxsave);
>       printf("bit_AVX (%u) = %u\n", bit_AVX, ecx & bit_AVX);
> }
> 
> If that prints 0 for OSXSAVE and non-zero for AVX then it's the same problem I
> had.
> 
> Maybe GCC could work around it by checking both flags in the AVX detection
> logic.

Recent 4.6+ does. Please see driver-i386.c around line 470:

--snip--
  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
#define XCR_XFEATURE_ENABLED_MASK	0x0
#define XSTATE_FP			0x1
#define XSTATE_SSE			0x2
#define XSTATE_YMM			0x4
  if (has_osxsave)
    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
	 : "=a" (eax), "=d" (edx)
	 : "c" (XCR_XFEATURE_ENABLED_MASK));

  /* Check if SSE and YMM states are supported.  */
  if (!has_osxsave
      || (eax & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM))
    {
      has_avx = 0;
      has_fma = 0;
      has_fma4 = 0;
      has_xop = 0;
    }
--snip--
Comment 14 Jonathan Wakely 2013-01-21 19:58:48 UTC
(In reply to comment #13)
> > Maybe GCC could work around it by checking both flags in the AVX detection
> > logic.
> 
> Recent 4.6+ does. Please see driver-i386.c around line 470:

Ah, thank you! I only looked in the 4.6.3 and 4.7.2 releases, not the branch heads.

That's good to know.
Comment 15 Jason Pyeron 2013-01-21 20:14:10 UTC
(In reply to comment #12)
> (In reply to comment #11)
> > I will try 4.6 to prove that. Off to build 4.6...
> ...
> If that prints 0 for OSXSAVE and non-zero for AVX then it's the same problem I
> had.
> Maybe GCC could work around it by checking both flags in the AVX detection
> logic.

mockbuild@centos6-64bit-builder ~/build/BUILD/gcc/test (mock-chroot)
$ gcc test.c -march=native

mockbuild@centos6-64bit-builder ~/build/BUILD/gcc/test (mock-chroot)
$ ./a.out
bit_OSXSAVE (134217728) = 0
bit_AVX (268435456) = 0

mockbuild@centos6-64bit-builder ~/build/BUILD/gcc/test (mock-chroot)
$ gcc test.c

mockbuild@centos6-64bit-builder ~/build/BUILD/gcc/test (mock-chroot)
$ ./a.out
bit_OSXSAVE (134217728) = 0
bit_AVX (268435456) = 0

On another note, I am currently building svn://gcc.gnu.org/svn/gcc/branches/gcc-4_6-branch
Comment 16 Richard Biener 2013-01-23 16:42:19 UTC
> mockbuild@centos6-64bit-builder ~/build/BUILD/tmp (mock-chroot)
> $ cat /proc/cpuinfo
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 6
> model           : 13

I think that's the old qemu bug of providing a nonsensical CPUID
family/model/vendor combination by default on x86_64.  AMD family 6
is 32-bit only.  Intel family 6 has 64-bit support.

You'll hit funny issues with GMP as well.  Thus, fix your KVM/QEMU
config.