Bug 54741 - GCC 4.4, 4.5, 4.6, 4.7 (probably 4.8) generates unusable code on AVX-capable CPUs (FreeBSD)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.6.4
Importance: P3 major
Target Milestone: 4.6.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-29 00:45 UTC by M.S. Babaei
Modified: 2012-10-03 18:48 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-09-29 00:00:00


Attachments
both test.ii and test.s files from save-temp output (78.81 KB, application/octet-stream)
2012-09-29 00:45 UTC, M.S. Babaei
A patch (697 bytes, patch)
2012-10-01 12:55 UTC, H.J. Lu
A patch with fixed ChangeLog (700 bytes, patch)
2012-10-01 13:32 UTC, H.J. Lu
After applying patch - gcc4.7 (5.17 KB, text/x-csrc)
2012-10-02 17:00 UTC, M.S. Babaei
Followup patch for config/i386/driver-i386.c (720 bytes, patch)
2012-10-03 17:36 UTC, Andrew W. Nosenko

Description M.S. Babaei 2012-09-29 00:45:14 UTC
Created attachment 28297 [details]
both test.ii and test.s files from save-temp output

This bug has been around for quite a while with GCC 4.4+ on FreeBSD systems (at least 8.2-RELEASE and 9.0-RELEASE, which I tested). If you have a Sandy Bridge or Ivy Bridge CPU, code like this gets killed by SIGILL when compiled with -march=native:

#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, std::string> hash; 
    std::cout << "Hello, World!" << std::endl;
    return 0;
}

# g++46 -std=c++0x -o test test.cpp
# ./test 
Hello, World!

# g++46 -std=c++0x -march=native -o test test.cpp
# ./test
Illegal instruction: 4

This is gdb's output:
Program received signal SIGILL, Illegal instruction.
0x00000000004011dc in std::_Hashtable<std::string, std::pair<std::string const, std::string>, std::allocator<std::pair<std::string const, std::string> >, std::_Select1st<std::pair<std::string const, std::string> >, std::equal_to<std::string>, std::hash<std::string>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, false, false, true>::_Hashtable ()

I know the code above uses C++11 standard headers, but this is not a C++11-related bug; the code above is just an example I know of. If you look at this thread on the FreeBSD forums (which I opened nearly 1.5 years ago), http://forums.freebsd.org/showthread.php?t=23535, you'll see that even C code (GCC itself and nearly anything compiled with -march=native on my system) is affected by this bug.


# g++46 -v -save-temps -std=c++0x -march=native -o test test.cpp
Using built-in specs.
COLLECT_GCC=g++46
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/lto-wrapper
Target: x86_64-portbld-freebsd9.0
Configured with: ./../gcc-4.6-20120831/configure --disable-nls --enable-languages=c,c++,objc,fortran --libdir=/usr/local/lib/gcc46 --libexecdir=/usr/local/libexec/gcc46 --program-suffix=46 --with-as=/usr/local/bin/as --with-gmp=/usr/local --with-gxx-include-dir=/usr/local/lib/gcc46/include/c++/ --with-ld=/usr/local/bin/ld --with-libiconv-prefix=/usr/local --with-pkgversion='FreeBSD Ports Collection' --with-system-zlib --disable-libgcj --prefix=/usr/local --mandir=/usr/local/man --infodir=/usr/local/info/gcc46 --build=x86_64-portbld-freebsd9.0
Thread model: posix
gcc version 4.6.4 20120831 (prerelease) (FreeBSD Ports Collection) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++0x' '-march=native' '-o' 'test' '-shared-libgcc'
 /usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/cc1plus -E -quiet -v test.cpp -march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=6144 -mtune=corei7-avx -std=c++0x -fpch-preprocess -o test.ii
ignoring nonexistent directory "/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../../../../x86_64-portbld-freebsd9.0/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc46/include/c++/
 /usr/local/lib/gcc46/include/c++//x86_64-portbld-freebsd9.0
 /usr/local/lib/gcc46/include/c++//backward
 /usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/include
 /usr/local/include
 /usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/include-fixed
 /usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++0x' '-march=native' '-o' 'test' '-shared-libgcc'
 /usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/cc1plus -fpreprocessed test.ii -march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 -mno-rdrnd -mno-f16c -mno-fsgsbase --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=6144 -mtune=corei7-avx -quiet -dumpbase test.cpp -auxbase test -std=c++0x -version -o test.s
GNU C++ (FreeBSD Ports Collection) version 4.6.4 20120831 (prerelease) (x86_64-portbld-freebsd9.0)
	compiled by GNU C version 4.6.4 20120831 (prerelease), GMP version 5.0.5, MPFR version 3.1.1, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C++ (FreeBSD Ports Collection) version 4.6.4 20120831 (prerelease) (x86_64-portbld-freebsd9.0)
	compiled by GNU C version 4.6.4 20120831 (prerelease), GMP version 5.0.5, MPFR version 3.1.1, MPC version 0.9
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: f66add45a86dc64383d28918a222f366
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++0x' '-march=native' '-o' 'test' '-shared-libgcc'
 /usr/local/bin/as -v -o test.o test.s
GNU assembler version 2.22 (x86_64-portbld-freebsd9.0) using BFD version (GNU Binutils) 2.22
COMPILER_PATH=/usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/:/usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/:/usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/:/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/:/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/:/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../../../../x86_64-portbld-freebsd9.0/bin/
LIBRARY_PATH=/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/:/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../../../../x86_64-portbld-freebsd9.0/lib/:/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-std=c++0x' '-march=native' '-o' 'test' '-shared-libgcc'
 /usr/local/libexec/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/collect2 --eh-frame-hdr -V -dynamic-linker /libexec/ld-elf.so.1 -o test /usr/lib/crt1.o /usr/lib/crti.o /usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/crtbegin.o -L/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4 -L/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../../../../x86_64-portbld-freebsd9.0/lib -L/usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/../../.. test.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/local/lib/gcc46/gcc/x86_64-portbld-freebsd9.0/4.6.4/crtend.o /usr/lib/crtn.o
GNU ld (GNU Binutils) 2.22
  Supported emulations:
   elf_x86_64_fbsd
   elf_i386_fbsd
   elf_x86_64
   elf_i386
   elf_l1om
   elf_l1om_fbsd
   elf_k1om
   elf_k1om_fbsd
Comment 1 Andrew Pinski 2012-09-29 01:07:54 UTC
Which instruction is causing the illegal instruction signal?
Run the resulting program using gdb to find out.
Comment 2 M.S. Babaei 2012-09-29 05:43:13 UTC
(In reply to comment #1)
> Which instruction is causing the illegal instruction signal?
> Run the resulting program using gdb to find out.

As I mentioned above, running the program inside gdb produces this:

Program received signal SIGILL, Illegal instruction.
0x00000000004011dc in std::_Hashtable<std::string, std::pair<std::string const,
std::string>, std::allocator<std::pair<std::string const, std::string> >,
std::_Select1st<std::pair<std::string const, std::string> >,
std::equal_to<std::string>, std::hash<std::string>,
std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash,
std::__detail::_Prime_rehash_policy, false, false, true>::_Hashtable ()
Comment 3 Richard Biener 2012-10-01 10:20:43 UTC
Use disassemble from inside gdb and look for the faulting instruction.
Comment 4 Mikael Pettersson 2012-10-01 11:20:21 UTC
I suspect that the BSD kernel in question doesn't support AVX, so CPUID reports AVX but not OSXSAVE.  That would cause any AVX insn to #UD.  If this is the case, then gcc's -march=native is in error for failing to check OSXSAVE.
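
The check described above can be sketched as a pure function. This is only an illustration, not the code from the attached patch: the helper name avx_usable and its parameters are made up here, and the caller is assumed to have already read CPUID leaf 1 ECX and the low 32 bits of XCR0. The bit positions (OSXSAVE is ECX bit 27, AVX is ECX bit 28, XCR0 bit 1 is SSE state and bit 2 is YMM state) follow the Intel SDM.

```c
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_OSXSAVE (1u << 27)  /* OS has enabled XSAVE/XGETBV */
#define CPUID1_ECX_AVX     (1u << 28)  /* CPU implements AVX */

/* XCR0 state-component bits. */
#define XSTATE_SSE 0x2u                /* OS saves/restores XMM state */
#define XSTATE_YMM 0x4u                /* OS saves/restores YMM state */

/* Hypothetical helper: returns 1 only when the CPU advertises AVX
   and the OS has enabled both XMM and YMM state, 0 otherwise.  */
int avx_usable(uint32_t cpuid1_ecx, uint32_t xcr0_low)
{
    const uint32_t need = XSTATE_SSE | XSTATE_YMM;

    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return 0;                      /* no AVX in hardware */
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return 0;                      /* XCR0 is not readable, so assume no OS support */
    return (xcr0_low & need) == need;  /* both state bits must be set */
}
```

A kernel without AVX support would correspond to a call such as avx_usable(CPUID1_ECX_AVX, 0), which returns 0 even though the CPU itself reports AVX.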
Comment 5 H.J. Lu 2012-10-01 12:55:07 UTC
Created attachment 28311 [details]
A patch

Please try this.
Comment 6 Jakub Jelinek 2012-10-01 13:26:54 UTC
s/FAM/FMA/ in the ChangeLog entry.
Comment 7 H.J. Lu 2012-10-01 13:32:51 UTC
Created attachment 28312 [details]
A patch with fixed ChangeLog
Comment 8 M.S. Babaei 2012-10-02 07:14:32 UTC
(In reply to comment #3)
> use disassemble from inside gdb and look for the faulting instruction.

Sorry, I'm not very familiar with GDB, but I assume you need this:


(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400bc4 <main+0>:	push   %rbp
0x0000000000400bc5 <main+1>:	mov    %rsp,%rbp
0x0000000000400bc8 <main+4>:	push   %rbx
0x0000000000400bc9 <main+5>:	sub    $0x48,%rsp
0x0000000000400bcd <main+9>:	lea    -0x13(%rbp),%rax
0x0000000000400bd1 <main+13>:	mov    %rax,%rdi
0x0000000000400bd4 <main+16>:	callq  0x400e7c <_ZNSaISt4pairIKSsSsEEC2Ev>
0x0000000000400bd9 <main+21>:	lea    -0x13(%rbp),%rsi
0x0000000000400bdd <main+25>:	lea    -0x12(%rbp),%rcx
0x0000000000400be1 <main+29>:	lea    -0x11(%rbp),%rdx
0x0000000000400be5 <main+33>:	lea    -0x50(%rbp),%rax
0x0000000000400be9 <main+37>:	mov    %rsi,%r8
0x0000000000400bec <main+40>:	mov    $0xa,%esi
0x0000000000400bf1 <main+45>:	mov    %rax,%rdi
0x0000000000400bf4 <main+48>:	callq  0x400eb0 <_ZNSt13unordered_mapISsSsSt4hashISsESt8equal_toISsESaISt4pairIKSsSsEEEC2EmRKS1_RKS3_RKS7_>
0x0000000000400bf9 <main+53>:	lea    -0x13(%rbp),%rax
0x0000000000400bfd <main+57>:	mov    %rax,%rdi
0x0000000000400c00 <main+60>:	callq  0x400e96 <_ZNSaISt4pairIKSsSsEED2Ev>
0x0000000000400c05 <main+65>:	mov    $0x4016b2,%esi
0x0000000000400c0a <main+70>:	mov    $0x602c40,%edi
0x0000000000400c0f <main+75>:	callq  0x4009f0 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400c14 <main+80>:	mov    $0x400a40,%esi
0x0000000000400c19 <main+85>:	mov    %rax,%rdi
0x0000000000400c1c <main+88>:	callq  0x400a20 <_ZNSolsEPFRSoS_E@plt>
0x0000000000400c21 <main+93>:	mov    $0x0,%ebx
0x0000000000400c26 <main+98>:	lea    -0x50(%rbp),%rax
0x0000000000400c2a <main+102>:	mov    %rax,%rdi
0x0000000000400c2d <main+105>:	callq  0x400dba <_ZNSt13unordered_mapISsSsSt4hashISsESt8equal_toISsESaISt4pairIKSsSsEEED2Ev>
0x0000000000400c32 <main+110>:	mov    %ebx,%eax
0x0000000000400c34 <main+112>:	add    $0x48,%rsp
0x0000000000400c38 <main+116>:	pop    %rbx
0x0000000000400c39 <main+117>:	pop    %rbp
0x0000000000400c3a <main+118>:	retq   
0x0000000000400c3b <main+119>:	mov    %rax,%rbx
0x0000000000400c3e <main+122>:	lea    -0x13(%rbp),%rax
0x0000000000400c42 <main+126>:	mov    %rax,%rdi
0x0000000000400c45 <main+129>:	callq  0x400e96 <_ZNSaISt4pairIKSsSsEED2Ev>
0x0000000000400c4a <main+134>:	mov    %rbx,%rax
0x0000000000400c4d <main+137>:	mov    %rax,%rdi
0x0000000000400c50 <main+140>:	callq  0x400a80 <_Unwind_Resume@plt>
0x0000000000400c55 <main+145>:	mov    %rax,%rbx
0x0000000000400c58 <main+148>:	lea    -0x50(%rbp),%rax
0x0000000000400c5c <main+152>:	mov    %rax,%rdi
0x0000000000400c5f <main+155>:	callq  0x400dba <_ZNSt13unordered_mapISsSsSt4hashISsESt8equal_toISsESaISt4pairIKSsSsEEED2Ev>
0x0000000000400c64 <main+160>:	mov    %rbx,%rax
0x0000000000400c67 <main+163>:	mov    %rax,%rdi
0x0000000000400c6a <main+166>:	callq  0x400a80 <_Unwind_Resume@plt>
End of assembler dump.


If not then tell me the steps and I'll post what you need.
Comment 9 M.S. Babaei 2012-10-02 07:18:23 UTC
(In reply to comment #7)
> Created attachment 28312 [details]
> A patch with fixed ChangeLog

I'm at work right now. I'll build GCC with your patch and post the results later.
Comment 10 Uroš Bizjak 2012-10-02 09:59:39 UTC
(In reply to comment #7)
> Created attachment 28312 [details]
> A patch with fixed ChangeLog

The patch looks OK, but please introduce some #defines:

#define XCR_XFEATURE_ENABLED_MASK       0x0

#define XSTATE_FP       0x1
#define XSTATE_SSE      0x2
#define XSTATE_YMM      0x4

(Please see testsuite/gcc.target/i386/avx-os-support.h)
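
For illustration, here is a rough sketch of how defines like these might be used to query the running host. It is not the committed driver-i386.c code. The function name host_avx_enabled is hypothetical, the sketch assumes GCC or a compatible compiler that ships <cpuid.h>, and it simply returns 0 on non-x86 targets. XGETBV is emitted as raw bytes on the assumption that older assemblers may not know the mnemonic.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>
#define HAVE_X86_CPUID 1
#endif

#define XCR_XFEATURE_ENABLED_MASK       0x0

#define XSTATE_FP       0x1
#define XSTATE_SSE      0x2
#define XSTATE_YMM      0x4

/* Hypothetical helper: returns 1 if this host can run AVX code,
   i.e. the CPU has AVX and the OS enabled XMM and YMM state.  */
int host_avx_enabled(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                       /* CPUID leaf 1 unavailable */
    if (!(ecx & (1u << 28)))            /* AVX feature bit */
        return 0;
    if (!(ecx & (1u << 27)))            /* OSXSAVE: XGETBV would fault without it */
        return 0;

    /* xgetbv with ECX = XCR_XFEATURE_ENABLED_MASK, encoded as bytes. */
    __asm__ volatile (".byte 0x0f, 0x01, 0xd0"
                      : "=a" (eax), "=d" (edx)
                      : "c" (XCR_XFEATURE_ENABLED_MASK));

    return (eax & (XSTATE_SSE | XSTATE_YMM)) == (XSTATE_SSE | XSTATE_YMM);
#else
    return 0;                           /* not an x86 build: no AVX */
#endif
}
```

The OSXSAVE test has to come before the XGETBV instruction, because executing XGETBV when the OS has not set CR4.OSXSAVE raises #UD itself.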
Comment 11 M.S. Babaei 2012-10-02 16:58:57 UTC
Well, well, something happened here! This bug does not affect me anymore: with or without your patch, the above example code now works just fine. I even tried crypto++, which I had problems with in the past, and it works fine too.

# uname -a
FreeBSD 13x17.localhost 9.1-PRERELEASE FreeBSD 9.1-PRERELEASE #0: Sat Sep 29 20:53:22 IRST 2012     babaei@13x17.localhost:/usr/obj/usr/src/sys/GENERIC  amd64

Two days ago I upgraded my system to the 9-STABLE branch (a development branch). It looks like the FreeBSD folks have implemented AVX support in their kernel.

Fortunately, I kept the old kernel. I rebuilt GCC 4.7 with your patch (GCC 4.6.4 is my system-wide GCC and I won't mess with that one).
Then I booted the old kernel and rebuilt the above example code and, hell yeah! Your patch does the job: the program finished normally with 'Hello, World!' printed on screen. (Note: with the old kernel and without your patch, it still gets killed by SIGILL.)

Since this is a development branch that has not been released yet, I believe your patch is still relevant, because 9.0, 8.3 and 7.4 are still in use out there.

One more thing: your patch does not quite match my 'driver-i386.c' file (gcc-4.7-20120929), so the patch command failed to apply it. I merged your patch manually; the result is attached.
Comment 12 M.S. Babaei 2012-10-02 17:00:28 UTC
Created attachment 28327 [details]
After applying patch - gcc4.7
Comment 13 hjl@gcc.gnu.org 2012-10-02 19:49:05 UTC
Author: hjl
Date: Tue Oct  2 19:49:01 2012
New Revision: 191998

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191998
Log:
Check SSE and YMM state support for -march=native

2012-10-02  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/54741
	*  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
	SSE and YMM states aren't supported.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/driver-i386.c
Comment 14 hjl@gcc.gnu.org 2012-10-02 20:25:13 UTC
Author: hjl
Date: Tue Oct  2 20:25:04 2012
New Revision: 192003

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192003
Log:
Check SSE and YMM state support for -march=native

	Backported from mainline

	PR target/54741
	*  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
	SSE and YMM states aren't supported.

Modified:
    branches/gcc-4_7-branch/gcc/ChangeLog
    branches/gcc-4_7-branch/gcc/config/i386/driver-i386.c
Comment 15 hjl@gcc.gnu.org 2012-10-02 20:31:43 UTC
Author: hjl
Date: Tue Oct  2 20:31:40 2012
New Revision: 192004

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192004
Log:
Check SSE and YMM state support for -march=native

	Backported from mainline
	PR target/54741
	*  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(host_detect_local_cpu): Disable AVX, FMA, FMA4 and XOP if SSE
	and YMM states aren't supported.

Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/driver-i386.c
Comment 16 H.J. Lu 2012-10-02 20:32:39 UTC
Fixed.
Comment 17 Andrew W. Nosenko 2012-10-03 17:32:51 UTC
Sorry, but the committed patch does NOT fix the problem.
It does just the reverse: it disables AVX & Co. on systems that have the OSXSAVE, XSTATE_SSE and XSTATE_YMM bits set, while the intention was to disable them if any of these bits is not set.

Proposed followup patch is attached.
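
To make the difference concrete, the two conditions can be written side by side. This is only a sketch of the logic as described in this comment, not the text of either patch. Both function names are made up, and osxsave stands for the CPUID OSXSAVE bit as a 0/1 flag.

```c
#include <stdint.h>

#define XSTATE_SSE 0x2u
#define XSTATE_YMM 0x4u

/* Inverted logic as described above (illustrative only):
   AVX gets disabled exactly when the OS DOES support it.  */
int disable_avx_inverted(int osxsave, uint32_t xcr0_low)
{
    const uint32_t need = XSTATE_SSE | XSTATE_YMM;
    return osxsave && (xcr0_low & need) == need;
}

/* Intended logic: disable AVX if OSXSAVE is missing or if
   either the SSE or the YMM state bit is clear in XCR0.  */
int disable_avx_intended(int osxsave, uint32_t xcr0_low)
{
    const uint32_t need = XSTATE_SSE | XSTATE_YMM;
    return !osxsave || (xcr0_low & need) != need;
}
```

For every input the two functions return opposite answers, which is why a system with full AVX support (osxsave = 1, XCR0 = 0x7) would lose AVX under the inverted check.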
Comment 18 Andrew W. Nosenko 2012-10-03 17:36:28 UTC
Created attachment 28342 [details]
Followup patch for config/i386/driver-i386.c
Comment 19 Jakub Jelinek 2012-10-03 18:06:29 UTC
The patch looks good to me, but patches should be posted to the gcc-patches at gcc.gnu.org mailing list instead.