Bug 40266 - march-native gives -mno-sse4, but cpuinfo sse4_1
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.4.0
Importance: P3 normal
Target Milestone: 4.4.1
Assignee: Not yet assigned to anyone
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-27 05:28 UTC by sean darcy
Modified: 2009-07-23 02:34 UTC

See Also:
Host: x86_64-redhat-linux
Target: x86_64-redhat-linux
Build: x86_64-redhat-linux
Known to work:
Known to fail:
Last reconfirmed: 2009-05-27 14:08:39

Description sean darcy 2009-05-27 05:28:58 UTC
If I run gcc -fverbose-asm -mtune=native -march=native -S x.c

I get
cat x.s:
       .file   "x.c"
# GNU C (GCC) version 4.4.0 20090506 (Red Hat 4.4.0-4) (x86_64-redhat-linux)
#       compiled by GNU C version 4.4.0 20090506 (Red Hat 4.4.0-4), GMP version 4.2.4, MPFR version 2.4.1.
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed:  x.c -march=core2 -mcx16 -msahf --param l1-cache-size=32
# --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=core2
# -fverbose-asm
# options enabled:  -falign-loops -fargument-alias
...................
# -mfp-ret-in-387 -mfused-madd -mglibc -mieee-fp -mmmx -mno-sse4
# -mpush-args -mred-zone -msahf -msse -msse2 -msse3 -mssse3
............

cat /proc/cpuinfo:

flags           : .....sse sse2 .... ssse3 .... sse4_1 ...

It appears that because the CPU does not support both SSE4.1 and SSE4.2, gcc -march=native disables SSE4 altogether (-mno-sse4). Passing -msse4.1 -mno-sse4.2 would be better.
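
For illustration only, here is a minimal sketch of the per-feature choice suggested above. It is not the code in config/i386/driver-i386.c; the helper name sse4_options and the macro names are hypothetical. The only hardware fact it relies on is that CPUID leaf 1 reports SSE4.1 in ECX bit 19 and SSE4.2 in ECX bit 20, so the two can be tested independently.

```c
#include <stdio.h>
#include <string.h>

/* CPUID leaf 1 reports SSE4.1 and SSE4.2 as separate ECX bits. */
#define CPUID1_ECX_SSE4_1 (1u << 19)
#define CPUID1_ECX_SSE4_2 (1u << 20)

/* Hypothetical helper (an assumption, not the GCC driver code):
   emit one option per feature into buf, so that a CPU with only
   SSE4.1 keeps -msse4.1 instead of losing SSE4 altogether.
   ecx is the ECX value returned by CPUID leaf 1. */
static const char *sse4_options(unsigned int ecx, char *buf, size_t len)
{
    snprintf(buf, len, "%s %s",
             (ecx & CPUID1_ECX_SSE4_1) ? "-msse4.1" : "-mno-sse4.1",
             (ecx & CPUID1_ECX_SSE4_2) ? "-msse4.2" : "-mno-sse4.2");
    return buf;
}
```

For a CPU like the one reported here (sse4_1 present, no SSE4.2), calling sse4_options with only bit 19 set yields "-msse4.1 -mno-sse4.2".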
Comment 1 Uroš Bizjak 2009-05-27 08:06:48 UTC
(In reply to comment #0)

> cat /proc/cpuinfo:
> 
> flags           : .....sse sse2 .... ssse3 .... sse4_1 ...

Please post complete /proc/cpuinfo.
Comment 2 H.J. Lu 2009-05-27 14:08:39 UTC
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01739.html
Comment 3 hjl@gcc.gnu.org 2009-05-27 14:39:43 UTC
Subject: Bug 40266

Author: hjl
Date: Wed May 27 14:39:23 2009
New Revision: 147913

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147913
Log:
2009-05-27  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/40266
	* config/i386/driver-i386.c (host_detect_local_cpu): Support
	AVX, SSE4, AES, PCLMUL and POPCNT.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/driver-i386.c

Comment 4 hjl@gcc.gnu.org 2009-05-27 14:54:18 UTC
Subject: Bug 40266

Author: hjl
Date: Wed May 27 14:54:00 2009
New Revision: 147914

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147914
Log:
2009-05-27  H.J. Lu  <hongjiu.lu@intel.com>

	Backport from mainline:
	2009-05-27  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/40266
	* config/i386/driver-i386.c (host_detect_local_cpu): Support
	AVX, SSE4, AES, PCLMUL and POPCNT.

Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/config/i386/driver-i386.c

Comment 5 H.J. Lu 2009-05-27 14:54:36 UTC
Fixed.
Comment 6 sean darcy 2009-05-27 15:10:37 UTC
(In reply to comment #1)
> (In reply to comment #0)
> 
> > cat /proc/cpuinfo:
> > 
> > flags           : .....sse sse2 .... ssse3 .... sse4_1 ...
> 
> Please post complete /proc/cpuinfo.
> 

It's a quad-core, so here is just cpu 0:

[root@intel64-office ffmpeg]# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 23
model name	: Intel(R) Core(TM)2 Quad CPU    Q8300  @ 2.50GHz
stepping	: 10
cpu MHz		: 2003.000
cache size	: 2048 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm sse4_1 xsave lahf_lm
bogomips	: 6000.06
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

Comment 7 sean darcy 2009-07-22 23:24:23 UTC
The target milestone was just changed to 4.4.2 from 4.4.1.

Why? Doesn't the fix from H.J.Lu below work?

Indeed, isn't the fix already in?

If so, shouldn't this be marked FIXED?

sean

Comment 8 H.J. Lu 2009-07-23 02:34:39 UTC
Fixed.