95400 – -march=native and -march=icelake-client produce different results on icelake client

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target
Version:	9.2.1

Importance:	P3 normal
Target Milestone:	8.5
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-05-29 04:06 UTC by Travis Downs
Modified:	2020-06-13 22:58 UTC
CC List:	4 users

See Also:
Host:
Target:	x86_64--
Build:
Known to work:	11.0
Known to fail:
Last reconfirmed:	2020-05-29 00:00:00

Description Travis Downs 2020-05-29 04:06:02 UTC

On an Ice Lake client machine, using -O3 -march=native produces 512-bit AVX-512 instructions, whereas -O3 -march=icelake-client produces 256-bit instructions.

Since this machine *is* Ice Lake client, I would expect both options to do the same thing.

Comment 1 Martin Liška 2020-05-29 07:18:39 UTC

Can you please provide output of:

$ gcc -march=native -c /tmp/foo.c --verbose
?

Comment 2 H.J. Lu 2020-05-29 11:19:07 UTC

[hjl@gnu-icl-1 gcc]$ head /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 126
model name	: Intel(R) Core(TM) i7-1065G7 CPU @ 1.30GHz
stepping	: 5
microcode	: 0x78
cpu MHz		: 705.337
cache size	: 8192 KB
physical id	: 0
[hjl@gnu-icl-1 gcc]$ gcc -v -S -v -march=native x.i 
Using built-in specs.
COLLECT_GCC=gcc
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --with-multilib-list=m32,m64,mx32 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.1.1 20200507 (Red Hat 10.1.1-1) (GCC) 
COLLECT_GCC_OPTIONS='-v' '-S' '-v' '-march=native'
 /usr/libexec/gcc/x86_64-redhat-linux/10/cc1 -fpreprocessed x.i -march=icelake-client -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -msha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -msgx -mbmi2 -mno-pconfig -mno-wbnoinvd -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mavx512f -mno-avx512er -mavx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mpku -mrdpid -mgfni -mno-shstk -mavx512vbmi2 -mavx512vnni -mvaes -mvpclmulqdq -mavx512bitalg -mno-movdiri -mno-movdir64b -mno-waitpkg -mno-cldemote -mno-ptwrite -mno-avx512bf16 -mno-enqcmd -mno-avx512vp2intersect --param l1-cache-size=48 --param l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=generic -quiet -dumpbase x.i -auxbase x -version -o x.s
GNU C17 (GCC) version 10.1.1 20200507 (Red Hat 10.1.1-1) (x86_64-redhat-linux)
	compiled by GNU C version 10.1.1 20200507 (Red Hat 10.1.1-1), GMP version 6.1.2, MPFR version 4.0.2-p7, MPC version 1.1.0, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C17 (GCC) version 10.1.1 20200507 (Red Hat 10.1.1-1) (x86_64-redhat-linux)
	compiled by GNU C version 10.1.1 20200507 (Red Hat 10.1.1-1), GMP version 6.1.2, MPFR version 4.0.2-p7, MPC version 1.1.0, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 24c549deebb977f25c26f8c5acf76cc6
COMPILER_PATH=/usr/libexec/gcc/x86_64-redhat-linux/10/:/usr/libexec/gcc/x86_64-redhat-linux/10/:/usr/libexec/gcc/x86_64-redhat-linux/:/usr/lib/gcc/x86_64-redhat-linux/10/:/usr/lib/gcc/x86_64-redhat-linux/
LIBRARY_PATH=/usr/lib/gcc/x86_64-redhat-linux/10/:/usr/lib/gcc/x86_64-redhat-linux/10/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/gcc/x86_64-redhat-linux/10/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-S' '-v' '-march=native'
[hjl@gnu-icl-1 gcc]$ ./xgcc -B./ -v -S -v -march=native x.i 
Reading specs from ./specs
COLLECT_GCC=./xgcc
Target: x86_64-pc-linux-gnu
Configured with: /home/hjl/work/git/gitlab/x86-gcc/configure --disable-bootstrap --with-demangler-in-ld --prefix=/usr/gcc-11.0.0-x86-64 --with-local-prefix=/usr/local --enable-gnu-indirect-function --enable-clocale=gnu --with-system-zlib --with-target-system-zlib --with-fpmath=sse --disable-libcc1 --disable-libcilkrts --disable-libsanitizer --disable-libmpx --enable-languages=c,c++ : (reconfigured) /home/hjl/work/git/gitlab/x86-gcc/configure --disable-bootstrap --with-demangler-in-ld --prefix=/usr/gcc-11.0.0-x86-64 --with-local-prefix=/usr/local --enable-gnu-indirect-function --enable-clocale=gnu --with-system-zlib --with-target-system-zlib --with-fpmath=sse --disable-libcc1 --disable-libcilkrts --disable-libsanitizer --disable-libmpx CC=cc CFLAGS=-g CXX=g++ CXXFLAGS=-g --enable-languages=c,c++,lto --no-create --no-recursion
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.0 20200522 (experimental) (GCC) 
COLLECT_GCC_OPTIONS='-B' './' '-v' '-S' '-v' '-march=native'
 ./cc1 -fpreprocessed x.i -march=icelake-client -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -mno-sse4a -mcx16 -msahf -mmovbe -maes -msha -mpclmul -mpopcnt -mabm -mno-lwp -mfma -mno-fma4 -mno-xop -mbmi -msgx -mbmi2 -mno-pconfig -mno-wbnoinvd -mno-tbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mrdseed -mprfchw -madx -mfxsr -mxsave -mxsaveopt -mavx512f -mno-avx512er -mavx512cd -mno-avx512pf -mno-prefetchwt1 -mclflushopt -mxsavec -mxsaves -mavx512dq -mavx512bw -mavx512vl -mavx512ifma -mavx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mrdpid -mgfni -mno-shstk -mavx512vbmi2 -mavx512vnni -mvaes -mvpclmulqdq -mavx512bitalg -mavx512vpopcntdq -mno-movdiri -mno-movdir64b -mno-waitpkg -mno-cldemote -mno-ptwrite -mno-avx512bf16 -mno-enqcmd -mno-avx512vp2intersect -mno-serialize -mno-tsxldtrk --param l1-cache-size=48 --param l1-cache-line-size=64 --param l2-cache-size=8192 -mtune=icelake-client -quiet -dumpbase x.i -auxbase x -version -o x.s
GNU C17 (GCC) version 11.0.0 20200522 (experimental) (x86_64-pc-linux-gnu)
	compiled by GNU C version 10.1.1 20200507 (Red Hat 10.1.1-1), GMP version 6.1.2, MPFR version 4.0.2-p7, MPC version 1.1.0, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
GNU C17 (GCC) version 11.0.0 20200522 (experimental) (x86_64-pc-linux-gnu)
	compiled by GNU C version 10.1.1 20200507 (Red Hat 10.1.1-1), GMP version 6.1.2, MPFR version 4.0.2-p7, MPC version 1.1.0, isl version isl-0.16.1-GMP

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: 11f4b64fb9bc00bb503ef65b7f37bf0a
COMPILER_PATH=./
LIBRARY_PATH=./:/lib/../lib64/:/usr/lib/../lib64/:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-B' './' '-v' '-S' '-v' '-march=native'
[hjl@gnu-icl-1 gcc]$

Comment 3 H.J. Lu 2020-05-29 11:24:44 UTC

Fixed on master by

commit d83e28f47f5467b435667122add2aa9730e1a89b
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Mon May 18 05:35:27 2020 -0700

    x86: Update Intel processor detection
    
    Add cpu model numbers for Intel Airmont, Tremont, Comet Lake, Ice Lake
    and Tiger Lake processor families.
    
            * config/i386/driver-i386.c (host_detect_local_cpu): Support
            Intel Airmont, Tremont, Comet Lake, Ice Lake and Tiger Lake
            processor families.

Comment 4 Martin Liška 2020-06-01 08:42:53 UTC

Can we backport the change to active branches?

Comment 5 Hongtao.liu 2020-06-05 04:37:45 UTC

(In reply to Martin Liška from comment #4)
> Can we backport the change to active branches?

Backport to GCC 9 and GCC 10.
Partially backport to GCC 8 (drop the Tremont and Tiger Lake parts).

Comment 6 H.J. Lu 2020-06-13 22:58:40 UTC

Fixed in GCC 11, GCC 10.2 (r10-8246), GCC 9.4 (r9-8652) and GCC 8.5 (r8-10298).