[PATCH V2 00/10] Initial support for AVX512FP16

liuhongt hongtao.liu@intel.com
Wed Jul 21 07:43:37 GMT 2021


Hi:
  As discussed in [1], this patch support _Float16 under target sse2
and above, w/o avx512fp16, _Float16 type is storage only, all operations
are emulated by soft-fp and float instructions. Soft-fp keeps the intermediate
result of the operation at 32-bit precision by defaults, which may lead to
inconsistent behavior between soft-fp and avx512fp16 instructions, using option
-fexcess-precision=standard will force round back after every operation.
 
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574112.html

There's 10 patches in this series:

1)  Update hf soft-fp from glibc.
2)  [i386] Enable _Float16 type for TARGET_SSE2 and above.
3)  [i386] libgcc: Enable hfmode soft-sf/df/xf/tf extensions and
    truncations.
4) AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16
instructions.
5) AVX512FP16: Support vector init/broadcast/set/extract for FP16.
6) AVX512FP16: Add testcase for vector init and broadcast intrinsics.
7) AVX512FP16: Add tests for vector passing in variable arguments.
8) AVX512FP16: Add ABI tests for xmm.
9) AVX512FP16: Add ABI test for ymm.
10) AVX512FP16: Add abi test for zmm

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,} on CLX.
  Boostrappped and regtested on x86_64-linux-gnu{-m32\ -march=native,\ -march=native} on SPR.
  Pass 300+ new tests under gcc.dg/torture/*float16*
  
  On SPR, there're regressions related to FLT_EVAL_METHODS for pr69225-[1234567].c
 since TARGET_AVX512FP16 will set FLT_EVAL_MATHOD as FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16.

 gcc/common/config/i386/cpuinfo.h              |    2 +
 gcc/common/config/i386/i386-common.c          |   26 +-
 gcc/common/config/i386/i386-cpuinfo.h         |    1 +
 gcc/common/config/i386/i386-isas.h            |    1 +
 gcc/config.gcc                                |    2 +-
 gcc/config/i386/avx512fp16intrin.h            |  225 ++++
 gcc/config/i386/cpuid.h                       |    1 +
 gcc/config/i386/i386-builtin-types.def        |    7 +-
 gcc/config/i386/i386-builtins.c               |   23 +
 gcc/config/i386/i386-c.c                      |    2 +
 gcc/config/i386/i386-expand.c                 |  129 +-
 gcc/config/i386/i386-isa.def                  |    1 +
 gcc/config/i386/i386-modes.def                |   13 +-
 gcc/config/i386/i386-options.c                |    4 +-
 gcc/config/i386/i386.c                        |  238 +++-
 gcc/config/i386/i386.h                        |   28 +-
 gcc/config/i386/i386.md                       |  304 ++++-
 gcc/config/i386/i386.opt                      |    4 +
 gcc/config/i386/immintrin.h                   |    4 +
 gcc/config/i386/sse.md                        |  395 ++++--
 gcc/doc/extend.texi                           |   16 +
 gcc/doc/invoke.texi                           |   10 +-
 gcc/lto/lto-lang.c                            |    3 +
 gcc/optabs-query.c                            |   10 +-
 gcc/testsuite/g++.dg/other/i386-2.C           |    2 +-
 gcc/testsuite/g++.dg/other/i386-3.C           |    2 +-
 gcc/testsuite/g++.target/i386/float16-1.C     |    8 +
 gcc/testsuite/g++.target/i386/float16-2.C     |   14 +
 gcc/testsuite/g++.target/i386/float16-3.C     |   10 +
 gcc/testsuite/gcc.target/i386/avx-1.c         |    2 +-
 gcc/testsuite/gcc.target/i386/avx-2.c         |    2 +-
 gcc/testsuite/gcc.target/i386/avx512-check.h  |    3 +
 .../gcc.target/i386/avx512fp16-10a.c          |   14 +
 .../gcc.target/i386/avx512fp16-10b.c          |   25 +
 .../gcc.target/i386/avx512fp16-12a.c          |   21 +
 .../gcc.target/i386/avx512fp16-12b.c          |   27 +
 gcc/testsuite/gcc.target/i386/avx512fp16-1a.c |   24 +
 gcc/testsuite/gcc.target/i386/avx512fp16-1b.c |   32 +
 gcc/testsuite/gcc.target/i386/avx512fp16-1c.c |   26 +
 gcc/testsuite/gcc.target/i386/avx512fp16-1d.c |   33 +
 gcc/testsuite/gcc.target/i386/avx512fp16-1e.c |   30 +
 gcc/testsuite/gcc.target/i386/avx512fp16-2a.c |   28 +
 gcc/testsuite/gcc.target/i386/avx512fp16-2b.c |   33 +
 gcc/testsuite/gcc.target/i386/avx512fp16-2c.c |   36 +
 gcc/testsuite/gcc.target/i386/avx512fp16-3a.c |   36 +
 gcc/testsuite/gcc.target/i386/avx512fp16-3b.c |   35 +
 gcc/testsuite/gcc.target/i386/avx512fp16-3c.c |   40 +
 gcc/testsuite/gcc.target/i386/avx512fp16-4.c  |   31 +
 gcc/testsuite/gcc.target/i386/avx512fp16-5.c  |  133 ++
 gcc/testsuite/gcc.target/i386/avx512fp16-6.c  |   57 +
 gcc/testsuite/gcc.target/i386/avx512fp16-7.c  |   86 ++
 gcc/testsuite/gcc.target/i386/avx512fp16-8.c  |   53 +
 gcc/testsuite/gcc.target/i386/avx512fp16-9a.c |   27 +
 gcc/testsuite/gcc.target/i386/avx512fp16-9b.c |   49 +
 .../gcc.target/i386/avx512fp16-vararg-1.c     |  122 ++
 .../gcc.target/i386/avx512fp16-vararg-2.c     |  107 ++
 .../gcc.target/i386/avx512fp16-vararg-3.c     |  114 ++
 .../gcc.target/i386/avx512fp16-vararg-4.c     |  115 ++
 .../gcc.target/i386/avx512fp16-vec_set_var.c  |   30 +
 gcc/testsuite/gcc.target/i386/float16-3a.c    |   10 +
 gcc/testsuite/gcc.target/i386/float16-3b.c    |   10 +
 gcc/testsuite/gcc.target/i386/float16-4a.c    |   10 +
 gcc/testsuite/gcc.target/i386/float16-4b.c    |   10 +
 gcc/testsuite/gcc.target/i386/funcspec-56.inc |    2 +
 gcc/testsuite/gcc.target/i386/m512-check.h    |   38 +-
 gcc/testsuite/gcc.target/i386/pr54855-12.c    |   14 +
 gcc/testsuite/gcc.target/i386/pr54855-13.c    |   14 +
 gcc/testsuite/gcc.target/i386/sse-13.c        |    2 +-
 gcc/testsuite/gcc.target/i386/sse-14.c        |    2 +-
 gcc/testsuite/gcc.target/i386/sse-22.c        |    4 +-
 gcc/testsuite/gcc.target/i386/sse-23.c        |    2 +-
 .../gcc.target/i386/sse2-float16-1.c          |    8 +
 .../gcc.target/i386/sse2-float16-2.c          |   16 +
 .../gcc.target/i386/sse2-float16-3.c          |   12 +
 .../abi/avx512fp16/abi-avx512fp16-xmm.exp     |   48 +
 .../gcc.target/x86_64/abi/avx512fp16/args.h   |  190 +++
 .../x86_64/abi/avx512fp16/asm-support.S       |   81 ++
 .../x86_64/abi/avx512fp16/avx512fp16-check.h  |   74 ++
 .../abi/avx512fp16/avx512fp16-xmm-check.h     |    3 +
 .../x86_64/abi/avx512fp16/defines.h           |  150 +++
 .../avx512fp16/m256h/abi-avx512fp16-ymm.exp   |   45 +
 .../x86_64/abi/avx512fp16/m256h/args.h        |  182 +++
 .../x86_64/abi/avx512fp16/m256h/asm-support.S |   81 ++
 .../avx512fp16/m256h/avx512fp16-ymm-check.h   |    3 +
 .../avx512fp16/m256h/test_m256_returning.c    |   54 +
 .../abi/avx512fp16/m256h/test_passing_m256.c  |  370 ++++++
 .../avx512fp16/m256h/test_passing_structs.c   |  113 ++
 .../avx512fp16/m256h/test_passing_unions.c    |  337 ++++++
 .../abi/avx512fp16/m256h/test_varargs-m256.c  |  160 +++
 .../avx512fp16/m512h/abi-avx512fp16-zmm.exp   |   48 +
 .../x86_64/abi/avx512fp16/m512h/args.h        |  186 +++
 .../x86_64/abi/avx512fp16/m512h/asm-support.S |   97 ++
 .../avx512fp16/m512h/avx512fp16-zmm-check.h   |    4 +
 .../avx512fp16/m512h/test_m512_returning.c    |   62 +
 .../abi/avx512fp16/m512h/test_passing_m512.c  |  380 ++++++
 .../avx512fp16/m512h/test_passing_structs.c   |  123 ++
 .../avx512fp16/m512h/test_passing_unions.c    |  415 +++++++
 .../abi/avx512fp16/m512h/test_varargs-m512.c  |  164 +++
 .../gcc.target/x86_64/abi/avx512fp16/macros.h |   53 +
 .../test_3_element_struct_and_unions.c        |  692 +++++++++++
 .../abi/avx512fp16/test_basic_alignment.c     |   45 +
 .../test_basic_array_size_and_align.c         |   43 +
 .../abi/avx512fp16/test_basic_returning.c     |   87 ++
 .../x86_64/abi/avx512fp16/test_basic_sizes.c  |   43 +
 .../test_basic_struct_size_and_align.c        |   42 +
 .../test_basic_union_size_and_align.c         |   40 +
 .../abi/avx512fp16/test_complex_returning.c   |  104 ++
 .../abi/avx512fp16/test_m64m128_returning.c   |   73 ++
 .../abi/avx512fp16/test_passing_floats.c      | 1066 +++++++++++++++++
 .../abi/avx512fp16/test_passing_m64m128.c     |  510 ++++++++
 .../abi/avx512fp16/test_passing_structs.c     |  332 +++++
 .../abi/avx512fp16/test_passing_unions.c      |  335 ++++++
 .../abi/avx512fp16/test_struct_returning.c    |  274 +++++
 .../x86_64/abi/avx512fp16/test_varargs-m128.c |  164 +++
 gcc/testsuite/lib/target-supports.exp         |   13 +-
 libgcc/config.host                            |    5 +-
 libgcc/config/i386/32/sfp-machine.h           |    1 +
 libgcc/config/i386/64/sfp-machine.h           |    1 +
 libgcc/config/i386/64/t-softfp                |    1 +
 libgcc/config/i386/sfp-machine.h              |    1 +
 libgcc/config/i386/t-softfp                   |    5 +
 libgcc/soft-fp/eqhf2.c                        |   49 +
 libgcc/soft-fp/extendhfdf2.c                  |   53 +
 libgcc/soft-fp/extendhfsf2.c                  |   49 +
 libgcc/soft-fp/half.h                         |    1 +
 libgcc/soft-fp/truncdfhf2.c                   |   52 +
 libgcc/soft-fp/truncsfhf2.c                   |   48 +
 127 files changed, 10324 insertions(+), 238 deletions(-)
 create mode 100644 gcc/config/i386/avx512fp16intrin.h
 create mode 100644 gcc/testsuite/g++.target/i386/float16-1.C
 create mode 100644 gcc/testsuite/g++.target/i386/float16-2.C
 create mode 100644 gcc/testsuite/g++.target/i386/float16-3.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-10a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-10b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-12a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-12b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-1c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-1d.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-1e.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-9a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-9b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vararg-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vararg-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vararg-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vararg-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-vec_set_var.c
 create mode 100644 gcc/testsuite/gcc.target/i386/float16-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/float16-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/float16-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/float16-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-float16-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-float16-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-float16-3.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/abi-avx512fp16-xmm.exp
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/args.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/asm-support.S
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/avx512fp16-check.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/avx512fp16-xmm-check.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/defines.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/abi-avx512fp16-ymm.exp
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/args.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/asm-support.S
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/avx512fp16-ymm-check.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_m256_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_m256.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_structs.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_passing_unions.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m256h/test_varargs-m256.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/abi-avx512fp16-zmm.exp
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/args.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/asm-support.S
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/avx512fp16-zmm-check.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_m512_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_m512.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_structs.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_passing_unions.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/m512h/test_varargs-m512.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/macros.h
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_3_element_struct_and_unions.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_alignment.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_array_size_and_align.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_sizes.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_struct_size_and_align.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_basic_union_size_and_align.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_complex_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_m64m128_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_passing_floats.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_passing_m64m128.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_passing_structs.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_passing_unions.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_struct_returning.c
 create mode 100644 gcc/testsuite/gcc.target/x86_64/abi/avx512fp16/test_varargs-m128.c
 create mode 100644 libgcc/config/i386/64/t-softfp
 create mode 100644 libgcc/soft-fp/eqhf2.c
 create mode 100644 libgcc/soft-fp/extendhfdf2.c
 create mode 100644 libgcc/soft-fp/extendhfsf2.c
 create mode 100644 libgcc/soft-fp/truncdfhf2.c
 create mode 100644 libgcc/soft-fp/truncsfhf2.c

-- 
2.18.1



More information about the Gcc-patches mailing list