Summary: | Generate vzeroupper with -mavx512f -mno-avx512er -O2 | ||
---|---|---|---|
Product: | gcc | Reporter: | H.J. Lu <hjl.tools> |
Component: | target | Assignee: | Not yet assigned to anyone <unassigned> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | ubizjak |
Priority: | P3 | ||
Version: | 8.0 | ||
Target Milestone: | 8.0 | ||
Host: | | Target: | x86_64-*-*, i?86-*-* |
Build: | | Known to work: | |
Known to fail: | | Last reconfirmed: | 2017-11-10 00:00:00 |
Bug Depends on: | |||
Bug Blocks: | 82941 | ||
Attachments: | An untested patch (42583), An untested patch (42584) | ||
Description
H.J. Lu
2017-11-10 17:09:11 UTC
class pass_insert_vzeroupper : public rtl_opt_pass
{
public:
  pass_insert_vzeroupper(gcc::context *ctxt)
    : rtl_opt_pass(pass_data_insert_vzeroupper, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool gate (function *)
    {
      return TARGET_AVX && !TARGET_AVX512F
             && TARGET_VZEROUPPER && flag_expensive_optimizations
             && !optimize_size;
    }

  virtual unsigned int execute (function *)
    {
      return rest_of_handle_insert_vzeroupper ();
    }

}; // class pass_insert_vzeroupper

Created attachment 42583
An untested patch
(In reply to Uroš Bizjak from comment #1)
> return TARGET_AVX && !TARGET_AVX512F

Should !TARGET_AVX512F be changed to !TARGET_AVX512ER in gate function?

Created attachment 42584
An untested patch
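To make the gate discussion concrete, here is a minimal sketch that models the quoted gate condition and the `!TARGET_AVX512ER` variant suggested in comment #3 as plain predicates. The `TargetFlags` struct, the function names, and the two sample configurations are hypothetical stand-ins for illustration; GCC derives the real `TARGET_*` values from option masks, and the sketch assumes `-mavx512f` implies AVX and that `-O2` enables expensive optimizations.

```cpp
// Hypothetical model of the flags the gate reads; not GCC's data structure.
struct TargetFlags {
  bool avx;
  bool avx512f;
  bool avx512er;
  bool vzeroupper;
  bool expensive_optimizations;
  bool optimize_size;
};

// Gate as quoted in the description: any AVX512F target disables the pass.
constexpr bool gate_original(const TargetFlags &t) {
  return t.avx && !t.avx512f && t.vzeroupper
         && t.expensive_optimizations && !t.optimize_size;
}

// Gate with the change suggested in comment #3: only AVX512ER targets
// disable the pass.
constexpr bool gate_proposed(const TargetFlags &t) {
  return t.avx && !t.avx512er && t.vzeroupper
         && t.expensive_optimizations && !t.optimize_size;
}

// Assumed to correspond roughly to -mavx512f -mno-avx512er -O2.
constexpr TargetFlags avx512f_only{true, true, false, true, true, false};
// Assumed to correspond to a target that also enables AVX512ER.
constexpr TargetFlags avx512er_target{true, true, true, true, true, false};

// The original gate skips the pass for the configuration in the summary.
static_assert(!gate_original(avx512f_only), "original gate rejects AVX512F");
// The suggested gate runs the pass there, but still skips AVX512ER targets.
static_assert(gate_proposed(avx512f_only), "proposed gate accepts AVX512F");
static_assert(!gate_proposed(avx512er_target), "proposed gate rejects AVX512ER");
```

The checks are `static_assert`s, so compiling the snippet with any C++11 compiler is enough to verify them.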
(In reply to Uroš Bizjak from comment #3)
> (In reply to Uroš Bizjak from comment #1)
> > return TARGET_AVX && !TARGET_AVX512F
>
> Should !TARGET_AVX512F be changed to !TARGET_AVX512ER in gate function?

Yes, the untested patch is updated.

Patch has been sent:

https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01052.html

Author: speryt
Date: Wed Nov 15 12:27:31 2017
New Revision: 254763

URL: https://gcc.gnu.org/viewcvs?rev=254763&root=gcc&view=rev
Log:
Fix PR82941 and PR82942 by adding proper vzeroupper generation on SKX.

2017-11-15  Sebastian Peryt  <sebastian.peryt@intel.com>

gcc/
	PR target/82941
	PR target/82942
	* config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition
	to return true on Xeon and not on Xeon Phi.
	(ix86_check_avx256_register): Changed to ...
	(ix86_check_avx_upper_register): ... this. Add extra check for
	VALID_AVX512F_REG_OR_XI_MODE.
	(ix86_avx_u128_mode_needed): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_check_avx256_stores): Changed to ...
	(ix86_check_avx_upper_stores): ... this. Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_after): Changed
	avx_reg256_found to avx_upper_reg_found. Changed
	ix86_check_avx256_stores to ix86_check_avx_upper_stores.
	(ix86_avx_u128_mode_entry): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_exit): Ditto.
	* config/i386/i386.h: (host_detect_local_cpu): New define.

2017-11-15  Sebastian Peryt  <sebastian.peryt@intel.com>

gcc/testsuite/
	PR target/82941
	PR target/82942
	* gcc.target/i386/pr82941-1.c: New test.
	* gcc.target/i386/pr82941-2.c: New test.
	* gcc.target/i386/pr82942-1.c: New test.
	* gcc.target/i386/pr82942-2.c: New test.
Added:
    trunk/gcc/testsuite/gcc.target/i386/pr82941-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr82941-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr82942-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr82942-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/testsuite/ChangeLog

Author: speryt
Date: Mon Dec 4 11:03:37 2017
New Revision: 255378

URL: https://gcc.gnu.org/viewcvs?rev=255378&root=gcc&view=rev
Log:
Fix PR82941 and PR82942 by adding proper vzeroupper generation on SKX.

Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should
be inserted before a transfer of control flow out of the function. It is
turned on by default unless we are tuning for KNL. Users can always use
-mvzeroupper or -mno-vzeroupper to override X86_TUNE_EMIT_VZEROUPPER.

2017-12-04  Sebastian Peryt  <sebastian.peryt@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

gcc/
	Backported from trunk
	PR target/82941
	PR target/82942
	PR target/82990
	* config/i386/i386.c (pass_insert_vzeroupper): Remove
	TARGET_AVX512F check from gate condition.
	(ix86_check_avx256_register): Changed to ...
	(ix86_check_avx_upper_register): ... this. Add extra check for
	VALID_AVX512F_REG_OR_XI_MODE.
	(ix86_avx_u128_mode_needed): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_check_avx256_stores): Changed to ...
	(ix86_check_avx_upper_stores): ... this. Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_after): Changed
	avx_reg256_found to avx_upper_reg_found. Changed
	ix86_check_avx256_stores to ix86_check_avx_upper_stores.
	(ix86_avx_u128_mode_entry): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_exit): Ditto.
	(ix86_option_override_internal): Set MASK_VZEROUPPER if
	neither -mvzeroupper nor -mno-vzeroupper is used and
	TARGET_EMIT_VZEROUPPER is set.
	* config/i386/i386.h: (host_detect_local_cpu): New define.
	(TARGET_EMIT_VZEROUPPER): New.
	* config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.

2017-12-04  Sebastian Peryt  <sebastian.peryt@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

gcc/testsuite/
	Backported from trunk
	PR target/82941
	PR target/82942
	PR target/82990
	* gcc.target/i386/pr82941-1.c: New test.
	* gcc.target/i386/pr82941-2.c: Likewise.
	* gcc.target/i386/pr82942-1.c: Likewise.
	* gcc.target/i386/pr82942-2.c: Likewise.
	* gcc.target/i386/pr82990-1.c: Likewise.
	* gcc.target/i386/pr82990-2.c: Likewise.
	* gcc.target/i386/pr82990-3.c: Likewise.
	* gcc.target/i386/pr82990-4.c: Likewise.
	* gcc.target/i386/pr82990-5.c: Likewise.
	* gcc.target/i386/pr82990-6.c: Likewise.
	* gcc.target/i386/pr82990-7.c: Likewise.

Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82941-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82941-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82942-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82942-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-3.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-4.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-5.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-6.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr82990-7.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/i386.c
    branches/gcc-7-branch/gcc/config/i386/i386.h
    branches/gcc-7-branch/gcc/config/i386/x86-tune.def
    branches/gcc-7-branch/gcc/testsuite/ChangeLog

Author: speryt
Date: Mon Dec 4 11:40:44 2017
New Revision: 255379

URL: https://gcc.gnu.org/viewcvs?rev=255379&root=gcc&view=rev
Log:
Fix PR82941 and PR82942 by adding proper vzeroupper generation on SKX.

Add X86_TUNE_EMIT_VZEROUPPER to indicate if vzeroupper instruction should
be inserted before a transfer of control flow out of the function.
It is turned on by default unless we are tuning for KNL. Users can always use
-mvzeroupper or -mno-vzeroupper to override X86_TUNE_EMIT_VZEROUPPER.

2017-12-04  Sebastian Peryt  <sebastian.peryt@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

gcc/
	Backported from trunk
	PR target/82941
	PR target/82942
	PR target/82990
	* config/i386/i386.c (pass_insert_vzeroupper): Remove
	TARGET_AVX512F check from gate condition.
	(ix86_check_avx256_register): Changed to ...
	(ix86_check_avx_upper_register): ... this. Add extra check for
	VALID_AVX512F_REG_OR_XI_MODE.
	(ix86_avx_u128_mode_needed): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_check_avx256_stores): Changed to ...
	(ix86_check_avx_upper_stores): ... this. Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_after): Changed
	avx_reg256_found to avx_upper_reg_found. Changed
	ix86_check_avx256_stores to ix86_check_avx_upper_stores.
	(ix86_avx_u128_mode_entry): Changed
	ix86_check_avx256_register to ix86_check_avx_upper_register.
	(ix86_avx_u128_mode_exit): Ditto.
	(ix86_option_override_internal): Set MASK_VZEROUPPER if
	neither -mvzeroupper nor -mno-vzeroupper is used and
	TARGET_EMIT_VZEROUPPER is set.
	* config/i386/i386.h: (host_detect_local_cpu): New define.
	(TARGET_EMIT_VZEROUPPER): New.
	* config/i386/x86-tune.def: Add X86_TUNE_EMIT_VZEROUPPER.

gcc/testsuite/
	Backported from trunk
	PR target/82941
	PR target/82942
	PR target/82990
	* gcc.target/i386/pr82941-1.c: New test.
	* gcc.target/i386/pr82941-2.c: Likewise.
	* gcc.target/i386/pr82942-1.c: Likewise.
	* gcc.target/i386/pr82942-2.c: Likewise.
	* gcc.target/i386/pr82990-1.c: Likewise.
	* gcc.target/i386/pr82990-2.c: Likewise.
	* gcc.target/i386/pr82990-3.c: Likewise.
	* gcc.target/i386/pr82990-4.c: Likewise.
	* gcc.target/i386/pr82990-5.c: Likewise.
	* gcc.target/i386/pr82990-6.c: Likewise.
	* gcc.target/i386/pr82990-7.c: Likewise.
Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82941-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82941-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82942-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82942-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-1.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-2.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-3.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-4.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-5.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-6.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr82990-7.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/i386/i386.c
    branches/gcc-6-branch/gcc/config/i386/i386.h
    branches/gcc-6-branch/gcc/config/i386/x86-tune.def
    branches/gcc-6-branch/gcc/testsuite/ChangeLog

Fixed for GCC 8 and on GCC 6/7 branches.
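The backport logs describe how the default is resolved: the tuning flag X86_TUNE_EMIT_VZEROUPPER is on unless tuning for KNL, and it only takes effect when the user passed neither the enabling nor the disabling command-line option. A minimal sketch of that resolution follows. The `UserOption` enum and both function names are hypothetical; the actual logic lives in `ix86_option_override_internal` and works on option masks, so this is a model of the described behaviour and not the GCC code.

```cpp
// Hypothetical tri-state for the explicit command-line choice; GCC tracks
// this through option masks, not an enum.
enum class UserOption { Unset, Enabled, Disabled };

// Models X86_TUNE_EMIT_VZEROUPPER as described in the log: on by default
// unless tuning for KNL.
constexpr bool tune_emit_vzeroupper(bool tuning_for_knl) {
  return !tuning_for_knl;
}

// An explicit user choice always wins; otherwise the tuning flag decides.
constexpr bool vzeroupper_enabled(UserOption user, bool tuning_for_knl) {
  return user == UserOption::Unset ? tune_emit_vzeroupper(tuning_for_knl)
                                   : user == UserOption::Enabled;
}

// No explicit option: follow the tuning default.
static_assert(vzeroupper_enabled(UserOption::Unset, false), "non-KNL default is on");
static_assert(!vzeroupper_enabled(UserOption::Unset, true), "KNL default is off");
// Explicit option: overrides the tuning default in both directions.
static_assert(vzeroupper_enabled(UserOption::Enabled, true), "user can force on");
static_assert(!vzeroupper_enabled(UserOption::Disabled, false), "user can force off");
```

Keeping the tuning default separate from the user's explicit choice is what lets the gate drop its AVX512 check: the per-CPU decision moves into the tuning table and out of the pass itself.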