This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Vzeroupper placement/47440
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: gcc-patches at gcc dot gnu dot org
- Cc: Vladimir Yakovlev <vbyakovl23 at gmail dot com>
- Date: Sun, 4 Nov 2012 14:29:45 +0100
- Subject: Re: [PATCH] Vzeroupper placement/47440
Hello!
2012-11-04 Vladimir Yakovlev <vladimir.b.yakovlev@intel.com>
* mode-switching.c (create_pre_exit): Added code for
maybe_builtin_apply case.
* config/i386/i386-protos.h (emit_i387_cw_initialization): Deleted.
(emit_vzero): Added prototype.
(ix86_mode_entry): Likewise.
(ix86_mode_exit): Likewise.
(ix86_emit_mode_set): Likewise.
* config/i386/i386.c (VALID_AVX256_REG_OR_OI_MODE): New.
(typedef struct block_info_def): Deleted.
(define BLOCK_INFO): Deleted.
(check_avx256_stores): Added checking for MEM_P.
(move_or_delete_vzeroupper_2): Deleted.
(move_or_delete_vzeroupper_1): Deleted.
(move_or_delete_vzeroupper): Deleted.
(ix86_maybe_emit_epilogue_vzeroupper): Deleted.
(function_pass_avx256_p): Deleted.
(ix86_function_ok_for_sibcall): Deleted disabling sibcall.
(nit_cumulative_args): Deleted initialization of of avx256 fields of
cfun->machine.
(ix86_emit_restore_sse_regs_using_mov): Deleted vzeroupper generation.
(ix86_expand_epilogue): Likewise.
(is_vzeroupper): New.
(is_vzeroall): New.
(ix86_avx_u128_mode_needed): New.
(ix86_i387_mode_needed): Renamed ix86_mode_needed.
(ix86_mode_needed): New.
(ix86_avx_u128_mode_after): New.
(ix86_mode_after): New.
(ix86_avx_u128_mode_entry): New.
(ix86_mode_entry): New.
(ix86_avx_u128_mode_exit): New.
(ix86_mode_exit): New.
(ix86_emit_vzeroupper): New.
(ix86_emit_mode_set): New.
(ix86_expand_call): Deleted vzeroupper generation.
(ix86_split_call_vzeroupper): Deleted.
(ix86_init_machine_status): Initialzed optimize_mode_switching.
(ix86_expand_special_args_builtin): Changed.
(ix86_reorg): Deleted a call of move_or_delete_vzeroupper.
* config/i386/i386.h (AVX_U128): New.
(avx_u128_state): New.
(NUM_MODES_FOR_MODE_SWITCHING): Added AVX_U128_ANY.
(MODE_AFTER): New.
(MODE_ENTRY): New.
(MODE_EXIT): New.
(EMIT_MODE_SET): Changed.
(machine_function): Deleted avx256 fields.
* config/i386/i386.md (UNSPEC_CALL_NEEDS_VZEROUPPER): Deleted.
(define_insn_and_split "*call_vzeroupper"): Deleted.
(define_insn_and_split "*call_rex64_ms_sysv_vzeroupper"): Deleted.
(define_insn_and_split "*sibcall_vzeroupper"): Deleted.
(define_insn_and_split "*call_pop_vzeroupper"): Deleted.
(define_insn_and_split "*sibcall_pop_vzeroupper"): Deleted.
(define_insn_and_split "*call_value_vzeroupper"): Deleted.
(define_insn_and_split "*sibcall_value_vzeroupper"): Deleted.
(define_insn_and_split "*call_value_rex64_ms_sysv_vzeroupper"): Deleted.
(define_insn_and_split "*call_value_pop_vzeroupper"): Deleted.
(define_insn_and_split "*sibcall_value_pop_vzeroupper"): Deleted.
(define_expand "return"): Deleted vzeroupper emitting.
(define_expand "simple_return"): Deleted.
2012-11-04 Vladimir Yakovlev <vladimir.b.yakovlev@intel.com>
* gcc.target/i386/avx-vzeroupper-5.c: Changed scan-assembler-times.
gcc.target/i386/avx-vzeroupper-8.c: Likewise.
gcc.target/i386/avx-vzeroupper-9.c: Likewise.
gcc.target/i386/avx-vzeroupper-10.c: Likewise.
gcc.target/i386/avx-vzeroupper-11.c: Likewise.
gcc.target/i386/avx-vzeroupper-12.c: Likewise.
gcc.target/i386/avx-vzeroupper-19.c: Likewis.
gcc.target/i386/avx-vzeroupper-27.c: New.
Target part (without mode-switching.c change) is OK for mainline, with
a few small changes below:
+#define VALID_AVX256_REG_OR_OI_MODE(m) (VALID_AVX256_REG_MODE (m) ||
(m) == OImode)
enum upper_128bits_state
Put this definition in i386.h, after VALID_AVX256_REG_MODE.
+static void
+ix86_emit_vzeroupper (void)
+{
+ emit_insn (gen_avx_vzeroupper (GEN_INT (9)));
+}
No need to pass argument to vzeroupper anymore. We have only one
vzeroupper type now, so following definition in sse.md could also be
changed from:
(define_insn "avx_vzeroupper"
[(unspec_volatile [(match_operand 0 "const_int_operand")]
UNSPECV_VZEROUPPER)]
to:
(define_insn "avx_vzeroupper"
[(unspec_volatile [(const_int 0)]
UNSPECV_VZEROUPPER)]
Please call gen_avx_vzeroupper directly, so ix86_emit_vzeroupper
wrapper function can be simply deleted.
+/* Check insn for vzeroupper intrinsic. */
+
+static bool
+is_vzeroupper (rtx pat)
+{
+ return pat
+ && GET_CODE (pat) == UNSPEC_VOLATILE
+ && XINT (pat, 1) == UNSPECV_VZEROUPPER;
+}
+
+/* Check insn for vzeroall intrinsic. */
+
+static bool
+is_vzeroall (rtx pat)
+{
+ return pat
+ && GET_CODE (pat) == PARALLEL
+ && GET_CODE (XVECEXP (pat, 0, 0)) == UNSPEC_VOLATILE
+ && XINT (XVECEXP (pat, 0, 0), 1) == UNSPECV_VZEROALL;
+}
These should be put in predicates.md. This can be in a follow-up patch.
case VOID_FTYPE_VOID:
if (icode == CODE_FOR_avx_vzeroupper)
- target = GEN_INT (vzeroupper_intrinsic);
+ target = GEN_INT (9);
emit_insn (GEN_FCN (icode) (target));
return 0;
Please use:
case VOID_FTYPE_VOID:
emit_insn (GEN_FCN (icode) ());
return 0;
Otherwise other VOID_FTYPE_VOID patterns will get excessive argument.
-/* { dg-final { scan-assembler-not "avx_vzeroupper" } } */
+/* { dg-final { scan-assembler-times "avx_vzeroupper" 3 } } */
(... and a couple of similar testsuite changes ...)
These asm scans were put there for a reason. I assume you have looked
at these differences and are correct (this also implies that current
vzeroupper placement code is not optimal or even wrong).
I will split out the mode-switching part and re-post it to mailing
list with an explanation. After this change is approved, please commit
the patch to mainline SVN with requested changes.
Thanks,
Uros.