[PATCH] [8/9 Regression] i386: Add pass_remove_partial_avx_dependency
H.J. Lu
hjl.tools@gmail.com
Thu Feb 21 17:43:00 GMT 2019
On Thu, Feb 21, 2019 at 5:58 AM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> Hello,
>
> 2019-02-01 H.J. Lu <hongjiu.lu@intel.com>
> Hongtao Liu <hongtao.liu@intel.com>
> Sunil K Pandey <sunil.k.pandey@intel.com>
>
> PR target/87007
> * config/i386/i386-passes.def: Add
> pass_remove_partial_avx_dependency.
> * config/i386/i386-protos.h
> (make_pass_remove_partial_avx_dependency): New.
> * config/i386/i386.c (make_pass_remove_partial_avx_dependency):
> New function.
> (pass_data_remove_partial_avx_dependency): New.
> (pass_remove_partial_avx_dependency): Likewise.
> (make_pass_remove_partial_avx_dependency): Likewise.
> * config/i386/i386.md (partial_xmm_update): New attribute.
> (*extendsfdf2): Add partial_xmm_update.
> (truncdfsf2): Likewise.
> (*float<SWI48:mode><MODEF:mode>2): Likewise.
> (SF/DF conversion splitters): Disabled for TARGET_AVX.
>
> gcc/testsuite/
>
> 2019-02-01 H.J. Lu <hongjiu.lu@intel.com>
> Hongtao Liu <hongtao.liu@intel.com>
> Sunil K Pandey <sunil.k.pandey@intel.com>
>
> PR target/87007
> * gcc.target/i386/pr87007-1.c: New test.
> * gcc.target/i386/pr87007-2.c: Likewise.
>
>
> It seems to me that a more systematic way would be to use the mode
> switching pass, which uses the LCM framework, and possibly tweak LCM to
> do the right thing with respect to loops (an easy solution would be to
> lift insertion points to the dominators with smaller frequency even if
> there may be a path that does not execute the instruction needing the
> pxor).
>
> Teaching the LCM framework is however more intrusive than a
> self-contained minipass, and since the patch solves a regression and is
> self-contained, I guess we should go ahead with it for this release and
> look for more systematic solutions later.
>
> Patch is OK with the following change.
>
> +static unsigned int
> +remove_partial_avx_dependency (void)
> +{
> + timevar_push (TV_MACH_DEP);
> +
> + calculate_dominance_info (CDI_DOMINATORS);
> + df_set_flags (DF_DEFER_INSN_RESCAN);
> + df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> + df_md_add_problem ();
> + df_analyze ();
>
> Please delay the initialization until you hit the first instruction that
I changed it to:
  if (v4sf_const0)
    {
      calculate_dominance_info (CDI_DOMINATORS);
      df_set_flags (DF_DEFER_INSN_RESCAN);
      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
      df_md_add_problem ();
      df_analyze ();
      /* (Re-)discover loops so that bb->loop_father can be used in the
         analysis below. */
      loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
      /* Generate a vxorps at entry of the nearest dominator for basic
         blocks with conversions, which is in the fake loop that
         contains the whole function, so that there is only a single
         vxorps in the whole function. */
      bb = nearest_common_dominator_for_set (CDI_DOMINATORS,
                                             convert_bbs);
      while (bb->loop_father->latch
             != EXIT_BLOCK_PTR_FOR_FN (cfun))
        bb = get_immediate_dominator (CDI_DOMINATORS,
                                      bb->loop_father->header);
      insn = BB_HEAD (bb);
      if (!NONDEBUG_INSN_P (insn))
        insn = next_nonnote_nondebug_insn (insn);
      set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode));
      set_insn = emit_insn_before (set, insn);
      df_insn_rescan (set_insn);
      df_process_deferred_rescans ();
      loop_optimizer_finalize ();
    }
> needs processing. The pass is run unconditionally and in many functions
> it will do nothing. Can you also gate the pass to run only if AVX is
> enabled?
There is already a gate:
  virtual bool gate (function *)
    {
      return (TARGET_AVX
              && TARGET_SSE_PARTIAL_REG_DEPENDENCY
              && TARGET_SSE_MATH
              && optimize
              && optimize_function_for_speed_p (cfun));
    }
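For readers following the placement logic in the changed code above, the walk
out of nested loops can be illustrated with a toy model. This is only a sketch
and not GCC code: toy_loop, toy_block, example_cfg and hoist_out_of_loops are
hypothetical names, and an is_root flag stands in for the check of the loop
latch against EXIT_BLOCK_PTR_FOR_FN (cfun) that identifies the fake loop
containing the whole function.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's loop structure.  */
struct toy_loop
{
  int header;   /* Index of the loop header block, -1 for the root.  */
  int is_root;  /* Nonzero for the fake loop wrapping the function.  */
};

/* Hypothetical stand-in for GCC's basic_block.  */
struct toy_block
{
  int idom;                            /* Immediate dominator, -1 for entry.  */
  const struct toy_loop *loop_father;  /* Innermost enclosing loop.  */
};

/* Starting from block START, step to the immediate dominator of the
   enclosing loop's header until the block is outside every real loop.
   A header dominates its whole loop, so its immediate dominator lies
   outside that loop and the walk terminates.  */
int
hoist_out_of_loops (const struct toy_block *blocks, int start)
{
  int bb = start;
  while (!blocks[bb].loop_father->is_root)
    {
      int header = blocks[bb].loop_father->header;
      bb = blocks[header].idom;
      assert (bb >= 0);  /* A real loop header always has a dominator.  */
    }
  return bb;
}

/* Example CFG: 0 = entry, 1 = outer loop header, 2 = inner loop header,
   3 = inner loop body (holds the conversion), 4 = block after the loops.  */
static const struct toy_loop root_loop = { -1, 1 };
static const struct toy_loop outer_loop = { 1, 0 };
static const struct toy_loop inner_loop = { 2, 0 };

static const struct toy_block example_cfg[] = {
  { -1, &root_loop },
  { 0, &outer_loop },
  { 1, &inner_loop },
  { 2, &inner_loop },
  { 1, &root_loop },
};
```

In this model, hoist_out_of_loops (example_cfg, 3) moves from the inner loop
body to block 1 and then to block 0, so the single zeroing instruction would be
placed in the entry block rather than inside either loop.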
> Patch is OK with this change. Please wait a day for possible Uros' or RM
> reactions. Sorry for the delayed reaction.
> Honza
This is the updated patch I am going to check in tomorrow.
Thanks.
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-8-9-Regression-i386-Add-pass_remove_partial_avx_depe.patch
Type: text/x-patch
Size: 14989 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20190221/9e9286d5/attachment.bin>