[PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

Hongtao Liu crazylht@gmail.com
Thu Sep 3 02:11:14 GMT 2020


On Wed, Sep 2, 2020 at 5:58 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Wed, Sep 02, 2020 at 09:57:08AM +0800, Hongtao Liu via Gcc-patches wrote:
> > +
> > +      first = XVECEXP (constant, 0, 0);
> > +      /* There could be some rtx like
> > +      (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC1")))
> > +      but with "*.LC1" refer to V2DI constant vector.  */
> > +      if (GET_MODE (constant) != mode)
> > +     {
> > +       constant = simplify_subreg (mode, constant, GET_MODE (constant), 0);
> > +       if (constant == NULL_RTX || GET_CODE (constant) != CONST_VECTOR)
> > +         return;
> > +     }
>
> The
>       first = XVECEXP (constant, 0, 0);
> line needs to be after this if, not before it, otherwise it will miscompile
> things or just ICE.
>

Changed.

> > @@ -2197,6 +2272,10 @@ remove_partial_avx_dependency (void)
> >         if (!NONDEBUG_INSN_P (insn))
> >           continue;
> >
> > +       /* Hanlde AVX512 embedded broadcast here to save compile time.  */
>
> s/Hanlde/Handle/
>

Changed, sorry for the typo.

> > +  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> > +    {
> > +      if (!INSN_P (insn))
> > +     continue;
> > +      replace_constant_pool_with_broadcast (insn);
> > +    }
>
> Perhaps instead do:
>   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
>     if (INSN_P (insn))
>       replace_constant_pool_with_broadcast (insn);
> ?
>

Changed.

> > +  /* opt_pass methods: */
> > +  virtual bool gate (function *)
> > +    {
> > +      /* Return false if rpad pass gate is true.
> > +      replace_constant_pool_with_broadcast is called
> > +      from both this pass and rpad pass.  */
> > +      return (TARGET_AVX512F
> > +           && !(TARGET_AVX
> > +                && TARGET_SSE_PARTIAL_REG_DEPENDENCY
> > +                && TARGET_SSE_MATH
> > +                && optimize
> > +                && optimize_function_for_speed_p (cfun)));
>
> I think this could be a maintainance nightmare.
> Perhaps instead add
>

Yes, a common interface should be added as bellow, changed.

> static bool
> remove_partial_avx_dependency_gate ()
> {
>   return (TARGET_AVX
>           && TARGET_SSE_PARTIAL_REG_DEPENDENCY
>           && TARGET_SSE_MATH
>           && optimize
>           && optimize_function_for_speed_p (cfun));
> }
> after the remove_partial_avx_dependency function definition,
> change pass_remove_partial_avx_dependency gate body to
>       return remove_partial_avx_dependency_gate ();
> and in pass_constant_pool_broadcast::gate do
>       return (TARGET_AVX512F && !remove_partial_avx_dependency_gate ();
> (with the comment you have there)?
>
> LGTM with those changes.
>
>         Jakub
>

Thanks for the review, update patch.

-- 
BR,
Hongtao
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Optimize-memory-broadcast-for-constant-vector-under-_V5.patch
Type: text/x-patch
Size: 39244 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200903/dada2902/attachment-0001.bin>


More information about the Gcc-patches mailing list