[RFC, Instruction Scheduler, Stage1] New hook/code to perform fusion of dependent instructions

Richard Biener richard.guenther@gmail.com
Wed Apr 8 07:46:02 GMT 2020


On Tue, Apr 7, 2020 at 10:45 PM Pat Haugen via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The Power processor has the ability to fuse certain pairs of dependent
> instructions to improve their performance if they appear back-to-back in
> the instruction stream. In looking at the current support for
> instruction fusion in GCC I saw the following 2 options.
>
> 1) TARGET_SCHED_MACRO_FUSION target hooks: Only looks at existing
> back-to-back instructions and will ensure the scheduler keeps them together.

The i386 backend uses this for compare + branch fusion.

> 2) -fsched-fusion/TARGET_SCHED_FUSION_PRIORITY: Runs as a separate
> scheduling pass before peephole2. Operates independently on a single
> insn. Used by ARM backend to assign higher priorities to base/disp loads
> and stores so that the scheduling pass will schedule loads/stores to
> adjacent memory back-to-back. Later these insns will be transformed into
> load/store pair insns.
>
> Neither of these work for Power's purpose because they don't deal with
> fusion of dependent insns that may not already be back-to-back. The
> TARGET_SCHED_REORDER[2] hooks also don't work since the dependent insn
> more than likely gets queued for N cycles so wouldn't be on the ready
> list for the reorder hooks to process. We want the ability for the
> scheduler to schedule dependent insn pairs back-to-back when possible
> (i.e. other dependencies of both insns have been satisfied).
>
> I have coded up a proof of concept that implements our needs via a new
> target hook. The hook is passed a pair of dependent insns and returns if
> they are a fusion candidate. It is called while removing the forward
> dependencies of the just scheduled insn. If a dependent insn becomes
> available to schedule and it's a fusion candidate with the just
> scheduled insn, then the new code moves it to the ready list (if
> necessary) and marks it as SCHED_GROUP (piggy-backing on the existing
> code used by TARGET_SCHED_MACRO_FUSION) to make sure the fusion
> candidate will be scheduled next. Following is the scheduling part of
> the diff. Does this sound like a feasible approach? I welcome any
> comments/discussion.

It would be nice test if the i386 use of TARGET_SCHED_MACRO_FUSION hooks
for compare + branch fusion can be transitioned to that scheme.  I think your
proposal doesn't improve things there since we can never schedule
branches earlier
but we could schedule the compare later.  But your proposal only
affects scheduling
of later insns, not the order of insns getting onto the ready list?

Richard.

> Thanks,
> Pat
>
>
> diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
> index 80687fb5359..7a62136d497 100644
> --- a/gcc/haifa-sched.c
> +++ b/gcc/haifa-sched.c
> @@ -4152,6 +4152,39 @@ schedule_insn (rtx_insn *insn)
>               && SCHED_GROUP_P (next)
>               && advance < effective_cost)
>             advance = effective_cost;
> +
> +         /* If all dependencies of this insn have now been resolved and the
> +            just scheduled insn was not part of a SCHED_GROUP, check if
> +            this dependent insn can be fused with the just scheduled insn.  */
> +         else if (effective_cost >= 0 && !SCHED_GROUP_P (insn)
> +                  && targetm.sched.dep_fusion
> +                  && targetm.sched.dep_fusion (insn, next))
> +           {
> +             /* Move to ready list if necessary.  */
> +             if (effective_cost > 0)
> +               {
> +                 queue_remove (next);
> +                 ready_add (&ready, next, true);
> +               }
> +
> +             /* Mark as sched_group.  */
> +             SCHED_GROUP_P (next) = 1;
> +
> +             /* Fix insn_tick.  */
> +             INSN_TICK (next) = INSN_TICK (insn);
> +
> +             /* Dump some debug output for success.  */
> +             if (sched_verbose >= 5)
> +               {
> +                 fprintf (sched_dump, ";;\t\tFusing dependent insns: ");
> +                 fprintf (sched_dump, "%4d %-30s --> ", INSN_UID (insn),
> +                          str_pattern_slim (PATTERN (insn)));
> +                 fprintf (sched_dump, "%4d %-30s\n", INSN_UID (next),
> +                          str_pattern_slim (PATTERN (next)));
> +               }
> +           }
>         }
>        else
>         /* Check always has only one forward dependence (to the first insn in


More information about the Gcc-patches mailing list