This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P


Ping.

Thanks,
Kyrill

On 18/12/14 15:55, Kyrill Tkachov wrote:
Ping.

Thanks,
Kyrill

On 11/12/14 15:06, Kyrill Tkachov wrote:
Ping.
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00340.html

Thanks,
Kyrill

On 04/12/14 09:19, Kyrill Tkachov wrote:
On 02/12/14 22:58, Ramana Radhakrishnan wrote:
On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
Hi all,

This is the arm implementation of the macro fusion hook.
It tries to fuse movw+movt operations together. It also tries to take lo_sum
RTXs into account since those generate movt instructions as well.
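
For readers less familiar with the pair being fused, here is a minimal sketch of what a movw/movt sequence computes. The helper names (model_movw, model_movt, model_load_const) are made up for illustration and are not part of the patch; the sketch models only the architectural effect on a 32-bit register, not the scheduler.

```c
#include <assert.h>
#include <stdint.h>

/* Model of MOVW: write a 16-bit immediate into the low half of the
   register and zero the top half.  */
static uint32_t
model_movw (uint16_t imm16)
{
  return (uint32_t) imm16;
}

/* Model of MOVT: replace the top half of the register with a 16-bit
   immediate and leave the low half untouched.  */
static uint32_t
model_movt (uint32_t reg, uint16_t imm16)
{
  return (reg & 0xFFFFu) | ((uint32_t) imm16 << 16);
}

/* Build a full 32-bit constant the way a movw/movt pair would.  The
   second instruction depends on the first through the same register,
   which is why keeping the two adjacent is attractive.  */
static uint32_t
model_load_const (uint32_t value)
{
  uint32_t reg = model_movw ((uint16_t) (value & 0xFFFFu));
  reg = model_movt (reg, (uint16_t) (value >> 16));
  assert (reg == value);
  return reg;
}
```

Calling model_load_const with any 32-bit value returns that value, built in two halves.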

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?
     if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
+    {
+      /* We are trying to fuse
+         movw imm / movt imm
+         instructions as a group that gets scheduled together.  */
+
A comment here about the insn structure would be useful.
Done. It's similar to the aarch64 adrp+add case. It does make it easier
to read, thanks.

2014-12-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

          * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
          * config/arm/arm.c (arm_macro_fusion_p): New function.
          (arm_macro_fusion_pair_p): Likewise.
          (TARGET_SCHED_MACRO_FUSION_P): Define.
          (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
          (ARM_FUSE_NOTHING): Likewise.
          (ARM_FUSE_MOVW_MOVT): Likewise.
          (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
          arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
          arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
          arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
          arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune,
          arm_cortex_a5_tune): Specify fuseable_ops value.
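
As a rough illustration of the fuseable_ops field named in the ChangeLog, the sketch below shows how a per-core bitmask could gate fusion. Everything prefixed toy_ or TOY_ is hypothetical; in particular, the numeric flag values and the body of the enabling predicate are assumptions and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag values, for illustration only.  */
#define TOY_FUSE_NOTHING   (0u)
#define TOY_FUSE_MOVW_MOVT (1u << 0)

/* Cut-down stand-in for tune_params: only the new field is modelled.  */
struct toy_tune_params
{
  unsigned int fuseable_ops;
};

static const struct toy_tune_params toy_no_fusion_tune = { TOY_FUSE_NOTHING };
static const struct toy_tune_params toy_fusing_tune = { TOY_FUSE_MOVW_MOVT };

/* Assumed shape of the "is any fusion enabled" hook: true when the
   tuning requests at least one fusible operation.  */
static bool
toy_macro_fusion_p (const struct toy_tune_params *tune)
{
  assert (tune != NULL);
  return tune->fuseable_ops != TOY_FUSE_NOTHING;
}

/* Test for one specific fusion kind, mirroring the
   fuseable_ops & ARM_FUSE_MOVW_MOVT check quoted in this thread.  */
static bool
toy_can_fuse_movw_movt (const struct toy_tune_params *tune)
{
  assert (tune != NULL);
  return (tune->fuseable_ops & TOY_FUSE_MOVW_MOVT) != 0;
}
```

A bitmask leaves room for further fusion kinds to be added per core without changing the structure layout again.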

+      set_dest = SET_DEST (curr_set);
+      if (GET_CODE (set_dest) == ZERO_EXTRACT)
+        {
+          if (CONST_INT_P (SET_SRC (curr_set))
+              && CONST_INT_P (SET_SRC (prev_set))
+              && REG_P (XEXP (set_dest, 0))
+              && REG_P (SET_DEST (prev_set))
+              && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
+            return true;
+        }
+      else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+               && REG_P (SET_DEST (curr_set))
+               && REG_P (SET_DEST (prev_set))
+               && GET_CODE (SET_SRC (prev_set)) == HIGH
+               && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
+        {
+          return true;
+        }
Can we add a fast path exit to be

if (GET_MODE (set_dest) != SImode)
      return false;
Done, but if/when we extend the function to handle more fusion cases it
will need to be refactored, since we will want to just bail out of this
MOVW+MOVT case rather than the whole function.
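
The pair check and the SImode fast-path exit discussed above can be modelled outside GCC roughly as follows. This is only a sketch: the toy_ types stand in for RTL and are invented for illustration, and the real hook works on rtx_insn sets rather than on a flat struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the RTL properties the check looks at.  */
enum toy_mode { TOY_SImode, TOY_DImode };
enum toy_dest { TOY_DEST_REG, TOY_DEST_ZERO_EXTRACT_OF_REG, TOY_DEST_OTHER };
enum toy_src  { TOY_SRC_CONST_INT, TOY_SRC_HIGH, TOY_SRC_LO_SUM, TOY_SRC_OTHER };

struct toy_set
{
  enum toy_mode mode;   /* mode of the destination */
  enum toy_dest dest;   /* shape of the destination */
  unsigned int regno;   /* register written, or wrapped by ZERO_EXTRACT */
  enum toy_src src;     /* shape of the source */
};

/* Return true if PREV followed by CURR looks like a movw/movt pair.  */
static bool
toy_fusion_pair_p (const struct toy_set *prev, const struct toy_set *curr)
{
  assert (prev != NULL && curr != NULL);

  /* Fast-path exit requested in review: only SImode is of interest.  */
  if (curr->mode != TOY_SImode)
    return false;

  /* movw imm followed by movt imm into the top half of the same reg.  */
  if (curr->dest == TOY_DEST_ZERO_EXTRACT_OF_REG)
    return (curr->src == TOY_SRC_CONST_INT
            && prev->src == TOY_SRC_CONST_INT
            && prev->dest == TOY_DEST_REG
            && curr->regno == prev->regno);

  /* HIGH followed by LO_SUM on the same register.  */
  if (curr->src == TOY_SRC_LO_SUM
      && curr->dest == TOY_DEST_REG
      && prev->dest == TOY_DEST_REG
      && prev->src == TOY_SRC_HIGH
      && curr->regno == prev->regno)
    return true;

  return false;
}
```

As noted in the reply, an early return like this exits the whole function, so it would have to become a bail-out from just this case once other fusion kinds are handled.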

I did consider whether we wanted to use reg_overlap_mentioned_p, as that
may simplify the logic a bit, but that's overkill here as we still
want to restrict it to the cases above.

Otherwise OK.
Here's the updated patch. I've tested on arm-none-eabi and made sure
that the fusion still happens on the benchmarks I looked at.
Ok?

Thanks,
Kyrill

Ramana




+    }
+  return false;
Thanks,
Kyrill

2014-11-11  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

        * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
        * config/arm/arm.c (arm_macro_fusion_p): New function.
        (arm_macro_fusion_pair_p): Likewise.
        (TARGET_SCHED_MACRO_FUSION_P): Define.
        (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
        (ARM_FUSE_NOTHING): Likewise.
        (ARM_FUSE_MOVW_MOVT): Likewise.
        (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
        arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
        arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
        arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
        arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune,
        arm_cortex_a5_tune): Specify fuseable_ops value.





