[PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

Kyrill Tkachov kyrylo.tkachov@arm.com
Thu Dec 4 09:19:00 GMT 2014


On 02/12/14 22:58, Ramana Radhakrishnan wrote:
> On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> Hi all,
>>
>> This is the arm implementation of the macro fusion hook.
>> It tries to fuse movw+movt operations together. It also tries to take lo_sum
>> RTXs into account since those generate movt instructions as well.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf.
>>
>> Ok for trunk?
>
>
>>   if (current_tune->fuseable_ops & ARM_FUSE_MOVW_MOVT)
>> +    {
>> +      /* We are trying to fuse
>> +         movw imm / movt imm
>> +         instructions as a group that gets scheduled together.  */
>> +
> A comment here about the insn structure would be useful.

Done. It's similar to the aarch64 adrp+add case. It does make it easier 
to read, thanks.
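
For anyone reading along without the RTL in front of them, here is a
minimal standalone C sketch of what the movw/movt pair computes. The movt
only writes the top 16 bits, which is why the second insn shows up as a SET
of a 16-bit ZERO_EXTRACT at bit 16 of the same register. This is an
illustration only, not code from the patch; the model_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "movw rd, #imm16": write imm16 to the low half of rd and
   zero the high half.  Hypothetical helper, not a GCC function.  */
static uint32_t
model_movw (uint32_t imm16)
{
  assert (imm16 <= 0xFFFFu);
  return imm16;
}

/* Model of "movt rd, #imm16": write imm16 to the high half of rd and
   leave the low half untouched.  */
static uint32_t
model_movt (uint32_t rd, uint32_t imm16)
{
  assert (imm16 <= 0xFFFFu);
  return (rd & 0xFFFFu) | (imm16 << 16);
}

/* Materialise a 32-bit constant the way the movw+movt pair does:
   low half first, then the high half into the same register.  */
static uint32_t
model_load_const (uint32_t value)
{
  uint32_t rd = model_movw (value & 0xFFFFu);
  return model_movt (rd, value >> 16);
}
```

For example, model_load_const (0x12345678) goes through movw #0x5678 and
then movt #0x1234, and gives back 0x12345678.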

2014-12-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

       * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
       * config/arm/arm.c (arm_macro_fusion_p): New function.
       (arm_macro_fusion_pair_p): Likewise.
       (TARGET_SCHED_MACRO_FUSION_P): Define.
       (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
       (ARM_FUSE_NOTHING): Likewise.
       (ARM_FUSE_MOVW_MOVT): Likewise.
       (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
       arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
       arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
       arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
       arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune,
       arm_cortex_a5_tune): Specify fuseable_ops value.

>
>> +      set_dest = SET_DEST (curr_set);
>> +      if (GET_CODE (set_dest) == ZERO_EXTRACT)
>> +        {
>> +          if (CONST_INT_P (SET_SRC (curr_set))
>> +              && CONST_INT_P (SET_SRC (prev_set))
>> +              && REG_P (XEXP (set_dest, 0))
>> +              && REG_P (SET_DEST (prev_set))
>> +              && REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
>> +            return true;
>> +        }
>> +      else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
>> +               && REG_P (SET_DEST (curr_set))
>> +               && REG_P (SET_DEST (prev_set))
>> +               && GET_CODE (SET_SRC (prev_set)) == HIGH
>> +               && REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
>> +        {
>> +          return true;
>> +        }
> Can we add a fast path exit to be
>
> if (GET_MODE (set_dest) != SImode)
>    return false;

Done, but if/when we extend the function to handle more fusion cases it
will need to be refactored, since we will want to just bail out of this
MOVW+MOVT case rather than the whole function.
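
A rough sketch of the shape such a refactoring could take, using toy
stand-ins rather than the real rtx and tune_params types. Everything named
toy_* here is hypothetical, including the second fusion case, which does
not exist in the patch. The point is only that the SImode check lives
inside the MOVW+MOVT helper, so failing it falls through to the next case
instead of returning from the whole hook.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine mode and the fuseable_ops bits.  */
enum toy_mode { TOY_SImode, TOY_DImode };
enum toy_fuse
{
  TOY_FUSE_NOTHING = 0,
  TOY_FUSE_MOVW_MOVT = 1 << 0,
  TOY_FUSE_OTHER = 1 << 1	/* Imaginary future fusion case.  */
};

/* Toy summary of a prev/curr insn pair; the real hook inspects RTL.  */
struct toy_pair
{
  enum toy_mode dest_mode;
  bool looks_like_movw_movt;
  bool looks_like_other;
};

/* MOVW+MOVT case.  The mode check only rejects this case.  */
static bool
toy_movw_movt_pair_p (const struct toy_pair *p)
{
  if (p->dest_mode != TOY_SImode)
    return false;
  return p->looks_like_movw_movt;
}

/* Placeholder for some other fusion case.  */
static bool
toy_other_pair_p (const struct toy_pair *p)
{
  return p->looks_like_other;
}

/* Top-level hook: try each enabled case in turn.  */
static bool
toy_macro_fusion_pair_p (unsigned fuseable_ops, const struct toy_pair *p)
{
  if ((fuseable_ops & TOY_FUSE_MOVW_MOVT) && toy_movw_movt_pair_p (p))
    return true;
  if ((fuseable_ops & TOY_FUSE_OTHER) && toy_other_pair_p (p))
    return true;
  return false;
}
```

With this layout a DImode pair is still rejected by the MOVW+MOVT helper,
but it can be accepted by a later case if the tuning enables one.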

>
> I did think whether we wanted to use reg_overlap_mentioned_p as that
> may simplify the logic a bit but that's  overkill here as we still
> want to restrict it to the cases above.
>
> Otherwise OK.

Here's the updated patch. I've tested on arm-none-eabi and made sure that
the fusion still happens on the benchmarks I looked at.
Ok?

Thanks,
Kyrill

>
> Ramana
>
>
>
>
>> +    }
>> +  return false;
>> Thanks,
>> Kyrill
>>
>> 2014-11-11  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>>
>>      * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
>>      * config/arm/arm.c (arm_macro_fusion_p): New function.
>>      (arm_macro_fusion_pair_p): Likewise.
>>      (TARGET_SCHED_MACRO_FUSION_P): Define.
>>      (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
>>      (ARM_FUSE_NOTHING): Likewise.
>>      (ARM_FUSE_MOVW_MOVT): Likewise.
>>      (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
>>      arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
>>      arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
>>      arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
>>      arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
>>      arm_cortex_a5_tune): Specify fuseable_ops value.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: arm-macro-fusion-2.patch
Type: text/x-patch
Size: 13896 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20141204/32f72165/attachment.bin>

