[google] remove redundant push {lr} for -mthumb (issue4441050)

Carrot Wei carrot@google.com
Wed Apr 20 09:17:00 GMT 2011


On Tue, Apr 19, 2011 at 8:55 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>
> On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote:
>> Reload pass tries to determine the stack frame, so it needs to check the
>> push/pop lr optimization opportunity. One of the criteria is if there is any
>> far jump inside the function. Unfortunately at this time gcc can't decide each
>> instruction's length and basic block layout, so it can't know the offset of
>> a jump. To be conservative it assumes every jump is a far jump. So any jump
>> in a function will prevent this push/pop lr optimization.
>>
>> To enable the push/pop lr optimization in reload pass, I compute the possible
>> maximum length of the function body. If the length is not large enough, far
>> jump is not necessary, so we can safely do push/pop lr optimization.
>>
>> Tested on arm qemu with options -march=armv5te -mthumb, without regression.
>>
>> This patch is for google/main.
>>
>> 2011-04-19  Guozhi Wei  <carrot@google.com>
>>
>>       Google ref 40255.
>>       * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant.
>>       (estimate_function_length): New function.
>>       (thumb_far_jump_used_p): No far jump is needed in short function.
>>
>
> Setting aside for the moment Richi's issue with hot/cold sections, this
> isn't safe.  Firstly get_attr_length() doesn't return the worst case
> length; and secondly, it doesn't take into account the size of reload
> insns that are still on the reloads stack -- these are only emitted
> right at the end of the reload pass.  Both of these would need to be
> addressed before this can be safely done.
>
> It's worth noting here that in the dim and distant past we used to try
> to estimate the size of the function and eliminate redundant saves of
> R14, but the code had to be removed because it was too fragile; but it
> looks like some vestiges of the code are still in the compiler.
>
> A slightly less optimistic approach, but one that is much safer is to
> scan the function after reload has completed and see if we can avoid
> having to push LR.  We can do this if:
>
I guess "less optimistic" is relative to the ideal optimization
situation, I believe it is still much better than current result. Do
you think if arm_reorg() is appropriate place to do this?

thanks
Carrot



More information about the Gcc-patches mailing list