[google] remove redundant push {lr} for -mthumb (issue4441050)

Carrot Wei carrot@google.com
Wed Apr 20 11:00:00 GMT 2011


I will try this method for trunk later.

thanks
Carrot

On Wed, Apr 20, 2011 at 4:48 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>
> On Wed, 2011-04-20 at 16:26 +0800, Carrot Wei wrote:
>> On Tue, Apr 19, 2011 at 8:55 PM, Richard Earnshaw <rearnsha@arm.com> wrote:
>> >
>> > On Tue, 2011-04-19 at 17:41 +0800, Guozhi Wei wrote:
>> >> Reload pass tries to determine the stack frame, so it needs to check the
>> >> push/pop lr optimization opportunity. One of the criteria is if there is any
>> >> far jump inside the function. Unfortunately at this time gcc can't decide each
>> >> instruction's length and basic block layout, so it can't know the offset of
>> >> a jump. To be conservative it assumes every jump is a far jump. So any jump
>> >> in a function will prevent this push/pop lr optimization.
>> >>
>> >> To enable the push/pop lr optimization in reload pass, I compute the possible
>> >> maximum length of the function body. If the length is not large enough, far
>> >> jump is not necessary, so we can safely do push/pop lr optimization.
>> >>
>> >> Tested on arm qemu with options -march=armv5te -mthumb, without regression.
>> >>
>> >> This patch is for google/main.
>> >>
>> >> 2011-04-19  Guozhi Wei  <carrot@google.com>
>> >>
>> >>       Google ref 40255.
>> >>       * gcc/config/arm/arm.c (SHORTEST_FAR_JUMP_LENGTH): New constant.
>> >>       (estimate_function_length): New function.
>> >>       (thumb_far_jump_used_p): No far jump is needed in short function.
>> >>
>> >
>> > Setting aside for the moment Richi's issue with hot/cold sections, this
>> > isn't safe.  Firstly get_attr_length() doesn't return the worst case
>> > length; and secondly, it doesn't take into account the size of reload
>> > insns that are still on the reloads stack -- these are only emitted
>> > right at the end of the reload pass.  Both of these would need to be
>> > addressed before this can be safely done.
>> >
>> > It's worth noting here that in the dim and distant past we used to try
>> > to estimate the size of the function and eliminate redundant saves of
>> > R14, but the code had to be removed because it was too fragile; but it
>> > looks like some vestiges of the code are still in the compiler.
>> >
>> > A slightly less optimistic approach, but one that is much safer is to
>> > scan the function after reload has completed and see if we can avoid
>> > having to push LR.  We can do this if:
>> >
>> I guess "less optimistic" is relative to the ideal optimization
>> situation, I believe it is still much better than current result. Do
>> you think if arm_reorg() is appropriate place to do this?
>>
>
> Making the decision in a single pass would certainly be the best
> approach; and arm_reorg is certainly going to come after all other major
> code re-arrangements.  Indeed, you should probably do this after the
> minipool placement so that you can be sure that these don't bulk up the
> body of the function too much.
>
> As you are doing the elimination late on in the compilation you can do a
> better job of estimation by calling shorten_branches() to work out the
> precise length of each insn.  Then you can simply scan over the insns to
> work out if there is a branch that still needs r14.
>
> R.
>
>
>
>



More information about the Gcc-patches mailing list