Where can I put the optimization of got for arm back end at?

Carrot Wei carrot@google.com
Fri Apr 2 04:06:00 GMT 2010

This is really a good question!

Consider the requirements of this optimization.

1. There should be at least 2 methods to load a global variable's
address from GOT. Usually it means using different relocation types.

2. By default, all global variable accesses use the same method.

3. In some cases (fewer than X global variable accesses) method A is
better; in other cases method B is better.

With these constraints a simplify_GOT optimization pass is applicable.
But these constraints are too weak. The new optimization pass could do
almost nothing except call a target-specific hook. I doubt such a
pass would be acceptable.

We can also add more constraints:

4. Suppose we can restrict method A as follows: first load the base
address of the GOT into a register pic_reg, then load the real global
variable's address as
            load offset_reg, the offset from GOT base to the GOT entry
            load address, [pic_reg + offset_reg]

With this constraint the new pass knows there is a special register
pic_reg, so it can look for and count all uses of pic_reg. If all uses
are method A and the count is more than the target-specific threshold,
then the uses can be rewritten as method B. The method detection and
the rewriting should be target specific.
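
To make the decision rule above concrete, here is a rough sketch in
Python. It is only a model of the idea, not GCC code: the Use class,
the is_method_a flag (standing in for the target-specific detection
hook) and the function name are all hypothetical, and the threshold
would come from the target.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Use:
    """One instruction that reads pic_reg (hypothetical model, not GCC RTL)."""
    insn_id: int
    is_method_a: bool  # assumption: a target hook would compute this


def should_rewrite_to_method_b(pic_reg_uses: Iterable[Use], threshold: int) -> bool:
    """Return True when every use of pic_reg is a method A load and the
    number of such uses is more than the target-specific threshold."""
    uses = list(pic_reg_uses)
    if not all(u.is_method_a for u in uses):
        # Some use of pic_reg is not a method A load, so the rule
        # described above does not apply.
        return False
    return len(uses) > threshold


# Three method A loads with a threshold of 2: the rewrite is allowed.
uses = [Use(1, True), Use(2, True), Use(3, True)]
print(should_rewrite_to_method_b(uses, threshold=2))  # True
```

The generic pass would only do the counting and the comparison; how a
use is recognised as method A, and how it is rewritten as method B,
would stay behind target hooks.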

I don't know how other targets handle global address access with
-fpic, or how many targets satisfy these 4 constraints.


On Fri, Apr 2, 2010 at 4:31 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Thu, Apr 1, 2010 at 8:10 PM, Andrew Haley <aph@redhat.com> wrote:
>> On 28/03/10 15:45, Carrot Wei wrote:
>>> Hi
>>> The detailed description of the optimization is at
>>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43129. This is an ARM
>>> specific optimization.
>>> This optimization uses one less register (the register hold the GOT
>>> base), to get this beneficial the ideal place for it should be before
>>> register allocation.
>>> Usually expand pass generates instructions to load global variable's
>>> address from GOT entry for each access of the global variable. Later
>>> cse/gcse passes can remove many of them. In order to precisely model
>>> the cost, this optimization should be put after some cse/gcse passes.
>>> So what is the best place for this optimization? Is there any existed
>>> pass can be enhanced with this optimization? Or should I add a new
>>> pass?
>> The obvious place is machine-dependent reorg, which is a very late pass.
> Yes, and after register allocation, i.e. too late for Guozhi.
> Basically there is no place right now to stuff a pass like that.
> Question is: Is this optimization really, reallyreallyreally so target
> specific that a target-independent pass is not the better option?
> Ciao!
> Steven
