This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RFC: Improve MIPS code generation.


Recently I have been thinking about gcc's code generation, especially
WRT the MIPS o32 ABI, but I think that the ideas might be relevant to
other architectures as well.

I am not quite sure how to implement this, but I wanted to get feedback.


For the MIPS o32 ABI (when compiling most application code), all
non-leaf functions have a prologue generated by the .cpload pseudo-op,
which causes as to emit code that sets up $gp (the GOT pointer
register) for the rest of the function.

A typical disassembly is something like this:

00400a9c <_Z2f1v>:
 400a9c:    3c1c0fc0     lui    gp,0xfc0
 400aa0:    279c7704     addiu    gp,gp,30468
 400aa4:    0399e021     addu    gp,gp,t9
 400aa8:    27bdffe0     addiu    sp,sp,-32
.
.
.
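
To make the prologue concrete, here is a rough sketch in Python of the
arithmetic those first three instructions perform. It assumes the usual
o32 PIC convention that $t9 holds the callee's entry address on entry;
the function name compute_gp is made up for illustration and is not
part of any tool.

```python
def compute_gp(lui_imm, addiu_imm, t9):
    """Mimic the lui/addiu/addu sequence emitted for .cpload (32-bit wraparound)."""
    # addiu sign-extends its 16-bit immediate.
    addiu_imm &= 0xFFFF
    if addiu_imm & 0x8000:
        addiu_imm -= 0x10000
    gp = (lui_imm << 16) & 0xFFFFFFFF      # lui   gp,lui_imm
    gp = (gp + addiu_imm) & 0xFFFFFFFF     # addiu gp,gp,addiu_imm
    gp = (gp + t9) & 0xFFFFFFFF            # addu  gp,gp,t9
    return gp


# Immediates and entry address taken from the disassembly of _Z2f1v above.
print(hex(compute_gp(0x0FC0, 30468, 0x00400A9C)))  # prints 0x100081a0
```

The point is that the result depends only on the function's own address
plus a link-time constant, which is why the caller has to load $t9 first.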


And a typical calling sequence would be:

.
.
.
 400c88:    8f998084     lw    t9,-32636(gp)
 400c8c:    0320f809     jalr    t9
.
.
.

Now we know (or at least I think we know) that the gp will have the
same value for any code from a single compilation unit (.o input to
ld) because of how ld works WRT creating GOTs.

Here is the big question:

For function calls within a single compilation unit why couldn't the
jalr be replaced by a bal (branch and link) to the 4th instruction of
the target function?

You would save the load of $t9 before the call (one instruction) and
the $gp setup in the called function (three instructions).  A total
savings of 4 instructions for each intra-compilation-unit call.

There might be even greater savings if conditional branches around
jalr could be replaced by conditional bal of the opposite condition.


Thanks for considering this,


David Daney


