This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
RFC: Improve MIPS code generation.
- From: David Daney <ddaney at avtrex dot com>
- To: gcc at gcc dot gnu dot org
- Date: Fri, 03 Oct 2003 15:13:04 -0700
- Subject: RFC: Improve MIPS code generation.
Recently I have been thinking about gcc's code generation, especially
WRT the MIPS o32 ABI, though I think the ideas might be relevant to
other architectures as well.
I am not quite sure how to implement this, but I wanted to get feedback.
For the MIPS o32 ABI (when compiling most application code) all
non-leaf functions begin with a prolog produced by the .cpload
pseudo-op, which as expands into code that sets up $gp (the GOT
pointer register) for the rest of the function.
A typical disassembly is something like this:
  400a9c:  3c1c0fc0  lui    gp,0xfc0
  400aa0:  279c7704  addiu  gp,gp,30468
  400aa4:  0399e021  addu   gp,gp,t9
  400aa8:  27bdffe0  addiu  sp,sp,-32
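
To make the arithmetic in that prolog concrete, here is a minimal Python sketch that evaluates the first three instructions. It assumes $t9 holds the function's entry address on entry, as the o32 PIC calling convention expects; the helper names are made up for illustration and are not part of gcc or binutils.

```python
# Sketch: evaluate the three-instruction $gp setup from the listing above.
# Assumption: $t9 holds the function's entry address when the function starts.

def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    return value - 0x10000 if value & 0x8000 else value

def gp_after_prologue(t9, lui_imm, addiu_imm):
    """Return $gp after: lui gp,lui_imm; addiu gp,gp,addiu_imm; addu gp,gp,t9."""
    gp = (lui_imm << 16) & 0xFFFFFFFF                   # lui: load upper half
    gp = (gp + sign_extend16(addiu_imm)) & 0xFFFFFFFF   # addiu: add signed low half
    return (gp + t9) & 0xFFFFFFFF                       # addu: add entry address

entry = 0x400A9C  # address of the first prolog instruction in the listing
print(hex(gp_after_prologue(entry, 0x0FC0, 30468)))  # prints 0x100081a0
```

In other words, the prolog computes $gp as a fixed offset from the function's own address, which is why $t9 has to be loaded before an ordinary call.
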
And a typical calling sequence would be:
  400c88:  8f998084  lw     t9,-32636(gp)
  400c8c:  0320f809  jalr   t9
Now we know (or at least I think we know) that the gp will have the
same value for any code from a single compilation unit (.o input to
ld) because of how ld works WRT creating GOTs.
Here is the big question:
For function calls within a single compilation unit why couldn't the
jalr be replaced by a bal (branch and link) to the 4th instruction of
the target function?
You would save setting up $t9 before the call and setting up $gp in
the called function. A total savings of 4 instructions for each
such call.
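
As a rough illustration of the replacement, the Python sketch below encodes a bal (the assembler idiom for bgezal $zero) that targets the 4th instruction of a function. It assumes, purely for the example, that the call at 0x400c88 targets the function whose prolog starts at 0x400a9c; the listings above do not say so. The function name encode_bal is hypothetical.

```python
# Sketch: encode `bal target` as BGEZAL $zero for a call site at address pc.
# Assumption: the call site and the callee are the two addresses from the
# listings above; the source does not state that they belong together.

def encode_bal(pc, target):
    """Return the 32-bit instruction word for `bal target` located at pc."""
    delta = target - (pc + 4)        # offset is relative to the delay slot
    if delta % 4:
        raise ValueError("target is not word-aligned")
    offset = delta >> 2              # branch offsets count instructions
    if not -0x8000 <= offset <= 0x7FFF:
        raise ValueError("target is outside the 16-bit branch offset range")
    return 0x04110000 | (offset & 0xFFFF)

entry = 0x400A9C       # first instruction of the callee's prolog
call_site = 0x400C88   # where the lw/jalr pair currently starts
word = encode_bal(call_site, entry + 3 * 4)  # skip the 3 $gp setup instructions
print(f"{word:#010x}")  # prints 0x0411ff87
```

The range check matters for the proposal: the branch offset is a signed 16-bit instruction count, so a bal can only reach targets within roughly 128 KB of the call site.
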
There might be even greater savings if conditional branches around
jalr could be replaced by conditional bal of the opposite condition.
Thanks for considering this,