Just like regparm on x86, we should be able to skip sign-extending the return value (and arguments) on ppc64 for local functions that do not have their address taken. An example:

static int f(int a) __attribute__((noinline));
static int f(int a) { return a+1; }
int g(int a) { return f(a+1); }

In the example above, we could remove two extsw instructions, which are useless. I have no idea how much this will help real programs, but it should help and not hurt.
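The redundancy of the extension between the two additions can be checked with a small C model. This is only a sketch under stated assumptions: extsw(), with_inner_extsw(), without_inner_extsw() and check_equivalence() are hypothetical helpers that mimic the instruction sequences on 64-bit register values, and none of them come from GCC. The model only shows that dropping the extsw between the two addi instructions leaves the final sign-extended result unchanged. It says nothing about which extensions the ABI still requires at external function boundaries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of extsw: sign-extend the low 32 bits of a 64-bit
   register. The uint32_t -> int32_t conversion is implementation-defined
   in C; GCC wraps modulo 2^32, which is what this sketch assumes. */
static uint64_t extsw(uint64_t reg)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}

/* g followed by f, with the extsw between the two addi instructions. */
static uint64_t with_inner_extsw(uint64_t r3)
{
    r3 = extsw(r3 + 1);   /* g: addi; extsw */
    return extsw(r3 + 1); /* f: addi; extsw */
}

/* The same sequence with the inner extsw dropped. */
static uint64_t without_inner_extsw(uint64_t r3)
{
    r3 = r3 + 1;          /* g: addi only */
    return extsw(r3 + 1); /* f: addi; extsw */
}

/* Compare both sequences on a few inputs, including 32-bit wraparound
   cases and garbage in the upper 32 bits of the register. */
static void check_equivalence(void)
{
    static const uint64_t samples[] = {
        0, 1, 0x7ffffffeu, 0x7fffffffu, 0xffffffffu,
        0xffffffffffffffffull, 0xdeadbeef7fffffffull
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(with_inner_extsw(samples[i]) == without_inner_extsw(samples[i]));
}
```

Calling check_equivalence() from a small test driver should pass silently. The two sequences agree because the final extsw only looks at the low 32 bits, and those are the same whether or not the intermediate value was extended.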
I will be looking into this after working on a libjava patch. Note the current asm is:

_f:
        addi r3,r3,1
        extsw r3,r3
        blr
        .align 2
        .p2align 4,,15
        .globl _g
_g:
        addi r3,r3,1
        extsw r3,r3
        b _f

This most likely applies to x86_64 as well, so someone over there should look into it.

This also applies to all non-register-sized types, and to 32-bit as well. Another example:

static char f(char a) __attribute__((noinline));
static char f(char a) { return a+1; }
char g(char a) { return f(a+1); }

This one is much worse (at -O2 -m32):

_f:
        addi r3,r3,1
        extsb r3,r3
        blr
        .align 2
        .globl _g
_g:
        mflr r0
        addi r3,r3,1
        extsb r3,r3
        stw r0,8(r1)
        stwu r1,-80(r1)
        bl _f
        addi r1,r1,80
        lwz r0,8(r1)
        mtlr r0
        blr
I don't have time to work on this.
Even though the generated code has changed, this is still a problem with 6.0 and in all supported versions prior to it, as noted in bug 65010, for both powerpc64 and powerpc64le, where GCC emits:

f:
        addi 3,3,1
        extsw 3,3
        blr
g:
        addi 3,3,1
        extsw 3,3
        b f
*** Bug 65010 has been marked as a duplicate of this bug. ***