Bug 23450 - local functions should not sign extend results (and arguments) for speed reasons
Summary: local functions should not sign extend results (and arguments) for speed reasons
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.1.0
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2005-08-18 03:41 UTC by Andrew Pinski
Modified: 2016-01-28 03:30 UTC (History)

See Also:
Host:
Target: powerpc64-*-*
Build:
Known to work:
Known to fail: 4.9.3, 5.3.0, 6.0
Last reconfirmed: 2016-01-27 00:00:00


Description Andrew Pinski 2005-08-18 03:41:54 UTC
Just like regparm on x86, we should be able to avoid sign-extending the return value (and arguments) on
ppc64 for local functions which don't have their address taken.
An example is:

static int f(int a) __attribute__((noinline));

static int f(int a)
{
  return a+1;
}

int g(int a)
{
  return f(a+1);
}

For the example above, we could remove two extsw instructions, which are useless.
I have no idea how much this will help real programs, but it should help and not hurt.
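
To illustrate why the intermediate extensions carry no information, here is a minimal sketch that models r3 as a 64-bit value. It is not GCC code and not part of the original report: the helper names (extsw, f_ext, g_ext, f_local, g_once, same_result, check_samples) are hypothetical, and the model assumes only that the low 32 bits of a sum depend on the low 32 bits of its operands. It compares extending after every add against extending once where the value leaves the local call chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of extsw: sign-extend the low 32 bits of a 64-bit register. */
static uint64_t extsw(uint64_t r)
{
    uint64_t low = r & UINT64_C(0xFFFFFFFF);
    return (low ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* Model of the code emitted today: each function extends after its add. */
static uint64_t f_ext(uint64_t r3) { return extsw(r3 + 1); }
static uint64_t g_ext(uint64_t r3) { return f_ext(extsw(r3 + 1)); }

/* Hypothetical alternative: the local f leaves r3 unextended and the
   value is extended once, at the boundary visible to external callers. */
static uint64_t f_local(uint64_t r3) { return r3 + 1; }
static uint64_t g_once(uint64_t r3) { return extsw(f_local(r3 + 1)); }

/* Returns 1 when both models leave the same 64-bit value in r3. */
static int same_result(int32_t a)
{
    uint64_t r3 = (uint64_t)(int64_t)a; /* argument arrives sign-extended */
    return g_ext(r3) == g_once(r3);
}

/* Checks a few boundary inputs, including ones that wrap in 32 bits. */
static void check_samples(void)
{
    const int32_t samples[] = { 0, 1, -1, -2, INT32_MAX, INT32_MAX - 1, INT32_MIN };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(same_result(samples[i]));
}
```

Calling check_samples() from a test driver exercises the comparison. The sketch only shows that one extension at the external boundary is enough to produce the same register contents; how many extsw instructions the compiler can actually drop depends on details such as the tail call in g.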
Comment 1 Andrew Pinski 2005-08-18 03:48:22 UTC
I will be looking into this after working on a libjava patch.

Note the current asm is:
_f:
        addi r3,r3,1
        extsw r3,r3
        blr
        .align 2
        .p2align 4,,15
        .globl _g
_g:
        addi r3,r3,1
        extsw r3,r3
        b _f

This most likely applies to x86_64 as well, so someone over there should look into it.

This really applies to all non-register-sized types, and to 32-bit as well.
For another example:
static char f(char a) __attribute__((noinline));

static char f(char a)
{
  return a+1;
}

char g(char a)
{
  return f(a+1);
}


This one is much worse (at -O2 -m32):
_f:
        addi r3,r3,1
        extsb r3,r3
        blr
        .align 2
        .globl _g
_g:
        mflr r0
        addi r3,r3,1
        extsb r3,r3
        stw r0,8(r1)
        stwu r1,-80(r1)
        bl _f
        addi r1,r1,80
        lwz r0,8(r1)
        mtlr r0
        blr
Comment 2 Andrew Pinski 2005-10-23 00:05:32 UTC
I don't have time to work on this.
Comment 3 Martin Sebor 2016-01-28 03:29:35 UTC
Even though the generated code has changed, this is still a problem with 6.0 and in all supported versions prior to it as noted in bug 65010, for both powerpc64 and powerpc64le, where GCC emits:

f:
	addi 3,3,1
	extsw 3,3
	blr

g:
	addi 3,3,1
	extsw 3,3
	b f
Comment 4 Martin Sebor 2016-01-28 03:30:52 UTC
*** Bug 65010 has been marked as a duplicate of this bug. ***