36693 – missed optimization for pointer access with offset on powerpc

Summary: missed optimization for pointer access with offset on powerpc

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	4.2.3

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2008-07-02 10:20 UTC by René Bürgel
Modified:	2015-11-22 17:37 UTC
CC List:	2 users

See Also:
Host:
Target:	powerpc-linux-uclibc
Build:
Known to work:
Known to fail:	4.1.2, 4.2.3, 4.3.1
Last reconfirmed:

Description René Bürgel 2008-07-02 10:20:56 UTC

The following two equivalent functions are compiled into different assembly code. Unfortunately, the more readable function (get_and_increment1) produces worse code: as the listing below shows, it is one instruction longer and slower, because it uses one more register (r0) than the hand-optimized version, get_and_increment2.


struct IntPtr
{
        int* m_ReadPtr;
};

int get_and_increment1(struct IntPtr* i)
{
        return *(i->m_ReadPtr++);
}

int get_and_increment2(struct IntPtr* i)
{
        i->m_ReadPtr++;
        return *(i->m_ReadPtr - 1);
}


00000000 <get_and_increment1>:
   0:   81 23 00 00     lwz     r9,0(r3)
   4:   80 09 00 00     lwz     r0,0(r9)
   8:   39 29 00 04     addi    r9,r9,4
   c:   91 23 00 00     stw     r9,0(r3)
  10:   7c 03 03 78     mr      r3,r0
  14:   4e 80 00 20     blr

00000018 <get_and_increment2>:
  18:   81 23 00 00     lwz     r9,0(r3)
  1c:   39 29 00 04     addi    r9,r9,4
  20:   91 23 00 00     stw     r9,0(r3)
  24:   80 69 ff fc     lwz     r3,-4(r9)
  28:   4e 80 00 20     blr
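
To confirm that the two functions really are equivalent at the source level, a small harness along the following lines could be used. This is only a sketch: `check_equivalent` and its test buffer are hypothetical additions for illustration and are not part of the original report.

```c
#include <assert.h>
#include <stddef.h>

struct IntPtr
{
        int* m_ReadPtr;
};

/* The two functions from the report, unchanged. */
int get_and_increment1(struct IntPtr* i)
{
        return *(i->m_ReadPtr++);
}

int get_and_increment2(struct IntPtr* i)
{
        i->m_ReadPtr++;
        return *(i->m_ReadPtr - 1);
}

/* Hypothetical helper (not in the report): walks the same buffer with both
   functions and asserts that the returned values and the updated pointers
   agree after every call. */
void check_equivalent(void)
{
        int data[4] = { 10, 20, 30, 40 };
        struct IntPtr a = { data };
        struct IntPtr b = { data };
        size_t k;

        for (k = 0; k < 4; k++) {
                int va = get_and_increment1(&a);
                int vb = get_and_increment2(&b);

                assert(va == data[k]);            /* old element is returned */
                assert(va == vb);                 /* same value from both    */
                assert(a.m_ReadPtr == b.m_ReadPtr); /* same pointer update   */
        }
        /* Both pointers end one past the last element. */
        assert(a.m_ReadPtr == data + 4);
}
```

Since both variants pass the same checks, any difference in the generated code comes from the compiler and not from the semantics of the source.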

Comment 1 Andrew Pinski 2008-08-11 01:31:23 UTC

Hmm, this works on trunk with -O2 -mtune=cell, which means this is a scheduling issue:
[apinski@dhcp-10-98-10-216 ~]$ ~/gcc-mainline/bin/gcc -O2 -o - -S t.c -mtune=cell
        .file   "t.c"
        .section        ".text"
        .align 2
        .p2align 3,,7
        .globl get_and_increment1
        .type   get_and_increment1, @function
get_and_increment1:
        lwz 9,0(3)
        addi 0,9,4
        stw 0,0(3)
        lwz 3,0(9)
        blr
        .size   get_and_increment1,.-get_and_increment1
        .align 2
        .p2align 3,,7
        .globl get_and_increment2
        .type   get_and_increment2, @function
get_and_increment2:
        lwz 9,0(3)
        addi 0,9,4
        stw 0,0(3)
        lwz 3,0(9)
        blr
        .size   get_and_increment2,.-get_and_increment2
        .ident  "GCC: (GNU) 4.4.0 20080810 (experimental) [trunk revision 138922]"
        .section        .note.GNU-stack,"",@progbits

Comment 2 Segher Boessenkool 2015-11-22 17:37:53 UTC

GCC now generates the same code for both, no matter what tuning.

get_and_increment1:
        lwz 9,0(3)
        addi 10,9,4
        stw 10,0(3)
        lwz 3,0(9)
        blr

Closing as fixed.
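
For readers mapping the final five-instruction sequence back to source, the operation can be written so that each statement lines up with the listing in this comment. This is an illustrative sketch only: the name `get_and_increment_explicit` is hypothetical, and the instruction comments assume a 4-byte `int`, as on the 32-bit powerpc target of this report.

```c
struct IntPtr
{
        int* m_ReadPtr;
};

/* Hypothetical name, not from the report: the same operation as
   get_and_increment1/2, spelled out in the order of the final listing. */
int get_and_increment_explicit(struct IntPtr* i)
{
        int* old = i->m_ReadPtr;  /* lwz 9,0(3): load the current pointer     */
        i->m_ReadPtr = old + 1;   /* addi 10,9,4 and stw 10,0(3): store the
                                     pointer advanced by sizeof(int)          */
        return *old;              /* lwz 3,0(9): load through the old pointer */
}
```

Keeping the old pointer in its own register lets the final load write its result directly into r3, so no extra `mr` is needed.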