This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Performance regression
- From: Dale Johannesen <dalej at apple dot com>
- To: David Edelsohn <dje at watson dot ibm dot com>
- Cc: Dale Johannesen <dalej at apple dot com>, Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 27 Sep 2002 13:51:25 -0700
- Subject: Re: Performance regression
On Friday, September 27, 2002, at 10:47 AM, David Edelsohn wrote:
Dale Johannesen writes:
Dale> The behavior we're seeing is that the inline happens on AIX but not
Dale> on Darwin. I don't think binds_local explains this. (Darwin does
Dale> set flag_pic true by default.)
binds_local does explain this.
OK, got it now, that call way over there from nowhere near the inlining
code does do that. Sorry.
Setting flag_pic true by default may not be a wise choice for
Darwin. It definitely inhibits optimizations. This is why I separated
flag_pic on AIX. GCC uses flag_pic to affect both data placement and
shared library behavior. A target which always is PIC may only want the
former by default, as does AIX.
I'll kick that around, thanks. The default is supposed to be suitable for
shared library inclusion at present (most of the OS is shared libraries).
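For readers following the binds_local point, here is a minimal C sketch of the distinction being discussed. It is an illustration under assumptions, not code from GCC or from the test case in this thread: the function names are hypothetical, and whether a given compiler version actually inlines either call depends on the target and flags. The idea is that an externally visible function compiled as PIC may be replaced by another module's definition at load time, so the compiler may decline to inline calls to it, while a function with internal linkage always binds to the definition in the same translation unit.

```c
/* Illustrative sketch only; all names here are hypothetical. */

/* Externally visible definition. When compiled as position-independent
   code for a shared library, another module may supply a replacement
   definition, so the compiler may treat this call target as not
   binding locally and skip inlining it. */
int scale_global(int x)
{
    return 2 * x + 1;
}

/* Internal linkage: every call in this file binds to this definition,
   so inlining it cannot change which code runs. */
static int scale_local(int x)
{
    return 2 * x + 1;
}

/* Caller of the externally visible function. */
int call_global(int x)
{
    return scale_global(x);
}

/* Caller of the file-local function. */
int call_local(int x)
{
    return scale_local(x);
}
```

Both callers compute the same value; the difference, if any, shows up only in the generated code, for example by comparing the assembly produced with and without -fPIC at -O3.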
In the meantime I tried with
-static -O3 -fomit-frame-pointer -funroll-loops
Now the inline does happen, but the bug still does not appear. Here is
the code for main(). You see I have 3 add insns while the AIX version
has only 1.
.globl _main
_main:
        lis r2,ha16(_masktab)
        mflr r10
        stw r10,8(r1)
        la r11,lo16(_masktab)(r2)
        stwu r1,-64(r1)
        lis r9,ha16(_psd)
        la r5,lo16(_psd)(r9)
        lha r0,0(r11)
        lhz r3,0(r5)
        lis r8,ha16(_bndpsd)
        la r2,lo16(_bndpsd)(r8)
        slwi r4,r0,1
        lhz r7,2(r5)
        sthx r3,r4,r2
        li r3,0
        lhzx r6,r4,r2
        add r12,r6,r7
        lhz r7,4(r5)
        rlwinm r6,r12,0,0xffff
        add r12,r6,r7
        lhz r7,6(r5)
        rlwinm r6,r12,0,0xffff
        add r5,r6,r7
        sthx r5,r4,r2
        lha r0,2(r2)
        cmpwi cr0,r0,140
        bne- cr0,L19
        addi r1,r1,64
        lwz r4,8(r1)
        mtlr r4
        blr
L19:
        bl _abort
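As a reading aid, the listing above can be paraphrased in C roughly as follows. This is a sketch reconstructed from the assembly alone, not the original test source, which is not shown in this message. The element types are guesses based on the load and store instructions, and the array sizes and data values are hypothetical, chosen only so that the final comparison against 140 succeeds. Each of the three add instructions corresponds to one 16-bit accumulation step, and each rlwinm with mask 0xffff is the truncation back to an unsigned halfword.

```c
/* Hypothetical data: sizes and values are assumptions, not taken from
   the original test case. */
short masktab[1] = { 1 };
unsigned short psd[4] = { 50, 40, 30, 20 };
unsigned short bndpsd[2] = { 0, 0 };

/* Paraphrase of the main() listing: one store, three 16-bit additions,
   one final store, then a read of bndpsd[1]. */
unsigned short accumulate(void)
{
    int k = masktab[0];                    /* lha: sign-extending load   */
    unsigned short sum;

    bndpsd[k] = psd[0];                    /* first sthx                 */
    sum = bndpsd[k];                       /* lhzx: reload as unsigned   */
    sum = (unsigned short)(sum + psd[1]);  /* add, rlwinm 0xffff         */
    sum = (unsigned short)(sum + psd[2]);  /* add, rlwinm 0xffff         */
    sum = (unsigned short)(sum + psd[3]);  /* add, truncated by the sth  */
    bndpsd[k] = sum;                       /* second sthx                */

    return bndpsd[1];                      /* value compared with 140    */
}
```

With the assumed data, accumulate() returns 140, which matches the cmpwi against 140 in the listing; the original program calls abort() when that comparison fails.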