This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/37525] New: IVOPTS difference causing 20% degradation in 173.applu benchmark
- From: "pthaugen at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 15 Sep 2008 18:41:15 -0000
- Subject: [Bug tree-optimization/37525] New: IVOPTS difference causing 20% degradation in 173.applu benchmark
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
The following code is reduced from procedure buts() in the CPU2000
benchmark 173.applu. We noticed a degradation starting with this patch:
http://gcc.gnu.org/viewcvs?view=rev&revision=139712. Looking at the tree dumps,
106t.ivopts is where things first diverge from the prior revision.
work/spec_err> cat applu_reduced.f
      subroutine buts ( nx, tmat, d)
      implicit real*8 ( a-h, o-z )
      real*8 d
      dimension d( 5, 5, *)
      dimension tmat(5,5,*)
      do i = nx-1, 2, -1
         do m = 1, 2
            do l = 1, 2
               tmat( m, l,i ) = d( m, l, i )
            end do
         end do
      end do
      return
      end
work/spec_err> ~/install/gcc/rev139712/bin/gfortran -O3 -S applu_reduced.f
rev139711:
.L3:
        lfd 0,0(9)       # (* d), tmp179
        lfd 13,40(9)     # (* d), tmp180
        lfd 12,8(9)      # (* d), tmp181
        lfd 11,48(9)     # (* d), tmp182
        addi 9,9,-200    # ivtmp.34, ivtmp.34,
        stfd 0,0(4)      # (* tmat), tmp179
        stfd 12,8(4)     # (* tmat), tmp181
        cmpw 7,9,5       # d, tmp183, ivtmp.34
        stfd 13,40(4)    # (* tmat), tmp180
        stfd 11,48(4)    # (* tmat), tmp182
        addi 4,4,-200    # ivtmp.37, ivtmp.37,
        bne 7,.L3        #
        blr
rev139712:
.L3:
        addi 0,9,5       # temp.45, ivtmp.34,
        addi 11,9,1      # temp.46, ivtmp.34,
        addi 10,9,6      # temp.47, ivtmp.34,
        slwi 8,9,3       # tmp146, ivtmp.34,
        slwi 0,0,3       # tmp151, temp.45,
        slwi 11,11,3     # tmp156, temp.46,
        slwi 10,10,3     # tmp161, temp.47,
        lfdx 0,5,0       # (* d), tmp155
        lfdx 13,5,11     # (* d), tmp160
        addic. 9,9,-25   # ivtmp.34, ivtmp.34,
        lfdx 12,5,10     # (* d), tmp165
        lfdx 11,5,8      # (* d), tmp150
        stfdx 11,4,8     # (* tmat), tmp150
        stfdx 0,4,0      # (* tmat), tmp155
        stfdx 13,4,11    # (* tmat), tmp160
        stfdx 12,4,10    # (* tmat), tmp165
        bne 0,.L3        #
Notice the change to indexed loads/stores (lfdx/stfdx) and the resulting
computations inside the loop (addi/slwi) to form each index register value.
The benchmark has inner loop ranges of 1..5, which makes things even worse:
more index computations and, in some situations, spills due to high register
pressure.
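
As a rough illustration of the two induction-variable choices, here is a
C sketch of the loop shapes the listings above suggest. It is not GCC output
and not the benchmark source: the function names, the PLANE and MAX_PLANES
constants, and the variants_agree helper are all hypothetical, and the arrays
are modelled as flat column-major buffers with 25 doubles per i-plane.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* PLANE = 5*5 doubles per i-plane (200 bytes); MAX_PLANES is a bound used
   only by the self-check helper below. Both names are illustrative. */
enum { PLANE = 25, MAX_PLANES = 16 };

/* Reference: direct transcription of the Fortran loop nest
   (1-based indices, column-major layout). */
void copy_reference(int nx, double *tmat, const double *d)
{
    for (int i = nx - 1; i >= 2; --i)
        for (int m = 1; m <= 2; ++m)
            for (int l = 1; l <= 2; ++l) {
                size_t off = (size_t)(m - 1) + 5u * (size_t)(l - 1)
                           + (size_t)PLANE * (size_t)(i - 1);
                tmat[off] = d[off];
            }
}

/* Shape resembling the rev139711 loop: two pointer IVs stepped by one
   plane, with the four elements reached through constant displacements
   (0, 1, 5, 6 doubles = 0, 8, 40, 48 bytes). */
void copy_pointer_iv(int nx, double *tmat, const double *d)
{
    if (nx < 3)
        return;
    const double *dp = d + (size_t)(nx - 2) * PLANE;
    double *tp = tmat + (size_t)(nx - 2) * PLANE;
    for (int n = nx - 2; n > 0; --n) {
        tp[0] = dp[0];
        tp[1] = dp[1];
        tp[5] = dp[5];
        tp[6] = dp[6];
        dp -= PLANE;   /* corresponds to the -200 byte step */
        tp -= PLANE;
    }
}

/* Shape resembling the rev139712 loop: a single element-index IV stepped
   by 25, with each address re-derived from it on every iteration. */
void copy_index_iv(int nx, double *tmat, const double *d)
{
    for (long iv = (long)(nx - 2) * PLANE; iv >= PLANE; iv -= PLANE) {
        long i0 = iv, i1 = iv + 1, i5 = iv + 5, i6 = iv + 6;
        tmat[i0] = d[i0];
        tmat[i1] = d[i1];
        tmat[i5] = d[i5];
        tmat[i6] = d[i6];
    }
}

/* Returns 1 when all three variants write identical results for nx. */
int variants_agree(int nx)
{
    const size_t n = (size_t)PLANE * MAX_PLANES;
    double d[PLANE * MAX_PLANES], ref[PLANE * MAX_PLANES];
    double by_ptr[PLANE * MAX_PLANES], by_idx[PLANE * MAX_PLANES];

    assert(nx <= MAX_PLANES);
    for (size_t k = 0; k < n; ++k) {
        d[k] = (double)k + 0.5;
        ref[k] = by_ptr[k] = by_idx[k] = -1.0;
    }
    copy_reference(nx, ref, d);
    copy_pointer_iv(nx, by_ptr, d);
    copy_index_iv(nx, by_idx, d);
    return memcmp(ref, by_ptr, sizeof ref) == 0
        && memcmp(ref, by_idx, sizeof ref) == 0;
}
```

Both variants copy the same elements. The difference is only in how the
addresses are formed: the pointer form needs one add per array per iteration,
while the index form needs an add and a scale for each distinct element
offset, which is the extra work visible in the rev139712 listing.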
--
Summary: IVOPTS difference causing 20% degradation in 173.applu benchmark
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pthaugen at gcc dot gnu dot org
GCC build triplet: powerpc64-linux
GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37525