This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/37525] New: IVOPTS difference causing 20% degradation in 173.applu benchmark


The following code is reduced from procedure buts() in the CPU2000
benchmark 173.applu. We noticed a degradation starting with this patch,
http://gcc.gnu.org/viewcvs?view=rev&revision=139712. In the tree dumps,
106t.ivopts is where the output first diverges from the prior revision.

work/spec_err> cat applu_reduced.f
      subroutine buts ( nx, tmat, d)

      implicit real*8 ( a-h, o-z )
      real*8  d
      dimension d( 5, 5, *)
      dimension tmat(5,5,*)

            do i = nx-1, 2, -1

               do m = 1, 2
                  do l = 1, 2
                     tmat( m, l,i ) = d( m, l, i )
                  end do
               end do
            end do

      return
      end
work/spec_err> ~/install/gcc/rev139712/bin/gfortran -O3 -S applu_reduced.f


rev139711:
.L3:
        lfd 0,0(9)       # (* d), tmp179
        lfd 13,40(9)     # (* d), tmp180
        lfd 12,8(9)      # (* d), tmp181
        lfd 11,48(9)     # (* d), tmp182
        addi 9,9,-200    # ivtmp.34, ivtmp.34,
        stfd 0,0(4)      # (* tmat), tmp179
        stfd 12,8(4)     # (* tmat), tmp181
        cmpw 7,9,5       # d, tmp183, ivtmp.34
        stfd 13,40(4)    # (* tmat), tmp180
        stfd 11,48(4)    # (* tmat), tmp182
        addi 4,4,-200    # ivtmp.37, ivtmp.37,
        bne 7,.L3        #
        blr


rev139712:
.L3:
        addi 0,9,5       # temp.45, ivtmp.34,
        addi 11,9,1      # temp.46, ivtmp.34,
        addi 10,9,6      # temp.47, ivtmp.34,
        slwi 8,9,3       # tmp146, ivtmp.34,
        slwi 0,0,3       # tmp151, temp.45,
        slwi 11,11,3     # tmp156, temp.46,
        slwi 10,10,3     # tmp161, temp.47,
        lfdx 0,5,0       # (* d), tmp155
        lfdx 13,5,11     # (* d), tmp160
        addic. 9,9,-25   # ivtmp.34, ivtmp.34,
        lfdx 12,5,10     # (* d), tmp165
        lfdx 11,5,8      # (* d), tmp150
        stfdx 11,4,8     # (* tmat), tmp150
        stfdx 0,4,0      # (* tmat), tmp155
        stfdx 13,4,11    # (* tmat), tmp160
        stfdx 12,4,10    # (* tmat), tmp165
        bne 0,.L3        #


Notice the change to indexed loads/stores (lfdx/stfdx instead of lfd/stfd),
and the resulting computations (addi/slwi) inside the loop to form each index
register value. The benchmark had inner loop ranges of 1..5, which made things
even worse (more index computations, and spills in some situations due to high
register pressure).
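
For readers who do not read PowerPC assembly, the two loop shapes can be
sketched in C roughly as follows. This is only an illustration of the
addressing difference, not output from either compiler revision. The function
names copy_ptr and copy_idx and the constant PLANE are made up for the sketch,
and a C compiler is free to turn both functions into the same code. The
element offsets 0, 1, 5, 6 correspond to the byte displacements 0, 8, 40, 48
in the rev139711 dump, and the step of 25 elements to the -200 byte stride.

```c
#include <stddef.h>

enum { PLANE = 25 };  /* doubles per d(:,:,i) slice: 5 * 5 (hypothetical name) */

/* Shape of the rev139711 loop: one pointer per array, stepped by a constant.
   Every access is "base register + constant displacement" (lfd/stfd). */
void copy_ptr(int nx, double *tmat, const double *d)
{
    if (nx < 3)
        return;                               /* do i = nx-1, 2, -1 runs zero times */
    const double *src = d + (size_t)(nx - 2) * PLANE;   /* slice i = nx-1 */
    double *dst = tmat + (size_t)(nx - 2) * PLANE;
    for (int i = nx - 1; i >= 2; --i) {
        dst[0] = src[0];                      /* (m,l) = (1,1) */
        dst[1] = src[1];                      /* (m,l) = (2,1) */
        dst[5] = src[5];                      /* (m,l) = (1,2) */
        dst[6] = src[6];                      /* (m,l) = (2,2) */
        src -= PLANE;
        dst -= PLANE;
    }
}

/* Shape of the rev139712 loop: a single element index is the induction
   variable, so each access first builds its own scaled index
   (addi + slwi) and then uses an indexed load/store (lfdx/stfdx). */
void copy_idx(int nx, double *tmat, const double *d)
{
    if (nx < 3)
        return;
    for (long iv = (long)(nx - 2) * PLANE; iv >= PLANE; iv -= PLANE) {
        long i0 = iv;                         /* four separate index values, */
        long i1 = iv + 1;                     /* each needing its own register */
        long i5 = iv + 5;
        long i6 = iv + 6;
        tmat[i0] = d[i0];
        tmat[i1] = d[i1];
        tmat[i5] = d[i5];
        tmat[i6] = d[i6];
    }
}
```

Both functions copy the same elements. The second form needs one live
register per distinct index inside the loop, which is consistent with the
extra register pressure seen once the inner loops cover 1..5.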


-- 
           Summary: IVOPTS difference causing 20% degradation in 173.applu
                    benchmark
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pthaugen at gcc dot gnu dot org
 GCC build triplet: powerpc64-linux
  GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37525

