This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/86020] New: Performance regression in Eigen geometry.cpp test starting with r248333


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86020

            Bug ID: 86020
           Summary: Performance regression in Eigen geometry.cpp test
                    starting with r248333
           Product: gcc
           Version: 8.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: wschmidt at gcc dot gnu.org
  Target Milestone: ---

GCC 8.1 has regressed by about 30% compared with GCC 7.3 on one of the Eigen
test cases when measured on Power9 hardware (powerpc64le-linux-gnu).  Similar
performance loss is seen on Power8 hardware as well.  Pat Haugen did some
bisecting and found that this began with r248333.

2017-05-22  Jan Hubicka  <hubicka@ucw.cz>

                * ipa-inline.c (edge_badness): Use inlined_time instead of
                inline_summaries->get.

The performance difference seems to be due to a change in inlining as a result
of the above patch.  The symptom is a marked increase in stack load/store
activity.

An example occurs in one of the hottest routines: 
_ZN5Eigen8internal20generic_product_implINS_5BlockIKNS_6MatrixIfLi4ELi4ELi0ELi4ELi4EEELi3ELi3ELb0EEENS3_IfLi3ELi8ELi0ELi3ELi8EEENS_10DenseShapeES8_Li3EE5addToINS2_IS7_Li3ELi8ELb1EEEEEvRT_RKS6_RKS7_
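
The mangled name above is hard to read as is.  A minimal sketch of demangling
it with abi::__cxa_demangle from <cxxabi.h> follows.  The helper name demangle
and the constant kHotSymbol are made up for this illustration and are not part
of Eigen or GCC; the sketch assumes a toolchain that uses the Itanium C++ ABI
(GCC or Clang).

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// The hot routine's symbol, copied verbatim from the report (hypothetical constant name).
const char* const kHotSymbol =
    "_ZN5Eigen8internal20generic_product_implINS_5BlockIKNS_6MatrixIfLi4ELi4ELi0ELi4ELi4EEE"
    "Li3ELi3ELb0EEENS3_IfLi3ELi8ELi0ELi3ELi8EEENS_10DenseShapeES8_Li3EE5addToINS2_IS7_Li3ELi8"
    "ELb1EEEEEvRT_RKS6_RKS7_";

// Returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* buf = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && buf != nullptr) ? buf : mangled;
    std::free(buf);  // free(nullptr) is a no-op
    return result;
}
```

Calling demangle(kHotSymbol) should show that the routine is an addTo<...>
member of an Eigen::internal::generic_product_impl<...> instantiation.  Piping
the symbol through c++filt on the command line does the same job.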

Below is the code with GCC 8.1.0:
0.15 :        1000b274:   lfs     f7,8(r9)
0.00 :        1000b278:   li      r5,2
0.00 :        1000b27c:   addi    r4,r1,464
0.06 :        1000b280:   lfs     f8,4(r9)
0.09 :        1000b284:   lfs     f9,0(r9)
0.00 :        1000b288:   lfs     f0,4(r31)
0.00 :        1000b28c:   addi    r3,r1,648
2.61 :        1000b290:   lfs     f10,32(r10)
0.71 :        1000b294:   lfs     f11,16(r10)
0.46 :        1000b298:   lfs     f12,0(r10)
1.99 :        1000b29c:   fmuls   f10,f10,f7
4.36 :        1000b2a0:   fmadds  f11,f11,f8,f10
4.20 :        1000b2a4:   fmadds  f12,f12,f9,f11
4.08 :        1000b2a8:   fadds   f0,f0,f12
7.67 :        1000b2ac:   stfs    f0,4(r31)
0.00 :        1000b2b0:   bl      10004838
0.00 :        1000b2b4:   nop
0.06 :        1000b2b8:   ld      r5,664(r1)
0.00 :        1000b2bc:   ld      r10,680(r1)
0.00 :        1000b2c0:   ld      r9,512(r1)
0.00 :        1000b2c4:   addi    r4,r1,32
0.00 :        1000b2c8:   ld      r6,688(r1)
0.06 :        1000b2cc:   ld      r7,696(r1)
0.00 :        1000b2d0:   ld      r8,712(r1)
0.03 :        1000b2d4:   ld      r0,648(r1)
0.00 :        1000b2d8:   addi    r3,r1,1104
0.00 :        1000b2dc:   ld      r11,704(r1)
0.09 :        1000b2e0:   std     r28,144(r1)
0.00 :        1000b2e4:   std     r30,152(r1)
0.00 :        1000b2e8:   std     r26,160(r1)
0.18 :        1000b2ec:   std     r5,48(r1)
0.09 :        1000b2f0:   std     r10,64(r1)
0.03 :        1000b2f4:   ld      r5,728(r1)
0.00 :        1000b2f8:   ld      r10,720(r1)
0.06 :        1000b2fc:   std     r9,136(r1)
0.00 :        1000b300:   add     r9,r9,r29
0.09 :        1000b304:   std     r0,32(r1)
0.46 :        1000b308:   std     r6,600(r1)
0.06 :        1000b30c:   std     r6,72(r1)
0.18 :        1000b310:   std     r7,608(r1)
0.09 :        1000b314:   std     r7,80(r1)
0.12 :        1000b318:   std     r11,88(r1)
0.40 :        1000b31c:   std     r9,120(r1)
0.09 :        1000b320:   std     r8,624(r1)
0.06 :        1000b324:   std     r8,96(r1)
0.03 :        1000b328:   std     r10,632(r1)
0.06 :        1000b32c:   std     r10,104(r1)
0.06 :        1000b330:   std     r5,112(r1)
0.15 :        1000b334:   bl      1000a248  <calls the map base evaluator function>

The additional instructions can be seen between addresses 1000b2b8 and
1000b330.

Below is the code with GCC 7.3.0:
0.16 :        1000ac88:   lfs     f7,8(r9)
0.04 :        1000ac8c:   ld      r10,488(r1)
0.08 :        1000ac90:   addi    r4,r1,176
0.35 :        1000ac94:   lfs     f8,4(r9)
0.28 :        1000ac98:   lfs     f9,0(r9)
0.12 :        1000ac9c:   lfs     f0,0(r31)
0.00 :        1000aca0:   ld      r9,496(r1)
0.00 :        1000aca4:   addi    r3,r1,1160
3.39 :        1000aca8:   lfs     f10,32(r8)
0.87 :        1000acac:   lfs     f11,16(r8)
0.83 :        1000acb0:   lfs     f12,0(r8)
0.32 :        1000acb4:   std     r10,864(r1)
0.12 :        1000acb8:   std     r10,776(r1)
0.04 :        1000acbc:   std     r9,872(r1)
0.24 :        1000acc0:   std     r9,784(r1)
1.93 :        1000acc4:   fmuls   f10,f10,f7
4.38 :        1000acc8:   fmadds  f11,f11,f8,f10
5.48 :        1000accc:   fmadds  f12,f12,f9,f11
5.99 :        1000acd0:   fadds   f0,f0,f12
10.25 :        1000acd4:   stfs    f0,0(r31)
0.00 :        1000acd8:   std     r7,304(r1)
0.20 :        1000acdc:   std     r9,224(r1)
0.00 :        1000ace0:   std     r21,176(r1)
0.24 :        1000ace4:   std     r27,192(r1)
0.00 :        1000ace8:   std     r23,208(r1)
0.00 :        1000acec:   std     r24,232(r1)
0.04 :        1000acf0:   std     r26,240(r1)
0.00 :        1000acf4:   std     r30,248(r1)
0.12 :        1000acf8:   std     r26,256(r1)
0.79 :        1000acfc:   std     r28,264(r1)
0.16 :        1000ad00:   std     r25,280(r1)
0.08 :        1000ad04:   std     r30,288(r1)
0.28 :        1000ad08:   std     r29,296(r1)
0.00 :        1000ad0c:   std     r10,216(r1)
0.00 :        1000ad10:   bl      1000aae8  <calls the map base evaluator function>

The GCC 8.1 code has an additional call, no longer inlined, to the constructor
Eigen::Matrix().  We suspect this and similar changes in other variants of this
function are responsible.
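
A minimal sketch of the suspected mechanism follows.  It is not Eigen code:
the type Mat3 and all function names are hypothetical, and the GCC attributes
are used only to force each inlining decision so the two outcomes can be
compared.  The assumption being illustrated is that a constructor-like helper
left out of line typically has to write its whole result to memory, while the
inlined form lets the compiler keep only the values actually used.

```cpp
#include <cstddef>

// Hypothetical stand-in for a small fixed-size matrix; not Eigen's type.
struct Mat3 {
    float v[9];
};

// Constructor-like helper forced out of line, mimicking a call that the
// inliner declined to inline.
__attribute__((noinline)) Mat3 make_scaled_outlined(const Mat3& src, float s) {
    Mat3 out;
    for (std::size_t i = 0; i < 9; ++i) out.v[i] = src.v[i] * s;
    return out;
}

// Same body, forced inline.
__attribute__((always_inline)) inline Mat3 make_scaled_inlined(const Mat3& src, float s) {
    Mat3 out;
    for (std::size_t i = 0; i < 9; ++i) out.v[i] = src.v[i] * s;
    return out;
}

// Caller that only needs three of the nine elements of the temporary.
float trace_outlined(const Mat3& m, float s) {
    Mat3 t = make_scaled_outlined(m, s);  // temporary is built through a real call
    return t.v[0] + t.v[4] + t.v[8];
}

float trace_inlined(const Mat3& m, float s) {
    Mat3 t = make_scaled_inlined(m, s);   // temporary can be optimized away
    return t.v[0] + t.v[4] + t.v[8];
}
```

Both callers return the same value; the difference is only in the generated
code.  Compiling with g++ -O2 -S and comparing trace_outlined with
trace_inlined is one way to look for the extra stack traffic, and
-fopt-info-inline reports which calls GCC actually inlined.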

It is hard to understand the patch that made this change out of context.  I
went back to the gcc-patches mailing list to look for discussion, but I wasn't
able to find a justification for this change.  Jan, could you explain its
purpose?
