[Bug tree-optimization/60890] New: Performance regression in 4.8 for memory postinc
hariharan.gcc at gmail dot com
gcc-bugzilla@gcc.gnu.org
Fri Apr 18 18:50:00 GMT 2014
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60890
Bug ID: 60890
Summary: Performance regression in 4.8 for memory postinc
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hariharan.gcc at gmail dot com
Created attachment 32632
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32632&action=edit
The source testcase
The attached testcase is a simplified version of the code where I first saw
this problem. At the end of the tree passes, the 4.7 compiler used to generate
two consecutive stores followed by an add to update the pointer, as shown below
for the stores in the innermost loop.
MEM[base: aptr_54, offset: 0B] = res1_22;
MEM[base: aptr_54, offset: 4B] = res2_27;
D.1771_63 = (sizetype) aptr_54;
D.1772_64 = D.1771_63 + 8;
The 4.8 compiler generates
MEM[base: base_76, offset: 0B] = res1_32;
_29 = (unsigned long) base_76;
_83 = _29 + 8;
base_84 = (int *) _83;
MEM[base: base_84, offset: -4B] = res2_37;
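for the same two stores: the pointer update now sits between them, so the
second store goes through the already-incremented base with an offset of -4.
In our private port, which supports post-increment on load/store operations,
4.7 generated optimal code, whereas the code from 4.8 is noticeably worse.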
The problem seems to stem from the fix for Bug 48814, which causes the
post-increment operations to be generated in a different order than in 4.7.
Should the tree optimization passes have fixed this up?
The problem is also visible on x86 at the end of the tree-optimization passes.
Compile the attached code with 4.7.x and 4.8.x to see the difference.
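For readers without the attachment, here is a minimal sketch of the kind of
loop that leads to the store pattern above. It is not the attached testcase:
the function name, the parameters, and the arithmetic are hypothetical
stand-ins, and it has not been checked that this exact loop shows the 4.7/4.8
difference. Use attachment 32632 to reproduce the report.

```c
#include <stddef.h>

/* Hypothetical illustration, not the attached testcase.
   Two adjacent int stores through a post-incremented pointer, so each
   iteration writes at offsets 0 and 4 and advances aptr by 8 bytes
   (assuming 4-byte int, as in the dumps above). */
void store_pairs(int *aptr, const int *in, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int res1 = in[i] + 1;   /* stand-in for the first computed result */
        int res2 = in[i] - 1;   /* stand-in for the second computed result */
        *aptr++ = res1;         /* first store, then post-increment */
        *aptr++ = res2;         /* second store, then post-increment */
    }
}
```

On a target with post-increment addressing, the form seen in the 4.7 dump (both
stores off one base, then a single +8) is what this report describes as the
optimal shape.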