This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/60172] ARM performance regression from trunk at 207239
- From: "steven at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 19 Feb 2014 23:06:14 +0000
- Auto-submitted: auto-generated
- References: <bug-60172-4 at http dot gcc dot gnu dot org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
Steven Bosscher <steven at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org
--- Comment #12 from Steven Bosscher <steven at gcc dot gnu.org> ---
(In reply to Joey Ye from comment #11)
Sometimes it helps to use -fdump-rtl-slim. It is a matter of taste, but I
find the slim dumps much easier to interpret than the LISP-like RTL dumps.
Annotated "good expansion":
;; _41 = _42 * 4;
20: r126=r131<<2
;; _40 = _2 + _41;
21: r136=r130+r119 // r136=Arr_2_Par_Ref+r119
22: r125=r136+r126 // r125=Arr_2_Par_Ref+r119+r131<<2
;; MEM[(int[25] *)_51 + 20B] = _34;
29: r139=r130+r119 // r139=Arr_2_Par_Ref+r119
30: r140=r139+r126 // r140=Arr_2_Par_Ref+r119+r131<<2 (==r125)
31: r141=r140+1000 // r141=Arr_2_Par_Ref+r119+r131<<2+1000 (==r125+1000)
32: [r141+20]=r124
In this case, the RHSs of the SETs of r139 and r136 are lexically
identical (r130+r119), so after value numbering the RHSs of the SETs of
r140 and r125 are identical as well, and the job for CSE is easy.
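As a rough illustration of why this case is easy, here is a minimal sketch of lexical value numbering over the insns above. It is a toy model in Python, not GCC's actual CSE pass; the value_number helper and the tuple encoding of the insns are assumptions made only for this example.

```python
# Toy lexical value numbering over the "good expansion" insns.
# Hypothetical model for illustration only; this is not GCC's CSE code.

def value_number(insns):
    """Map each destination register to a value number.

    Two registers get the same number when their right-hand sides are
    identical after operand registers are replaced by their value numbers.
    """
    vn_of_reg = {}    # register or constant -> value number
    vn_of_expr = {}   # (op, vn of lhs, vn of rhs) -> value number
    counter = 0

    def operand_vn(x):
        nonlocal counter
        if x not in vn_of_reg:
            # Unknown input register or constant: give it a fresh number.
            vn_of_reg[x] = counter
            counter += 1
        return vn_of_reg[x]

    for dest, op, a, b in insns:
        key = (op, operand_vn(a), operand_vn(b))
        if key not in vn_of_expr:
            vn_of_expr[key] = counter
            counter += 1
        vn_of_reg[dest] = vn_of_expr[key]
    return vn_of_reg


# Insns 20, 21, 22, 29, 30 and 31 from the dump, as (dest, op, lhs, rhs).
good = [
    ("r126", "<<", "r131", 2),
    ("r136", "+", "r130", "r119"),
    ("r125", "+", "r136", "r126"),
    ("r139", "+", "r130", "r119"),
    ("r140", "+", "r139", "r126"),
    ("r141", "+", "r140", 1000),
]

vn = value_number(good)
print(vn["r139"] == vn["r136"])  # True
print(vn["r140"] == vn["r125"])  # True
```

Once r139 and r136 share a value number, the key built for r140 is the same as the one already recorded for r125, so the redundancy falls out of a plain table lookup.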
Annotated "bad expansion":
;; _40 = Arr_2_Par_Ref_22(D) + _12;
22: r138=r128+r121
23: r127=r132+r138 // r127=Arr_2_Par_Ref+r128+r121
;; _32 = _20 + 1000;
29: r124=r121+1000
;; MEM[(int[25] *)_51 + 20B] = _34;
32: r141=r132+r124 // r141=Arr_2_Par_Ref+r121+1000
33: r142=r141+r128 // r142=Arr_2_Par_Ref+r128+r121+1000 (==r127+1000)
34: [r142+20]=r126
Here, the "+1000" confuses CSE. The values set into r127 and r142 share
a common sub-expression (Arr_2_Par_Ref+r128+r121), but no two of the
sub-expressions that actually appear in the insns are lexically
identical. RTL CSE has limited ability to look through sub-expressions
to identify "same value" sub-expressions (anchors, base regs, etc.) but
apparently this case is too complex for it to handle.
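To make the difference concrete, the following sketch compares the insns of the bad expansion in two ways: by their literal operands, and after flattening each register into a multiset of leaf registers plus a constant offset. The flattening is a reassociation-style canonicalization chosen only for illustration; it is not something RTL CSE is claimed to do, and the helper names lexical_matches and expand are hypothetical.

```python
from collections import Counter

# Insns 22, 23, 29, 32 and 33 of the "bad expansion" as (dest, lhs, rhs).
# All of them are additions; r132 holds Arr_2_Par_Ref.
bad = [
    ("r138", "r128", "r121"),
    ("r127", "r132", "r138"),
    ("r124", "r121", 1000),
    ("r141", "r132", "r124"),
    ("r142", "r141", "r128"),
]


def lexical_matches(insns):
    """Return pairs of destinations whose operands are literally the same."""
    seen = {}
    matches = []
    for dest, a, b in insns:
        key = tuple(sorted((str(a), str(b))))  # addition is commutative
        if key in seen:
            matches.append((seen[key], dest))
        else:
            seen[key] = dest
    return matches


def expand(insns):
    """Flatten each register into (multiset of leaf registers, constant)."""
    defs = {}
    for dest, a, b in insns:
        terms, offset = Counter(), 0
        for x in (a, b):
            if isinstance(x, int):
                offset += x
            elif x in defs:
                # Substitute the earlier definition of this register.
                terms += defs[x][0]
                offset += defs[x][1]
            else:
                terms[x] += 1
        defs[dest] = (terms, offset)
    return defs


print(lexical_matches(bad))  # []

defs = expand(bad)
same_base = defs["r142"][0] == defs["r127"][0]
delta = defs["r142"][1] - defs["r127"][1]
print(same_base, delta)  # True 1000
```

The literal comparison finds nothing to share, while the flattened form shows that r142 is r127 plus 1000. Spotting that in general means looking through several levels of sub-expressions, which is the kind of analysis described above as too complex for this case.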