[Bug tree-optimization/60172] ARM performance regression from trunk@207239
rguenther at suse dot de
gcc-bugzilla@gcc.gnu.org
Mon Feb 17 10:07:00 GMT 2014
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 17 Feb 2014, joey.ye at arm dot com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
>
> --- Comment #8 from Joey Ye <joey.ye at arm dot com> ---
> Here is tree dump and diff of 133t.forwprop4
> <bb 2>:
> Int_Index_4 = Int_1_Par_Val_3(D) + 5;
> Int_Loc.0_5 = (unsigned int) Int_Index_4;
> _6 = Int_Loc.0_5 * 4;
> _8 = Arr_1_Par_Ref_7(D) + _6;
> *_8 = Int_2_Par_Val_10(D);
> _13 = _6 + 4;
> _14 = Arr_1_Par_Ref_7(D) + _13;
> *_14 = Int_2_Par_Val_10(D);
> _17 = _6 + 60;
> _18 = Arr_1_Par_Ref_7(D) + _17;
> *_18 = Int_Index_4;
> pretmp_20 = Int_Loc.0_5 * 100;
> pretmp_2 = Arr_2_Par_Ref_22(D) + pretmp_20;
> _42 = (sizetype) Int_1_Par_Val_3(D);
> _41 = _42 * 4;
> - _40 = pretmp_2 + _41; // good
> + _12 = _41 + pretmp_20; // bad
> + _40 = Arr_2_Par_Ref_22(D) + _12; // bad
> MEM[(int[25] *)_40 + 20B] = Int_Index_4;
> MEM[(int[25] *)_40 + 24B] = Int_Index_4;
> _29 = MEM[(int[25] *)_40 + 16B];
> _30 = _29 + 1;
> MEM[(int[25] *)_40 + 16B] = _30;
> _32 = pretmp_20 + 1000;
> _33 = Arr_2_Par_Ref_22(D) + _32;
> _34 = *_8;
> - _51 = _33 + _41; // good
> + _16 = _41 + _32; // bad
> + _51 = Arr_2_Par_Ref_22(D) + _16; // bad
>
> MEM[(int[25] *)_51 + 20B] = _34;
> Int_Glob = 5;
> return;
But that doesn't make sense - it means that -fdisable-tree-forwprop4
should get the numbers back to good speed, no? Because that's the
only change forwprop4 makes.
For completeness, please base your checks on r207316 (it contains a fix
for the blamed revision, though as far as I can see it shouldn't make
a difference for this testcase).
Did you check whether my hackish patch fixes things?