[Bug tree-optimization/60172] ARM performance regression from trunk@207239

rguenther at suse dot de gcc-bugzilla@gcc.gnu.org
Mon Feb 17 10:07:00 GMT 2014


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 17 Feb 2014, joey.ye at arm dot com wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172
> 
> --- Comment #8 from Joey Ye <joey.ye at arm dot com> ---
> Here is tree dump and diff of 133t.forwprop4
>   <bb 2>:
>   Int_Index_4 = Int_1_Par_Val_3(D) + 5;
>   Int_Loc.0_5 = (unsigned int) Int_Index_4;
>   _6 = Int_Loc.0_5 * 4;
>   _8 = Arr_1_Par_Ref_7(D) + _6;
>   *_8 = Int_2_Par_Val_10(D);
>   _13 = _6 + 4;
>   _14 = Arr_1_Par_Ref_7(D) + _13;
>   *_14 = Int_2_Par_Val_10(D);
>   _17 = _6 + 60;
>   _18 = Arr_1_Par_Ref_7(D) + _17;
>   *_18 = Int_Index_4;
>   pretmp_20 = Int_Loc.0_5 * 100;
>   pretmp_2 = Arr_2_Par_Ref_22(D) + pretmp_20;
>   _42 = (sizetype) Int_1_Par_Val_3(D);
>   _41 = _42 * 4;
> -  _40 = pretmp_2 + _41; // good
> +  _12 = _41 + pretmp_20; // bad
> +  _40 = Arr_2_Par_Ref_22(D) + _12;  // bad
>   MEM[(int[25] *)_40 + 20B] = Int_Index_4;
>   MEM[(int[25] *)_40 + 24B] = Int_Index_4;
>   _29 = MEM[(int[25] *)_40 + 16B];
>   _30 = _29 + 1;
>   MEM[(int[25] *)_40 + 16B] = _30;
>   _32 = pretmp_20 + 1000;
>   _33 = Arr_2_Par_Ref_22(D) + _32;
>   _34 = *_8;
> -  _51 = _33 + _41;  // good
> +  _16 = _41 + _32;  // bad
> +  _51 = Arr_2_Par_Ref_22(D) + _16;  // bad
> 
>   MEM[(int[25] *)_51 + 20B] = _34;
>   Int_Glob = 5;
>   return;

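For reference, reading that dump back into C gives roughly the following
shape (prototype, types and names are guessed from the byte offsets - 4-byte
int, 100-byte / int[25] row stride for Arr_2_Par_Ref - so treat it as an
approximation, not the actual testcase):

  extern int Int_Glob;

  void
  Proc (int *Arr_1_Par_Ref, int (*Arr_2_Par_Ref)[25],
        int Int_1_Par_Val, int Int_2_Par_Val)
  {
    int Int_Index = Int_1_Par_Val + 5;                     /* Int_Index_4 */

    Arr_1_Par_Ref[Int_Index]      = Int_2_Par_Val;         /* *_8  */
    Arr_1_Par_Ref[Int_Index + 1]  = Int_2_Par_Val;         /* *_14 */
    Arr_1_Par_Ref[Int_Index + 15] = Int_Index;             /* *_18 */

    Arr_2_Par_Ref[Int_Index][Int_Index]      = Int_Index;  /* _40 + 20B */
    Arr_2_Par_Ref[Int_Index][Int_Index + 1]  = Int_Index;  /* _40 + 24B */
    Arr_2_Par_Ref[Int_Index][Int_Index - 1] += 1;          /* _40 + 16B */

    Arr_2_Par_Ref[Int_Index + 10][Int_Index]
      = Arr_1_Par_Ref[Int_Index];                          /* _51 + 20B */

    Int_Glob = 5;
  }
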
But that doesn't make sense - it means that -fdisable-tree-forwprop4
should get the numbers back to good speed, no?  Because that's the
only change forwprop4 makes.
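
In other words, the only difference is the association order of the
address computation.  As C-level pointer arithmetic it amounts to the
two variants below (purely illustrative, made-up names; at the source
level they are of course equivalent - the difference only exists in the
GIMPLE that later passes and the target address costs get to see):

  void
  good_assoc (int *arr2, unsigned long row_bytes, unsigned long col_bytes,
              int val)
  {
    char *base = (char *) arr2 + row_bytes;  /* pretmp_2 = Arr_2 + pretmp_20 */
    int *p = (int *) (base + col_bytes);     /* _40 = pretmp_2 + _41         */
    p[5] = val;                              /* MEM[(int[25] *)_40 + 20B]    */
  }

  void
  bad_assoc (int *arr2, unsigned long row_bytes, unsigned long col_bytes,
             int val)
  {
    unsigned long off = col_bytes + row_bytes;  /* _12 = _41 + pretmp_20     */
    int *p = (int *) ((char *) arr2 + off);     /* _40 = Arr_2_Par_Ref + _12 */
    p[5] = val;                                 /* MEM[(int[25] *)_40 + 20B] */
  }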

For completeness, please base the measurements on r207316 (it contains a
fix for the blamed revision, though as far as I can see it shouldn't make
a difference for this testcase).

Did you check whether my hackish patch fixes things?


